Quality: yaml.safe_load returns None for empty YAML files, propagating as None to callers expecting dict #1498
`yaml.safe_load()` returns `None` when given an empty YAML file (or one with only comments). `load_config` returns this `None` directly, and `get_eval_group` passes it through unchanged. Any caller doing `config["key"]` will get `TypeError: 'NoneType' object is not subscriptable`. This is a silent misconfiguration bug: an empty or mis-rotated config file produces no useful error message, just a cryptic downstream crash.
Affected files: utils.py
Signed-off-by: kumburovicbranko682-boop <295886834+kumburovicbranko682-boop@users.noreply.github.com>
📝 Walkthrough
Changes: Empty YAML config handling
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~3 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@nemo_skills/evaluation/utils.py`:
- Around line 45-48: load_config currently only rejects empty YAML, but it can
still return non-dict roots like lists or scalars even though it is documented
and typed to return a dict. Update the validation right after
yaml.safe_load(fin) in load_config to ensure config_data is a mapping/dict
before returning it, and raise a ValueError for any non-mapping root so callers
always receive a dict.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 25c5d25a-9e48-48d3-a9a9-a4d03e5e1426
📒 Files selected for processing (1)
nemo_skills/evaluation/utils.py
config_data = yaml.safe_load(fin)
if config_data is None:
    raise ValueError(f"Config file {config_path} is empty or contains only comments.")
return config_data
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Reject non-mapping YAML roots here too.
load_config() is typed and documented to return a dict, but this only rejects None. A YAML list/scalar still gets returned and will fail later in less obvious ways when callers index config keys. Please validate the root type before returning.
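To illustrate the "less obvious" failures described above, here is a minimal sketch. It uses literal Python values as stand-ins for what `yaml.safe_load` can return for different YAML roots (an assumption made so the example needs no third-party import); `roots` and `try_lookup` are hypothetical names and not part of this PR.

```python
# Stand-ins for the values yaml.safe_load can return for different YAML roots.
# These literals are assumptions for illustration; no YAML is parsed here.
roots = {
    "empty file": None,            # empty or comment-only file
    "list root": ["a", "b"],       # file whose top level is a sequence
    "scalar root": 42,             # file whose top level is a single scalar
    "mapping root": {"key": "value"},
}


def try_lookup(config):
    """Index the config the way a caller expecting a dict would."""
    try:
        return config["key"]
    except TypeError as exc:
        # Every non-mapping root ends up here, each with a different message.
        return f"TypeError: {exc}"


results = {name: try_lookup(root) for name, root in roots.items()}
for name, outcome in results.items():
    print(f"{name}: {outcome}")
```

Only the mapping root yields a value; the other three roots each raise a `TypeError` whose message says nothing about the config file, which is why validating the root type at load time gives a clearer error.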
Proposed fix
 with open(config_path, "rt", encoding="utf-8") as fin:
     config_data = yaml.safe_load(fin)
     if config_data is None:
         raise ValueError(f"Config file {config_path} is empty or contains only comments.")
+    if not isinstance(config_data, dict):
+        raise ValueError(f"Config file {config_path} must contain a YAML mapping at the top level.")
     return config_data
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Original:
config_data = yaml.safe_load(fin)
if config_data is None:
    raise ValueError(f"Config file {config_path} is empty or contains only comments.")
return config_data
Suggested:
config_data = yaml.safe_load(fin)
if config_data is None:
    raise ValueError(f"Config file {config_path} is empty or contains only comments.")
if not isinstance(config_data, dict):
    raise ValueError(f"Config file {config_path} must contain a YAML mapping at the top level.")
return config_data
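The two checks in the suggestion can also be exercised on their own. The following is a minimal sketch, assuming the validation is pulled into a standalone helper; `validate_config_root` is a hypothetical name used only for illustration, and the PR itself keeps the checks inline in `load_config`.

```python
def validate_config_root(config_data, config_path):
    """Hypothetical helper mirroring the two checks from the suggestion.

    config_data is whatever the YAML parser returned; config_path is used
    only to build the error message.
    """
    if config_data is None:
        raise ValueError(f"Config file {config_path} is empty or contains only comments.")
    if not isinstance(config_data, dict):
        raise ValueError(f"Config file {config_path} must contain a YAML mapping at the top level.")
    return config_data


# A mapping passes through unchanged.
print(validate_config_root({"benchmark": "demo"}, "example.yaml"))

# None, a list, and a scalar are each rejected with a ValueError.
for bad_root in (None, ["a", "b"], 42):
    try:
        validate_config_root(bad_root, "example.yaml")
    except ValueError as exc:
        print(f"rejected {bad_root!r}: {exc}")
```

Keeping the `None` check separate from the `isinstance` check preserves the more specific "empty or contains only comments" message for the case that motivated the PR.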
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@nemo_skills/evaluation/utils.py` around lines 45 - 48, load_config currently
only rejects empty YAML, but it can still return non-dict roots like lists or
scalars even though it is documented and typed to return a dict. Update the
validation right after yaml.safe_load(fin) in load_config to ensure config_data
is a mapping/dict before returning it, and raise a ValueError for any
non-mapping root so callers always receive a dict.
Source: Coding guidelines
✨ Code Quality
Problem
`yaml.safe_load()` returns `None` when given an empty YAML file (or one with only comments). `load_config` returns this `None` directly, and `get_eval_group` passes it through unchanged. Any caller doing `config["key"]` will get `TypeError: 'NoneType' object is not subscriptable`. This is a silent misconfiguration bug: an empty or mis-rotated config file produces no useful error message, just a cryptic downstream crash.
Severity: high
File: nemo_skills/evaluation/utils.py
Solution
In `load_config`, after `yaml.safe_load`, validate the return value:
Changes
nemo_skills/evaluation/utils.py (modified)
Testing
🤖 About this PR
This pull request was generated by ContribAI, an AI agent
that helps improve open source projects. The change was:
If you have questions or feedback about this PR, please comment below.
We appreciate your time reviewing this contribution!
Closes #1497
Summary by CodeRabbit