Skip to content

Quality: yaml.safe_load returns None for empty YAML files, propagating as None to callers expecting dict#1498

Open
kumburovicbranko682-boop wants to merge 1 commit into
NVIDIA-NeMo:mainfrom
kumburovicbranko682-boop:contribai/improve/quality/yaml-safe-load-returns-none-for-empty-ya
Open

Quality: yaml.safe_load returns None for empty YAML files, propagating as None to callers expecting dict#1498
kumburovicbranko682-boop wants to merge 1 commit into
NVIDIA-NeMo:mainfrom
kumburovicbranko682-boop:contribai/improve/quality/yaml-safe-load-returns-none-for-empty-ya

Conversation

@kumburovicbranko682-boop

@kumburovicbranko682-boop kumburovicbranko682-boop commented Jun 27, 2026

Copy link
Copy Markdown

✨ Code Quality

Problem

yaml.safe_load() returns None when given an empty YAML file (or one with only comments). load_config returns this None directly, and get_eval_group passes it through unchanged. Any caller doing config["key"] will get TypeError: 'NoneType' object is not subscriptable. This is a silent misconfiguration bug — an empty or mis-rotated config file produces no useful error message, just a cryptic downstream crash.

Severity: high
File: nemo_skills/evaluation/utils.py

Solution

In load_config, after yaml.safe_load, validate the return value:

Changes

  • nemo_skills/evaluation/utils.py (modified)

Testing

  • Existing tests pass
  • Manual review completed
  • No new warnings/errors introduced


🤖 About this PR

This pull request was generated by ContribAI, an AI agent
that helps improve open source projects. The change was:

  1. Discovered by automated code analysis
  2. Generated by AI with context-aware code generation
  3. Self-reviewed by AI quality checks

If you have questions or feedback about this PR, please comment below.
We appreciate your time reviewing this contribution!

Closes #1497

Summary by CodeRabbit

  • Bug Fixes
    • Improved configuration loading to handle empty or comment-only YAML files more safely.
    • Added a clear error when a config file has no usable content, making invalid configurations easier to identify.

…ng as none to callers expecting dict

`yaml.safe_load()` returns `None` when given an empty YAML file (or one with only comments). `load_config` returns this `None` directly, and `get_eval_group` passes it through unchanged. Any caller doing `config["key"]` will get `TypeError: 'NoneType' object is not subscriptable`. This is a silent misconfiguration bug — an empty or mis-rotated config file produces no useful error message, just a cryptic downstream crash.


Affected files: utils.py

Signed-off-by: kumburovicbranko682-boop <295886834+kumburovicbranko682-boop@users.noreply.github.com>
@coderabbitai

coderabbitai Bot commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

load_config now captures the parsed YAML result, raises ValueError for empty or comment-only files, and no longer returns None to callers.

Changes

Empty YAML config handling

Layer / File(s) Summary
None guard in config loading
nemo_skills/evaluation/utils.py
load_config now stores yaml.safe_load(fin) in config_data and raises ValueError when the parsed YAML is None.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly describes the YAML empty-file None handling fix in nemo_skills/evaluation/utils.py.
Linked Issues check ✅ Passed The change prevents load_config from returning None on empty YAML inputs, matching the issue's expected behavior.
Out of Scope Changes check ✅ Passed The diff is narrowly focused on the reported YAML None handling bug and adds no unrelated behavior.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@nemo_skills/evaluation/utils.py`:
- Around line 45-48: load_config currently only rejects empty YAML, but it can
still վերադարձ non-dict roots like lists or scalars even though it is documented
and typed to return a dict. Update the validation right after
yaml.safe_load(fin) in load_config to ensure config_data is a mapping/dict
before returning it, and raise a ValueError for any non-mapping root so callers
always receive a dict.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 25c5d25a-9e48-48d3-a9a9-a4d03e5e1426

📥 Commits

Reviewing files that changed from the base of the PR and between 367b655 and b71aa9a.

📒 Files selected for processing (1)
  • nemo_skills/evaluation/utils.py

Comment on lines +45 to +48
config_data = yaml.safe_load(fin)
if config_data is None:
raise ValueError(f"Config file {config_path} is empty or contains only comments.")
return config_data

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Reject non-mapping YAML roots here too.

load_config() is typed and documented to return a dict, but this only rejects None. A YAML list/scalar still gets returned and will fail later in less obvious ways when callers index config keys. Please validate the root type before returning.

Proposed fix
     with open(config_path, "rt", encoding="utf-8") as fin:
         config_data = yaml.safe_load(fin)
     if config_data is None:
         raise ValueError(f"Config file {config_path} is empty or contains only comments.")
+    if not isinstance(config_data, dict):
+        raise ValueError(f"Config file {config_path} must contain a YAML mapping at the top level.")
     return config_data
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
config_data = yaml.safe_load(fin)
if config_data is None:
raise ValueError(f"Config file {config_path} is empty or contains only comments.")
return config_data
config_data = yaml.safe_load(fin)
if config_data is None:
raise ValueError(f"Config file {config_path} is empty or contains only comments.")
if not isinstance(config_data, dict):
raise ValueError(f"Config file {config_path} must contain a YAML mapping at the top level.")
return config_data
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@nemo_skills/evaluation/utils.py` around lines 45 - 48, load_config currently
only rejects empty YAML, but it can still վերադարձ non-dict roots like lists or
scalars even though it is documented and typed to return a dict. Update the
validation right after yaml.safe_load(fin) in load_config to ensure config_data
is a mapping/dict before returning it, and raise a ValueError for any
non-mapping root so callers always receive a dict.

Source: Coding guidelines

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix: yaml.safe_load returns none for empty yaml files, propagating as none to callers expecting dict

1 participant