fix(viewer): skip unreadable/non-UTF-8 files instead of aborting the whole bundle#122

Open
mftnakrsu wants to merge 1 commit into GoogleCloudPlatform:main from mftnakrsu:fix/viewer-skip-unreadable-bundle-files

mftnakrsu commented Jun 20, 2026

What

enrichment_agent visualize aborts the entire bundle if a single concept file can't be read, decoded, or parsed.

Why

In viewer/generator.py, _walk_concepts reads every concept with read_text(encoding="utf-8") and parses it, but until now it caught only OKFDocumentError. Three realistic failure modes still escaped and killed the whole visualize command:

  • invalid UTF-8 bytes → UnicodeDecodeError
  • an unreadable file → OSError
  • a malformed frontmatter value — e.g. an out-of-range timestamp: 2026-13-45 — makes PyYAML's implicit resolver raise a bare ValueError that OKFDocument.parse does not wrap

One bad file takes down the whole graph. The sibling bundle/index.py::_load_doc already tolerates all of these (except Exception: return None).
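
For reviewers who want to see the exception types involved, here is a minimal standard-library sketch. It does not import the project or PyYAML: `classify_failure` is a hypothetical helper, and the `datetime.date(2026, 13, 45)` call is only a stand-in for the out-of-range timestamp case, on the assumption that the date construction is what ultimately raises the bare ValueError.

```python
import datetime
import tempfile
from pathlib import Path


def classify_failure(action):
    """Run `action` and return the name of the exception it raises, or None.

    Hypothetical helper for illustration; it catches the same tuple the PR uses.
    """
    try:
        action()
    except (ValueError, OSError) as exc:
        return type(exc).__name__
    return None


with tempfile.TemporaryDirectory() as tmp:
    # A file whose bytes are not valid UTF-8.
    bad_utf8 = Path(tmp) / "bad.md"
    bad_utf8.write_bytes(b"\xff\xfe not utf-8")
    # A path that does not exist, standing in for any unreadable file.
    missing = Path(tmp) / "missing.md"

    decode_failure = classify_failure(lambda: bad_utf8.read_text(encoding="utf-8"))
    read_failure = classify_failure(lambda: missing.read_text(encoding="utf-8"))

# Stand-in for the malformed timestamp: month 13 and day 45 are out of range.
date_failure = classify_failure(lambda: datetime.date(2026, 13, 45))

print(decode_failure, read_failure, date_failure)
```

All three failures are caught by the single `(ValueError, OSError)` tuple, because UnicodeDecodeError derives from ValueError and FileNotFoundError derives from OSError.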

Change

  • Skip the offending file by catching (ValueError, OSError). ValueError subsumes OKFDocumentError and UnicodeDecodeError (both subclasses) as well as PyYAML's bare ValueError; OSError covers read errors. The now-unused OKFDocumentError import is dropped.
  • Add two regression tests: test_unreadable_concept_file_is_skipped (invalid UTF-8) and test_malformed_timestamp_concept_file_is_skipped (bad timestamp → bare ValueError), each asserting the remaining concepts still render.
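
The skip-and-continue pattern described above can be sketched roughly as follows. This is not the code in viewer/generator.py: `walk_concepts_sketch` and `parse_stub` are hypothetical names, and the stub parser merely stands in for OKFDocument.parse so the example runs on its own.

```python
import tempfile
from pathlib import Path


def parse_stub(text):
    """Hypothetical stand-in for OKFDocument.parse: rejects empty documents."""
    if not text.strip():
        raise ValueError("empty document")
    return {"body": text}


def walk_concepts_sketch(root):
    """Yield (path, doc) for every readable, parseable .md file under root."""
    for path in sorted(Path(root).rglob("*.md")):
        try:
            doc = parse_stub(path.read_text(encoding="utf-8"))
        except (ValueError, OSError):
            # Skip the offending file; the remaining concepts still render.
            continue
        yield path, doc


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "good.md").write_text("# Good concept\n", encoding="utf-8")
    (Path(tmp) / "bad.md").write_bytes(b"\xff\xfe broken")  # invalid UTF-8
    (Path(tmp) / "empty.md").write_text("", encoding="utf-8")  # parser rejects it
    rendered = [path.name for path, _ in walk_concepts_sketch(tmp)]

print(rendered)
```

Catching a narrow tuple rather than a blanket `except Exception` keeps genuine programming errors (for example a TypeError inside the parser) visible, which is a slightly stricter choice than the `_load_doc` behaviour quoted above.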

Tests: pytest okf/tests/test_viewer.py → 8 passed.

🤖 Generated with Claude Code

@mftnakrsu mftnakrsu marked this pull request as ready for review June 20, 2026 20:38
…whole bundle

_walk_concepts caught only OKFDocumentError, so a single file with invalid
UTF-8 bytes, a read error, or a bare ValueError from PyYAML (e.g. an
out-of-range timestamp) aborted the entire visualize run. Broaden the
handler to (ValueError, OSError) and skip the offending file, matching the
tolerance bundle.index._load_doc already applies. Adds tests for the
invalid-UTF-8 and bad-timestamp cases.
@mftnakrsu mftnakrsu force-pushed the fix/viewer-skip-unreadable-bundle-files branch from cf975be to c97c1ae Compare June 21, 2026 12:10
@mftnakrsu (Author) commented:

@amirhormati a small viewer robustness fix — skip unreadable/non-UTF-8 files instead of aborting the whole visualize run. Rebased onto the latest main; checks are green and it's mergeable. Would appreciate a review when you get a chance. Thanks!
