fix: slug->UUID canonicalisation in paper_flow + repair magic resolver structured-output schemas #78
Open
keola808hunt-dot wants to merge 1 commit into
Two openai-agents SDK structured-output failures stopped paper_flow and the
magic NL resolver from running end-to-end. Both stem from QuantMind's rich
Pydantic models meeting the SDK's strict-schema requirements.
paper_flow (the crash):
With strict_json_schema=False the model emits human-readable node-id slugs
("root", "intro") into UUID-typed Paper/TreeNode fields -> 94 uuid_parsing
errors. The domain UUID is load-bearing (tested node-id uniqueness, UUID-keyed
JSON round-trips, stable dedup identity), so rather than weaken it, new
PaperExtraction(Paper) carries a mode="before" validator (canonicalize_tree_ids)
that maps each distinct slug to one UUID and rewrites every id slot (nodes
keys, node_id, parent_id, children_ids, root_node_id, citation anchors),
passing values that are already UUIDs through unchanged. paper_flow defaults
output_type to it; the domain model is untouched.
magic resolver (latent, hidden by mocked-Runner tests):
ResolvedFlowConfig embeds BaseFlowCfg.model_settings: ModelSettings, whose
callable fields cannot be JSON-schema'd (PydanticInvalidForJsonSchema), and the
discriminated-union/knowledge schemas trip strict mode's additionalProperties
guard. Both layers fixed: SkipJsonSchema on model_settings (an execution knob,
never LLM-set) + AgentOutputSchema(..., strict_json_schema=False) on the
resolver output type (same pattern as paper_flow).
Local-file provenance now uses Path.as_posix() for portable, cross-platform
paths (fixes a Windows-only test).
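
As a small illustration of why `as_posix()` makes provenance portable, the sketch below uses the pure path classes from `pathlib`, so it behaves the same on any operating system. The `provenance_path` helper and the example paths are hypothetical and do not come from the PR.

```python
from pathlib import PurePath, PurePosixPath, PureWindowsPath


def provenance_path(path: PurePath) -> str:
    """Hypothetical helper: render a local path with forward slashes."""
    return path.as_posix()


# A Windows-style path is rendered with forward slashes instead of backslashes.
windows_style = provenance_path(PureWindowsPath(r"C:\papers\draft.pdf"))

# A POSIX-style path is returned unchanged.
posix_style = provenance_path(PurePosixPath("/home/user/papers/draft.pdf"))
```

Using `str(path)` instead would keep backslashes on Windows, which is the kind of platform-specific difference a path-comparing test can trip over.
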
Tests: new test_extraction.py (slug->uuid + uuid passthrough); new
ResolverOutputSchemaTests (real non-mocked schema build, both layers); repaired
test_output_type_override_propagated + test_basemodel_renders_json_schema to the
post-fix contracts; fallback + wrap guards added. Full suite 238/238; ruff +
basedpyright clean. Verified by live-fire: paper_flow on arXiv 2606.05138 (the
exact paper that 94-errored) and the magic resolver via real gpt-4o-mini.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
Fixes two `openai-agents` SDK structured-output failures that prevented `paper_flow` and the `magic` natural-language resolver from running end-to-end. Both stem from QuantMind's rich Pydantic models (UUID-keyed trees, discriminated-union inputs, `ModelSettings`) meeting the SDK's strict-schema requirements.

1. `paper_flow`: 94 `uuid_parsing` errors (the crash)

With `strict_json_schema=False`, the model emits human-readable node-id slugs (`"root"`, `"intro"`, `"methodology"`) into the `UUID`-typed `Paper`/`TreeNode` fields, producing 94 `uuid_parsing` validation errors. The slug tree is internally coherent (parent/child/citation refs all consistent): the model's behaviour is correct for one-shot tree generation; it just collides with the storage type.

The domain `UUID` is load-bearing (tested node-id uniqueness, `dict[UUID, TreeNode]` JSON round-trips, stable dedup identity), so rather than weaken it to `str`, this adds a slug-tolerant extraction boundary:

- `quantmind/knowledge/_extraction.py`: pure `canonicalize_tree_ids()` maps each distinct slug to one `UUID` and rewrites every id slot (`nodes` keys, `node_id`, `parent_id`, `children_ids`, `root_node_id`, citation anchors), passing values that are already UUIDs through unchanged.
- `PaperExtraction(Paper)`: a `model_validator(mode="before")` running the canonicalizer; `paper_flow` defaults `output_type` to it. The domain model is untouched.

2. `magic` resolver: un-schemable `ModelSettings` + strict-mode rejection (latent)

`ResolvedFlowConfig` embeds `BaseFlowCfg.model_settings: ModelSettings`, whose callable fields cannot be JSON-schema'd (`PydanticInvalidForJsonSchema`), and the discriminated-union / knowledge schemas trip strict mode's `additionalProperties` guard. This was hidden because the existing resolver tests mock `Runner.run`.

Two layers, both required:
- `SkipJsonSchema` on `model_settings`: it's an execution knob (set programmatically), never LLM-populated, so it is skipped during schema generation. This lets the schema build at all.
- `AgentOutputSchema(..., strict_json_schema=False)` on the resolver output type: accepts the `additionalProperties` the union/knowledge models emit (same pattern as `paper_flow`).

3. Portable local-file provenance

`_fetch_and_format` records local paths via `Path.as_posix()` so provenance is consistent cross-platform (fixes a Windows-only test).

Tests & verification

- New: `tests/knowledge/test_extraction.py` (slug→UUID + UUID passthrough); `ResolverOutputSchemaTests` (real, non-mocked schema build exercising both magic layers); a fallback-coverage test.
- Repaired: `test_output_type_override_propagated` and `test_basemodel_renders_json_schema` to the post-fix contracts; a non-strict wrap-guard assertion on the resolver.
- `ruff` and `basedpyright` clean.
- Live-fire: `paper_flow` on arXiv `2606.05138` (the exact paper that 94-errored) builds a valid `Paper`; the `magic` resolver via `gpt-4o-mini` parses natural language into a correct `ArxivIdentifier` + config. The strict-mode layer of the magic fix was caught by the live run; the mocked tests could not surface it.

Notes

- `SkipJsonSchema` ships with pydantic; the lockfile is unchanged.
- `NotImplementedError` (documented follow-up).

🤖 Generated with Claude Code