fix: slug->UUID canonicalisation in paper_flow + repair magic resolver structured-output schemas#78

Open
keola808hunt-dot wants to merge 1 commit into LLMQuant:master from keola808hunt-dot:fix/quantmind-paper-extraction-and-magic-schema


Summary

Fixes two openai-agents SDK structured-output failures that prevented paper_flow and the magic natural-language resolver from running end-to-end. Both stem from QuantMind's rich Pydantic models (UUID-keyed trees, discriminated-union inputs, ModelSettings) meeting the SDK's strict-schema requirements.

1. paper_flow — 94 uuid_parsing errors (the crash)

With strict_json_schema=False, the model emits human-readable node-id slugs ("root", "intro", "methodology") into the UUID-typed Paper/TreeNode fields → 94 uuid_parsing validation errors. The slug tree is internally coherent (parent/child/citation refs all consistent) — the model's behaviour is correct for one-shot tree generation; it just collides with the storage type.

The domain UUID is load-bearing (tested node-id uniqueness, dict[UUID, TreeNode] JSON round-trips, stable dedup identity), so rather than weaken it to str, this adds a slug-tolerant extraction boundary:

  • quantmind/knowledge/_extraction.py — pure canonicalize_tree_ids() maps each distinct slug to one UUID and rewrites every id slot (nodes keys, node_id, parent_id, children_ids, root_node_id, citation anchors), passing values that are already UUIDs through unchanged.
  • PaperExtraction(Paper) — a model_validator(mode="before") running the canonicalizer; paper_flow defaults output_type to it. The domain model is untouched.
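
The canonicalisation step described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual canonicalize_tree_ids(): the key names in the three sets are assumed from the id slots listed in the bullets, the nested dict layout is hypothetical, and citation anchors are left out because their field name is not given here.

```python
from __future__ import annotations

import uuid
from typing import Any

# Assumed key names, taken from the id slots listed above; the real
# Paper/TreeNode layout may differ.
SCALAR_ID_KEYS = {"node_id", "parent_id", "root_node_id"}
LIST_ID_KEYS = {"children_ids"}
KEYED_BY_ID = {"nodes"}


def _is_uuid(value: Any) -> bool:
    """True for UUID objects and for strings that parse as a UUID."""
    if isinstance(value, uuid.UUID):
        return True
    if not isinstance(value, str):
        return False
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


def canonicalize_tree_ids(data: Any) -> Any:
    """Return a copy of `data` where each distinct slug maps to one UUID."""
    mapping: dict[str, str] = {}

    def to_uuid(value: Any) -> Any:
        # None, real UUIDs, and non-string values pass through unchanged.
        if value is None or _is_uuid(value) or not isinstance(value, str):
            return value
        if value not in mapping:
            mapping[value] = str(uuid.uuid4())
        return mapping[value]

    def walk(obj: Any) -> Any:
        if isinstance(obj, dict):
            out: dict[Any, Any] = {}
            for key, val in obj.items():
                if key in SCALAR_ID_KEYS:
                    out[key] = to_uuid(val)
                elif key in LIST_ID_KEYS and isinstance(val, list):
                    out[key] = [to_uuid(v) for v in val]
                elif key in KEYED_BY_ID and isinstance(val, dict):
                    # Rewrite the dict keys as well as the nested nodes.
                    out[key] = {to_uuid(k): walk(v) for k, v in val.items()}
                else:
                    out[key] = walk(val)
            return out
        if isinstance(obj, list):
            return [walk(v) for v in obj]
        return obj

    return walk(data)


# Slug tree of the kind the model emits (shape is illustrative only).
raw = {
    "root_node_id": "root",
    "nodes": {
        "root": {"node_id": "root", "parent_id": None, "children_ids": ["intro"]},
        "intro": {"node_id": "intro", "parent_id": "root", "children_ids": []},
    },
}
fixed = canonicalize_tree_ids(raw)
```

Because one mapping is shared across the whole walk, every reference to the same slug resolves to the same UUID, which keeps the parent/child links coherent. A mode="before" validator would call a function like this on the raw input before field validation runs.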

2. magic resolver — un-schemable ModelSettings + strict-mode rejection (latent)

ResolvedFlowConfig embeds BaseFlowCfg.model_settings: ModelSettings, whose callable fields cannot be JSON-schema'd (PydanticInvalidForJsonSchema), and the discriminated-union / knowledge schemas trip strict mode's additionalProperties guard. This was hidden because the existing resolver tests mock Runner.run.

Two layers, both required:

  • SkipJsonSchema on model_settings: it is an execution knob (set programmatically), never LLM-populated, so it is skipped during schema generation. Without this, the schema cannot be built at all.
  • AgentOutputSchema(..., strict_json_schema=False) on the resolver output type — accepts the additionalProperties the union/knowledge models emit (same pattern as paper_flow).
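
The first layer can be illustrated with plain pydantic. The sketch below uses hypothetical models (CfgBroken, CfgFixed, and the on_done field are invented for illustration and are not QuantMind classes) to show a callable-typed field blocking schema generation and SkipJsonSchema leaving it out of the schema. It does not cover the second layer, which depends on the openai-agents SDK.

```python
from typing import Callable

from pydantic import BaseModel
from pydantic.errors import PydanticInvalidForJsonSchema
from pydantic.json_schema import SkipJsonSchema


class CfgBroken(BaseModel):
    """Hypothetical config whose callable field has no JSON schema."""

    name: str
    on_done: Callable[[str], str]


class CfgFixed(BaseModel):
    """Same shape, but the callable field is skipped in the schema."""

    name: str
    # Still a normal field at runtime; only schema generation omits it.
    on_done: SkipJsonSchema[Callable[[str], str]] = str.upper


try:
    CfgBroken.model_json_schema()
    broken_raised = False
except PydanticInvalidForJsonSchema:
    broken_raised = True

fixed_schema = CfgFixed.model_json_schema()
fixed_properties = sorted(fixed_schema["properties"])
```

The skipped field keeps working when set from code, which matches the description of model_settings as a knob that is set programmatically rather than filled in by the LLM.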

3. Portable local-file provenance

_fetch_and_format records local paths via Path.as_posix() so provenance is consistent across platforms (fixes a test that failed only on Windows).
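
A small sketch of why as_posix() helps, using the pure path classes so it behaves the same on any OS. The provenance_path helper and the file name are hypothetical and are not taken from _fetch_and_format.

```python
from pathlib import PurePath, PureWindowsPath


def provenance_path(path: PurePath) -> str:
    """Render a path with forward slashes regardless of its flavour."""
    return path.as_posix()


# PureWindowsPath models Windows path semantics even when run on Linux/macOS.
windows_style = PureWindowsPath(r"C:\papers\demo.pdf")

native_text = str(windows_style)                 # backslash-separated
portable_text = provenance_path(windows_style)   # forward-slash-separated
```

Comparing str(path) against a forward-slash literal is the kind of assertion that passes on POSIX and fails on Windows; as_posix() gives one rendering for both.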

Tests & verification

  • New: tests/knowledge/test_extraction.py (slug→UUID + UUID passthrough); ResolverOutputSchemaTests (real, non-mocked schema build exercising both magic layers); a fallback-coverage test.
  • Updated: test_output_type_override_propagated and test_basemodel_renders_json_schema to the post-fix contracts; a non-strict wrap-guard assertion on the resolver.
  • Full suite: 238 passing. ruff and basedpyright clean.
  • Live verification (real models, not mocks): paper_flow on arXiv 2606.05138 (the exact paper that 94-errored) builds a valid Paper; the magic resolver via gpt-4o-mini parses natural language into a correct ArxivIdentifier + config. The strict-mode layer of the magic fix was caught by the live run — the mocked tests could not surface it.

Notes

  • No new dependencies (SkipJsonSchema ships with pydantic); the lockfile is unchanged.
  • DOI input remains an intentional NotImplementedError (documented follow-up).

🤖 Generated with Claude Code

…ed-output schemas

Two openai-agents SDK structured-output failures stopped paper_flow and the
magic NL resolver from running end-to-end. Both stem from QuantMind's rich
Pydantic models meeting the SDK's strict-schema requirements.

paper_flow (the crash):
  With strict_json_schema=False the model emits human-readable node-id slugs
  ("root", "intro") into UUID-typed Paper/TreeNode fields -> 94 uuid_parsing
  errors. The domain UUID is load-bearing (tested node-id uniqueness, UUID-keyed
  JSON round-trips, stable dedup identity), so rather than weaken it, new
  PaperExtraction(Paper) carries a mode="before" validator (canonicalize_tree_ids)
  that maps each distinct slug to one UUID and rewrites every id slot (nodes
  keys, node_id, parent_id, children_ids, root_node_id, citation anchors),
  passing values that are already UUIDs through unchanged. paper_flow defaults
  output_type to it; the domain model is untouched.

magic resolver (latent, hidden by mocked-Runner tests):
  ResolvedFlowConfig embeds BaseFlowCfg.model_settings: ModelSettings, whose
  callable fields cannot be JSON-schema'd (PydanticInvalidForJsonSchema), and the
  discriminated-union/knowledge schemas trip strict mode's additionalProperties
  guard. Both layers fixed: SkipJsonSchema on model_settings (an execution knob,
  never LLM-set) + AgentOutputSchema(..., strict_json_schema=False) on the
  resolver output type (same pattern as paper_flow).

Local-file provenance now uses Path.as_posix() for portable, cross-platform
paths (fixes a Windows-only test).

Tests: new test_extraction.py (slug->uuid + uuid passthrough); new
ResolverOutputSchemaTests (real non-mocked schema build, both layers); repaired
test_output_type_override_propagated + test_basemodel_renders_json_schema to the
post-fix contracts; fallback + wrap guards added. Full suite 238/238; ruff +
basedpyright clean. Verified by live-fire: paper_flow on arXiv 2606.05138 (the
exact paper that 94-errored) and the magic resolver via real gpt-4o-mini.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

wanghaoxue0 requested a review from keli-wen on June 5, 2026 at 11:19