Skip to content

fix(schema): preserve graph kind field through validateGraph#440

Open
tirth8205 wants to merge 2 commits into
Egonex-AI:mainfrom
tirth8205:fix/schema-preserve-kind
Open

fix(schema): preserve graph kind field through validateGraph#440
tirth8205 wants to merge 2 commits into
Egonex-AI:mainfrom
tirth8205:fix/schema-preserve-kind

Conversation

@tirth8205

Copy link
Copy Markdown
Contributor

Problem

KnowledgeGraphSchema declares an optional kind enum field (z.enum(["codebase", "knowledge"]).optional()) and the KnowledgeGraph type mirrors it (kind?: "codebase" | "knowledge"). However, validateGraph builds its return object from scratch and never copies kind. So a valid input graph with kind: "knowledge" comes back with result.data.kind === undefined — silent data loss on the declared API surface.

This forces consumers into workarounds: the dashboard (App.tsx) reads kind from the raw pre-validation data instead of from result.data, and loadGraph (persistence/index.ts) round-trips a saved knowledge graph and silently strips its kind. Note that version is already preserved in the same constructed object, so omitting kind is an oversight rather than intent.

Fix

Carry the validated kind through in the object validateGraph constructs, gated on it being a valid enum value (mirroring how version is handled):

const graph = {
  version: typeof fixed.version === "string" ? fixed.version : "1.0.0",
  ...(fixed.kind === "codebase" || fixed.kind === "knowledge"
    ? { kind: fixed.kind as "codebase" | "knowledge" }
    : {}),
  project: projectResult.data,
  ...
};

The spread guard ensures an input without kind still yields kind === undefined (no spurious field added).

Testing

Added two tests to schema.test.ts:

  • preserves the kind field for knowledge graphs — fails before the fix (expected undefined to be 'knowledge'), passes after.
  • leaves kind undefined when the input omits it — confirms the guard does not fabricate a kind.

Verification: the new test was confirmed red before the fix and green after. The full core package test suite passes (694 tests), tsc --noEmit exits 0, and ESLint is clean on both changed files.

🤖 Generated with Claude Code

`KnowledgeGraphSchema` declares an optional `kind` enum
("codebase" | "knowledge") and the `KnowledgeGraph` type mirrors it,
but `validateGraph` rebuilds its return object from scratch and never
copied `kind`. A valid input graph with `kind: "knowledge"` therefore
came back with `kind === undefined` — silent data loss on the declared
API surface (loadGraph round-trips strip it, and the dashboard had to
read `kind` from the raw pre-validation data to work around it).

Carry the validated `kind` through, only when it is a valid enum value
(mirroring how `version` is already preserved).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@thejesh23 thejesh23 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1. Invalid kind values are silently coerced to undefined with no issue logged.
The guard fixed.kind === "codebase" || fixed.kind === "knowledge" drops anything else (e.g. kind: "unknown", kind: 123, a typo from a hand-edited graph or older language extractor — relevant to extractors like the one in #435) into the no-kind branch without pushing a GraphIssue. Everywhere else in validateGraph an invalid field becomes an auto-corrected or dropped issue; this is the one silent narrowing. Consider pushing an auto-corrected issue when fixed.kind !== undefined but does not match the enum.

2. Consumer workarounds called out in the PR body are not cleaned up.
The description explicitly identifies that App.tsx reads kind from the raw pre-validation data and loadGraph in persistence/index.ts round-trips and strips kind. With this fix landed, those become dead workarounds that the next reader will assume are still load-bearing. Either remove them here or open a follow-up — leaving them silently undoes part of the value of this fix.

3. Test coverage is asymmetric — only "knowledge" and undefined exercised.
No test pins behavior for kind: "codebase" (the other enum branch, presumably the more common production value) or for an invalid kind value, which is exactly the path that motivates concern #1. A three-line it.each over ["codebase","knowledge","bogus",undefined] would lock all four arms of the new branch.

…d data

Concern 1: validateGraph now pushes an auto-corrected GraphIssue when `kind`
is present but not "codebase"/"knowledge", instead of silently narrowing it
to undefined — surfacing typos and stale extractor values like every other
invalid field.

Concern 2: App.tsx reads `result.data.kind` (the validated source of truth)
rather than the raw pre-validation payload now that kind is preserved.

Concern 3: replace the two kind tests with an it.each over
["codebase","knowledge","bogus",undefined] locking all four arms of the
branch, plus issue-emission coverage for the out-of-enum path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@tirth8205

Copy link
Copy Markdown
Contributor Author

All three points addressed in 190bf01.

1. Out-of-enum kind silently coerced. validateGraph now pushes an auto-corrected issue (category: "out-of-range", path: "kind") whenever fixed.kind is present but not "codebase"/"knowledge", so a typo or stale extractor value (e.g. from #435) surfaces like every other invalid field instead of being silently narrowed. The value is still dropped from the result (kind is .optional() and a garbage kind is unusable), but it's no longer silent.

2. Consumer workaround. Changed App.tsx to read result.data.kind === "knowledge" from the validated graph rather than the raw pre-validation payload — single source of truth now that the field is preserved. No change was needed in persistence/index.ts: loadGraph already returns validateGraph's result.data and had no independent kind-stripping code; the prior "strip" was just validateGraph dropping the field, which this PR fixes.

3. Asymmetric coverage. Replaced the two kind tests with an it.each over ["codebase","knowledge","bogus","undefined"] asserting result.data.kind equals the input for valid enum values and undefined otherwise, plus a dedicated test locking the auto-corrected/out-of-range issue for the out-of-enum path and a test confirming no kind issue is emitted when kind is omitted.

Full core suite green (698 passed); core and dashboard both typecheck clean.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants