Skip to content

FD-337: windowed re-extraction + cross-session parent back-fill#124

Merged
patrickkidd merged 3 commits into
masterfrom
FD-337
Jun 3, 2026
Merged

FD-337: windowed re-extraction + cross-session parent back-fill#124
patrickkidd merged 3 commits into
masterfrom
FD-337

Conversation

@patrickkidd
Copy link
Copy Markdown
Owner

@patrickkidd patrickkidd commented Jun 2, 2026

FD-337: Re-extraction for long/multi-session conversations

Corrected diagnosis (the ticket's "stuck at 25-30%" was a measurement bug)

The from-empty connectivity_check --accumulate honored each discussion's persisted re-extraction cursor. Diagram 1924's discussions carry a stale "accepted through order 200" cursor, so the from-empty accumulate extracted an empty tail and the model produced a degenerate fragment → the ~25-30% figure. It was never the extraction that was stuck; it was the measurement.

The real defect is still real: the actual stored diagram 1924 is 32 people in 14 components — 56% connected. Production's incremental extraction (with the Pass-3 crash below live) left it fragmented.

What this PR does

  • Crash fix (llmutil.gemini_structured) — accepts optional model=. A pre-existing regression (FD-333, c8d5c00) crashed every extraction reaching Pass 3 (SARF review) with TypeError. ⚠️ If master is deployed, production chat extraction has been crashing since May 21 — please confirm deploy state.
  • Windowing (pdp.extract_full) — conversations > WINDOW_SIZE (default 80, env FD_WINDOW_SIZE) are extracted in windows on a clone, each window committed so later windows see it as context, then re-staged as negative-id deltas. Short spans keep the single-shot path byte-for-byte. Contract preserved — never commits into the caller's diagram (verified by test).
  • Cross-session parent back-fill (schema.commit_pdp_items) — sets parents on an already-committed child when a later session emits a birth event naming its couple.
  • Measurement fidelityconnectivity_check/run_extract_full_f1 null the persisted cursor for from-empty measurement; F1 runner now prints PairBonds + ParentChild precision.
  • Tests — back-fill (+ no-overwrite guard), windowing contract, ID-remap reconstruction, model-override regression guard. 264 passing in the changed areas; CI green.

Honest attribution (accumulate 55,58,60, real implementation)

config LCC people note
stale cursor (the old "baseline") ~35% ~18 degenerate measurement
cursor fix only, single-shot ~81% median ~22 dominant lever; drops ~10 family members
+ back-fill ~82% ~22 back-fill ≈ +1%
+ windowing (the deliverable) 80.0% median (15 runs) ~31 full family; see distribution below

Windowing earns its place on completeness: single-shot recovers only ~22 of 32 people; windowing recovers ~31. LCC% is a ratio that hides this — the denominator grows with completeness.

Acceptance status — borderline, stated plainly

15-run windowed distribution: 53, 69, 75, 77, 77, 78, 80, 80, 81, 83, 83, 84, 84, 84, 90.

  • median 80.0% (right at the AC line), mean 78.6%, 9/15 ≥80%, floor tail 53-69%.
  • Full family recovered (~31 of 32 people) — vs the stored diagram's fragmented 56%.

So the AC's ≥80% is met on the median only; the mean is just under and the floor is not uniformly ≥80%.

Floor limitation (why it isn't higher) — and the ceiling

The sub-80 runs drop a real family member's bridge — e.g. the proband's sibling's spouse (Vance+Meredith) extracted as a free-floating couple, or Jim+Sheila's family correctly internally-linked but not joined to the proband. These are missing relationship links the model omits run-to-run. The obvious levers are ruled out by the ticket: proband-linking prompt directives (tried, caused dangling-reference crashes), name-matching dedup (forbidden), extra structural-completion passes (tried, regressed F1).

The bridges exist — they're variance-distributed. A 6-run consensus experiment (merge the union of repeated windowed extractions) reaches 94.9% connectivity (plain LCC) vs a single-run median of ~84% — each run misses a different subset of bridges, and the union recovers nearly all of them. So the floor is achievable via multi-sample consensus at ~K× extraction cost — viable as an async "deep re-extraction," not real-time. That's a product decision (is K× latency acceptable for a high-fidelity re-extraction mode?), so it's not in this PR. This confirms the floor is a model-variance limitation, not a code gap.

No F1 regression

Aggregate ~0.65, ParentChild F1 ~0.80 (precision reported), People ~0.93. On baseline.

Companion: fdserver#24.

🤖 Generated with Claude Code

Long/multi-session chat extraction recovered only ~half the family tree
(single 200+ statement LLM pass drops people and parent links). Two
mechanisms, both measured:

- llmutil.gemini_structured: accept an optional `model` override. Fixes a
  pre-existing FD-333 regression (c8d5c00) that passed model= to a function
  without the param, crashing EVERY extraction that reached Pass 3 (SARF
  review) with TypeError.
- pdp.extract_full: window long conversations (>WINDOW_SIZE statements) on a
  clone of the diagram, committing each window so later windows see earlier
  results as committed context, then re-stage the combined new items as
  negative-id deltas. Short spans keep the existing single-shot path
  byte-for-byte. Contract preserved: never commits into the caller's diagram.
  Window size via FD_WINDOW_SIZE (default 80).
- schema.commit_pdp_items: deterministic cross-session parent back-fill —
  set parents on an already-committed child when a later session emits a
  birth event naming its couple (the delta-only repair could never reach a
  committed child).
- connectivity_check --accumulate / run_extract_full_f1: null each
  discussion's persisted cursor for from-empty measurement (a stale
  "accepted through order N" cursor made the from-empty accumulate extract an
  empty tail). run_extract_full_f1 also now prints PairBonds + ParentChild
  precision.

Measured (diagram 1924, accumulate 55,58,60): single-shot median ~35% LCC →
windowed+backfill median ~84%, recovering ~31 of 32 people. No F1 regression.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces windowed extraction in extract_full to handle large discussions by chunking statements into windows, along with a cross-session parent backfill mechanism to link committed children to their parents when birth or adoption events are processed later. It also updates evaluation scripts to reset extraction cursors and track precision and recall metrics. The reviewer feedback suggests separating the ID remapping dictionary by entity type in _restage_new_items to avoid ID overlaps, adding an early return in extract_full when there are no statements to extract to save LLM tokens, and adding defensive checks for the presence of the "id" key in dictionary items to prevent potential KeyErrors.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread btcopilot/pdp.py
Comment on lines +1209 to +1257
def _restage_new_items(working: DiagramData, original: DiagramData) -> PDP:
"""Re-stage items committed across windows that are absent from the original
diagram as a fresh negative-id PDP. References into original committed items
stay positive; references among the new items are remapped to negatives."""
orig_people = {p["id"] for p in original.people}
orig_events = {e["id"] for e in original.events}
orig_bonds = {pb["id"] for pb in original.pair_bonds}

new_people = [p for p in working.people if p["id"] not in orig_people]
new_events = [e for e in working.events if e["id"] not in orig_events]
new_bonds = [pb for pb in working.pair_bonds if pb["id"] not in orig_bonds]

next_id = -1
id_map: dict[int, int] = {}
for item in [*new_people, *new_bonds, *new_events]:
id_map[item["id"]] = next_id
next_id -= 1

def remap(old):
return id_map.get(old, old) if old is not None else None

people, bonds, events = [], [], []
for p in new_people:
d = dict(p)
d["id"] = id_map[p["id"]]
d["parents"] = remap(p.get("parents"))
people.append(from_dict(Person, d))
for pb in new_bonds:
d = dict(pb)
d["id"] = id_map[pb["id"]]
d["person_a"] = remap(pb.get("person_a"))
d["person_b"] = remap(pb.get("person_b"))
bonds.append(from_dict(PairBond, d))
for e in new_events:
d = dict(e)
d["id"] = id_map[e["id"]]
for k in ("person", "spouse", "child"):
d[k] = remap(d.get(k))
d["relationshipTargets"] = [remap(t) for t in d.get("relationshipTargets") or []]
d["relationshipTriangles"] = [
remap(t) for t in d.get("relationshipTriangles") or []
]
for k in ("dateTime", "endDateTime"):
v = d.get(k)
if v is not None and hasattr(v, "toString"):
d[k] = v.toString("yyyy-MM-dd")
events.append(from_dict(Event, d))

return PDP(people=people, events=events, pair_bonds=bonds)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

When remapping entity IDs, ensure the remapping dictionary corresponds to the correct entity type. Using a single id_map for all entity types (People, PairBonds, Events) can lead to incorrect remapping if there are any ID overlaps or if the ID generation strategy changes in the future. Additionally, add defensive checks to ensure the "id" key exists in the dictionary items before accessing it to prevent potential KeyErrors.

def _restage_new_items(working: DiagramData, original: DiagramData) -> PDP:
    """Re-stage items committed across windows that are absent from the original
    diagram as a fresh negative-id PDP. References into original committed items
    stay positive; references among the new items are remapped to negatives."""
    orig_people = {p["id"] for p in original.people if "id" in p}
    orig_events = {e["id"] for e in original.events if "id" in e}
    orig_bonds = {pb["id"] for pb in original.pair_bonds if "id" in pb}

    new_people = [p for p in working.people if "id" in p and p["id"] not in orig_people]
    new_events = [e for e in working.events if "id" in e and e["id"] not in orig_events]
    new_bonds = [pb for pb in working.pair_bonds if "id" in pb and pb["id"] not in orig_bonds]

    person_id_map: dict[int, int] = {}
    bond_id_map: dict[int, int] = {}
    event_id_map: dict[int, int] = {}

    next_id = -1
    for p in new_people:
        person_id_map[p["id"]] = next_id
        next_id -= 1
    for pb in new_bonds:
        bond_id_map[pb["id"]] = next_id
        next_id -= 1
    for e in new_events:
        event_id_map[e["id"]] = next_id
        next_id -= 1

    people, bonds, events = [], [], []
    for p in new_people:
        d = dict(p)
        d["id"] = person_id_map[p["id"]]
        d["parents"] = bond_id_map.get(p.get("parents"), p.get("parents")) if p.get("parents") is not None else None
        people.append(from_dict(Person, d))
    for pb in new_bonds:
        d = dict(pb)
        d["id"] = bond_id_map[pb["id"]]
        d["person_a"] = person_id_map.get(pb.get("person_a"), pb.get("person_a")) if pb.get("person_a") is not None else None
        d["person_b"] = person_id_map.get(pb.get("person_b"), pb.get("person_b")) if pb.get("person_b") is not None else None
        bonds.append(from_dict(PairBond, d))
    for e in new_events:
        d = dict(e)
        d["id"] = event_id_map[e["id"]]
        for k in ("person", "spouse", "child"):
            val = d.get(k)
            if val is not None:
                d[k] = person_id_map.get(val, val)
        d["relationshipTargets"] = [person_id_map.get(t, t) for t in d.get("relationshipTargets") or []]
        d["relationshipTriangles"] = [
            person_id_map.get(t, t) for t in d.get("relationshipTriangles") or []
        ]
        for k in ("dateTime", "endDateTime"):
            v = d.get(k)
            if v is not None and hasattr(v, "toString"):
                d[k] = v.toString("yyyy-MM-dd")
        events.append(from_dict(Event, d))

    return PDP(people=people, events=events, pair_bonds=bonds)
References
  1. When remapping entity IDs, ensure the remapping dictionary corresponds to the correct entity type. For instance, do not remap a PairBond ID using a Person ID map.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied the per-type maps in 2d900b2 — matches the existing reassign_delta_ids convention in this file. For the record it wasn't a correctness bug: commit_pdp_items assigns globally-unique ids from a single counter, so the combined map had no overlaps. Declining the defensive id guards — committed items always carry an id, and per our fail-early policy a missing one should raise, not be silently skipped.

Comment thread btcopilot/pdp.py
to_extract = [s for s in ordered if (s.order or 0) > cursor]
prior_committed = [s for s in ordered if (s.order or 0) <= cursor]

if len(to_extract) <= WINDOW_SIZE:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If to_extract is empty (meaning all statements in the discussion have already been extracted and committed), we can return early with empty PDP and PDPDeltas objects. This avoids unnecessary LLM calls, saving API tokens and execution time.

    if not to_extract:
        diagram_data.pdp = PDP()
        return PDP(), PDPDeltas()

    if len(to_extract) <= WINDOW_SIZE:

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied in 2d900b2. It also avoids the degenerate empty-tail extraction (the model emitting spurious items when nothing is past the cursor), so good catch.

Comment thread btcopilot/schema.py
Comment on lines +708 to +740
def _backfill_committed_parents(self) -> None:
"""Cross-session parent back-fill: when a birth/adopted event names a
couple as the parents of an already-committed child whose `parents` is
still null, set it. The committed-data analogue of
pdp.infer_parents_from_birth_events, which only reaches delta people —
so a child committed in an earlier session is otherwise unreachable when
a later session states its parentage."""
bond_by_dyad: dict[tuple[int, int], int] = {}
for pb in self.pair_bonds:
a, b = pb.get("person_a"), pb.get("person_b")
if a is not None and b is not None:
bond_by_dyad[tuple(sorted((a, b)))] = pb["id"]
people_by_id = {p["id"]: p for p in self.people}
for e in self.events:
kind = e.get("kind")
kind = kind.value if isinstance(kind, EventKind) else kind
if kind not in (EventKind.Birth.value, EventKind.Adopted.value):
continue
person, spouse, child = e.get("person"), e.get("spouse"), e.get("child")
if person is None or spouse is None or child is None:
continue
bond_id = bond_by_dyad.get(tuple(sorted((person, spouse))))
if bond_id is None:
continue
ch = people_by_id.get(child)
if ch is None or ch.get("parents") is not None:
continue
ch["parents"] = bond_id
_log.info(
f"backfill_committed_parents: committed Person {child} "
f"parents={bond_id} from birth event {e.get('id')}"
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Add defensive checks to ensure the "id" key exists in the dictionary items before accessing it. This prevents potential KeyErrors if any person, event, or pair bond in the diagram data is malformed or missing an ID.

    def _backfill_committed_parents(self) -> None:
        """Cross-session parent back-fill: when a birth/adopted event names a
        couple as the parents of an already-committed child whose `parents` is
        still null, set it. The committed-data analogue of
        pdp.infer_parents_from_birth_events, which only reaches delta people —
        so a child committed in an earlier session is otherwise unreachable when
        a later session states its parentage."""
        bond_by_dyad: dict[tuple[int, int], int] = {}
        for pb in self.pair_bonds:
            a, b = pb.get("person_a"), pb.get("person_b")
            if a is not None and b is not None and "id" in pb:
                bond_by_dyad[tuple(sorted((a, b)))] = pb["id"]
        people_by_id = {p["id"]: p for p in self.people if "id" in p}
        for e in self.events:
            kind = e.get("kind")
            kind = kind.value if isinstance(kind, EventKind) else kind
            if kind not in (EventKind.Birth.value, EventKind.Adopted.value):
                continue
            person, spouse, child = e.get("person"), e.get("spouse"), e.get("child")
            if person is None or spouse is None or child is None:
                continue
            bond_id = bond_by_dyad.get(tuple(sorted((person, spouse))))
            if bond_id is None:
                continue
            ch = people_by_id.get(child)
            if ch is None or ch.get("parents") is not None:
                continue
            ch["parents"] = bond_id
            _log.info(
                f"backfill_committed_parents: committed Person {child} "
                f"parents={bond_id} from birth event {e.get('id')}"
            )

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Declining: committed people/events/pair_bonds always have an id assigned by commit_pdp_items. A committed item missing one is an invariant violation elsewhere; per our fail-early / no-bandaid policy we let it raise rather than silently skip.

patrickkidd and others added 2 commits June 2, 2026 17:35
…y return on empty tail

- _restage_new_items: separate person/bond/event id maps (matches
  reassign_delta_ids convention; removes coupling to the global-unique-id
  invariant)
- extract_full: early return when nothing is past the cursor, avoiding a
  wasted LLM call and the degenerate empty-tail extraction

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ride

- test_commit: committed-child parent back-fill + no-overwrite guard
- test_extract_full: windowing runs per-window and preserves the caller's
  committed diagram; _restage_new_items keeps committed refs positive and
  remaps new refs negative; gemini_structured accepts model= (FD-333 crash guard)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@patrickkidd patrickkidd merged commit cec10ff into master Jun 3, 2026
2 checks passed
@patrickkidd patrickkidd deleted the FD-337 branch June 3, 2026 03:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant