Compaction trip-wire and Guard 2 skip-check use divergent token estimators #369

Description

@99Yash

Severity: low (latent; likely non-realizable at today's window/threshold, but a real divergence in a v1-placeholder area).

Two token estimates gate compaction in user-authored-brief.ts and they don't agree:

  1. Trip-wire (dispatchToolsStep, ~line 325): estimated = state.lastInputTokens + Math.ceil(tailChars / 4). lastInputTokens is the actually billed input for the prior boss turn — it includes the system prompt + full tool definitions.
  2. Guard 2 skip-check (compactTranscriptStep, ~line 382): skips the compactor when priorChars < COMPACTION_MIN_PRIOR_CHARS (20_000) and estimateTranscriptTokens(transcript) <= pressureThreshold. estimateTranscriptTokens (compaction/tokens.ts) is JSON.stringify(messages).length / 4 — transcript only, excluding system prompt + tool definitions.
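To make the divergence concrete, here is a minimal sketch of the two estimates side by side. The function names, the `Message` shape, and all numbers (overhead, threshold, message sizes) are illustrative assumptions, not the real code in user-authored-brief.ts or compaction/tokens.ts; only the two formulas follow the descriptions above.

```typescript
// Hypothetical stand-ins for illustration; not the real implementation.
type Message = { role: string; content: string };

const CHARS_PER_TOKEN = 4;
const COMPACTION_MIN_PRIOR_CHARS = 20_000;

// Trip-wire style: billed input of the prior turn (includes system prompt
// and tool definitions) plus a chars/4 estimate of the new tail.
function tripWireEstimate(lastInputTokens: number, tailChars: number): number {
  return lastInputTokens + Math.ceil(tailChars / CHARS_PER_TOKEN);
}

// Guard 2 style: transcript only, serialized length / 4.
function transcriptOnlyEstimate(messages: Message[]): number {
  return JSON.stringify(messages).length / CHARS_PER_TOKEN;
}

// Assumed, deliberately small numbers so the gap is easy to see.
const prior: Message[] = [{ role: "user", content: "x".repeat(8_000) }];
const tail: Message[] = [{ role: "tool", content: "y".repeat(2_000) }];
const overheadTokens = 3_000; // assumed system prompt + tool definitions
const pressureThreshold = 5_000; // assumed, far below the real threshold

// What the provider would have billed last turn: overhead + prior transcript.
const lastInputTokens =
  overheadTokens + Math.ceil(transcriptOnlyEstimate(prior));
const tailChars = 2_000;

const tripWireFires =
  tripWireEstimate(lastInputTokens, tailChars) > pressureThreshold;

const priorChars = JSON.stringify(prior).length;
const guard2Skips =
  priorChars < COMPACTION_MIN_PRIOR_CHARS &&
  transcriptOnlyEstimate([...prior, ...tail]) <= pressureThreshold;

console.log({ tripWireFires, guard2Skips });
```

With these assumed numbers both flags come out true: the trip-wire fires because of the overhead, and Guard 2 skips because it never sees that overhead.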

So the trip-wire can fire (input over 60% threshold) because of system+tools overhead, then Guard 2 — blind to that overhead — decides the transcript alone fits and skips compaction, routing straight back to boss-turn. The next turn re-trips and re-skips until prior grows past 20K chars, at which point real compaction finally runs.

Why it's probably harmless today: threshold = 60% of min(bossWindow, compactorWindow) ≈ 600K tokens (1M-context models), leaving 40% headroom. Guard 2 only skips while prior < 20_000 chars (≈ 5K tokens at 4 chars/token), so for the trip-wire to fire in that state the tail would have to be ~595K tokens, at which point Guard 2's transcript-only estimate is also over threshold and it does not skip. So the defer-loop can't realistically occur at current magnitudes; worst case is compaction deferred by a turn or two.
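The arithmetic behind that estimate can be written out as a small sketch. The window sizes and the 60% figure come from the paragraph above; the variable names are hypothetical, and system+tools overhead is ignored here, as it is in the paragraph's own back-of-the-envelope figure.

```typescript
// Back-of-the-envelope numbers from the issue text; names are illustrative.
const bossWindow = 1_000_000; // tokens, 1M-context model
const compactorWindow = 1_000_000; // tokens, assumed equal here
const thresholdPercent = 60;

// Integer math avoids floating-point noise from multiplying by 0.6.
const thresholdTokens =
  (Math.min(bossWindow, compactorWindow) * thresholdPercent) / 100;

const COMPACTION_MIN_PRIOR_CHARS = 20_000;
const CHARS_PER_TOKEN = 4;

// Guard 2 can only skip while prior stays under this many tokens.
const priorCapTokens = COMPACTION_MIN_PRIOR_CHARS / CHARS_PER_TOKEN;

// Tail size needed for the trip-wire to fire while prior is still under the cap.
const tailNeededTokens = thresholdTokens - priorCapTokens;

console.log({ thresholdTokens, priorCapTokens, tailNeededTokens });
```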

Why file it anyway: estimateTranscriptTokens is explicitly a "conservative v1" placeholder (see its doc comment) pending a tokenizer-backed helper, and the threshold %/model windows are config. A smaller-window compactor model, a higher threshold %, or a larger tool-definition surface could each push this into a real over-threshold defer-loop with no headroom. Fix: make Guard 2 use the same system+tools-inclusive estimate as the trip-wire (or fold both into one shared estimator).
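One possible shape for the shared-estimator fix is sketched below. Everything here is an assumption for illustration: `estimateInputTokens`, `shouldTripWire`, `shouldSkipCompaction`, and the `overheadTokens` field are hypothetical names, and how the overhead would actually be obtained (for example, derived from `lastInputTokens` on the prior turn) is left open.

```typescript
// Hypothetical shared estimator; not the project's actual API.
type Message = { role: string; content: string };

interface EstimateInput {
  transcript: Message[];
  // Assumed to be known or derived: system prompt + tool definitions, in tokens.
  overheadTokens: number;
}

const CHARS_PER_TOKEN = 4;

// Single source of truth used by both gates.
function estimateInputTokens(input: EstimateInput): number {
  const transcriptChars = JSON.stringify(input.transcript).length;
  return input.overheadTokens + Math.ceil(transcriptChars / CHARS_PER_TOKEN);
}

// Trip-wire: fire when the shared estimate exceeds the threshold.
function shouldTripWire(input: EstimateInput, threshold: number): boolean {
  return estimateInputTokens(input) > threshold;
}

// Guard 2: skip only when prior is small AND the same estimate fits.
function shouldSkipCompaction(
  input: EstimateInput,
  priorChars: number,
  threshold: number,
  minPriorChars = 20_000,
): boolean {
  return priorChars < minPriorChars && estimateInputTokens(input) <= threshold;
}

// Illustrative check with assumed numbers.
const sample: EstimateInput = {
  transcript: [{ role: "user", content: "x".repeat(10_000) }],
  overheadTokens: 3_000,
};
const sampleThreshold = 5_000;
const fires = shouldTripWire(sample, sampleThreshold);
const skips = shouldSkipCompaction(sample, 8_000, sampleThreshold);
console.log({ fires, skips });
```

Because both gates compare the same number against the same threshold, they cannot both say "over" and "fits" for the same input, which is the property the current pair of estimators lacks.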

Found while stress-testing the compaction path (2026-07-01); happy path + injection resistance validated live (see #138).

Metadata

Assignees

No one assigned

    Labels

    bug: Something isn't working

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
