Compaction trip-wire and Guard 2 skip-check use divergent token estimators

**Severity: low (latent; likely non-realizable at today's window/threshold, but a real divergence in a v1-placeholder area).**

Two token estimates gate compaction in `user-authored-brief.ts` and they don't agree:

1. **Trip-wire** (`dispatchToolsStep`, ~line 325): `estimated = state.lastInputTokens + Math.ceil(tailChars / 4)`. `lastInputTokens` is the *actually billed* input for the prior boss turn — it **includes** the system prompt + full tool definitions.
2. **Guard 2 skip-check** (`compactTranscriptStep`, ~line 382): skips the compactor when `priorChars < COMPACTION_MIN_PRIOR_CHARS (20_000)` **and** `estimateTranscriptTokens(transcript) <= pressureThreshold`. `estimateTranscriptTokens` (compaction/tokens.ts) is `JSON.stringify(messages).length / 4` — transcript **only**, **excluding** system prompt + tool definitions.
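A minimal sketch of the two checks side by side, to make the mismatch concrete. The function shapes, the `Message` type, and all numbers here are illustrative assumptions, not the real code in `user-authored-brief.ts`; only the two formulas and the 20K-char constant come from the description above.

```typescript
// Hypothetical, simplified stand-ins for the two checks described above.
type Message = { role: string; content: string };

const CHARS_PER_TOKEN = 4;
const COMPACTION_MIN_PRIOR_CHARS = 20_000;

// Trip-wire style: billed input of the prior turn (system prompt + tool
// definitions + transcript) plus a chars/4 estimate of the new tail.
function tripWireEstimate(lastInputTokens: number, tailChars: number): number {
  return lastInputTokens + Math.ceil(tailChars / CHARS_PER_TOKEN);
}

// Guard 2 style: transcript only, no system prompt or tool definitions.
function estimateTranscriptTokens(messages: Message[]): number {
  return JSON.stringify(messages).length / CHARS_PER_TOKEN;
}

function guard2Skips(
  priorChars: number,
  transcript: Message[],
  pressureThreshold: number,
): boolean {
  return (
    priorChars < COMPACTION_MIN_PRIOR_CHARS &&
    estimateTranscriptTokens(transcript) <= pressureThreshold
  );
}

// Assumed scenario with a deliberately small threshold so the overhead matters.
const pressureThreshold = 6_000; // tokens (assumption)
const lastInputTokens = 5_800; // billed: mostly system + tools overhead (assumption)
const transcript: Message[] = [{ role: "user", content: "x".repeat(4_000) }];

const tripped = tripWireEstimate(lastInputTokens, 2_000) > pressureThreshold;
const skipped = guard2Skips(1_000, transcript, pressureThreshold);

console.log({ tripped, skipped });
```

In this assumed scenario both flags are true: the trip-wire fires on the overhead-inclusive number, while the skip-check sees only about 1K transcript tokens and declines to compact.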

So the trip-wire can fire (input over the 60% threshold) *because of* system+tools overhead, and then Guard 2, blind to that overhead, decides the transcript alone fits and skips compaction, routing straight back to `boss-turn`. The next turn re-trips and re-skips until either `prior` grows past 20K chars or the transcript-only estimate itself crosses the threshold, at which point real compaction finally runs.

**Why it's probably harmless today:** the threshold is 60% of `min(bossWindow, compactorWindow)`, or ≈ 600K tokens on 1M-context models, which leaves 40% headroom. Guard 2 can only skip while `prior` is under 20K chars (≈ 5K tokens at 4 chars/token), so for the trip-wire to fire in that state the tail would have to be ~595K tokens. At that size Guard 2's transcript-only estimate is *also* over threshold (or crosses it within a turn or two), and it does **not** skip. A sustained defer-loop therefore can't realize at current magnitudes; the worst case is compaction deferred by a turn or two.
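The arithmetic above can be checked with a short sketch. It idealizes both estimators as agreeing on transcript size and differing only by a fixed system+tools overhead; the `overheadTokens` value, the strict `>` comparison for the trip-wire, and the `disagree` helper are assumptions for illustration only.

```typescript
// Window sizes and threshold percentage as described in this issue.
const bossWindow = 1_000_000;
const compactorWindow = 1_000_000;
const thresholdPct = 60;

const pressureThreshold = Math.floor(
  (Math.min(bossWindow, compactorWindow) * thresholdPct) / 100,
); // 600_000 tokens

// Guard 2 can only skip while prior < 20_000 chars, i.e. < 5_000 tokens at 4 chars/token.
const maxSkippablePriorTokens = 20_000 / 4;

// Tail needed for the trip-wire to fire while prior is still that small
// (ignoring overhead, as the paragraph above does).
const minTailTokensToTrip = pressureThreshold - maxSkippablePriorTokens; // 595_000

// Hypothetical system prompt + tool definition overhead, in tokens.
const overheadTokens = 10_000;

// True when the trip-wire fires but Guard 2's transcript-only check would
// still skip (assuming prior stays under the 20K-char minimum).
function disagree(transcriptTokens: number): boolean {
  const tripWireFires = transcriptTokens + overheadTokens > pressureThreshold;
  const guard2WouldSkip = transcriptTokens <= pressureThreshold;
  return tripWireFires && guard2WouldSkip;
}

console.log(pressureThreshold, minTailTokensToTrip);
console.log(disagree(589_000), disagree(595_000), disagree(601_000));
```

Under these assumptions the two checks disagree only in a band as wide as the overhead itself, just below the threshold. That band is narrow relative to 600K tokens, which is consistent with a deferral of a turn or two instead of a sustained loop.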

**Why file it anyway:** `estimateTranscriptTokens` is explicitly a "conservative v1" placeholder (see its doc comment) pending a tokenizer-backed helper, and both the threshold percentage and the model windows are configurable. A smaller-window compactor model, a higher threshold percentage, or a larger tool-definition surface could each push this into a real over-threshold defer-loop with no headroom. Fix: make Guard 2 use the same system+tools-inclusive estimate as the trip-wire (or fold both into one shared estimator).
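One possible shape for the shared estimator, as a sketch only. `estimateRequestTokens`, `shouldSkipCompaction`, and the `RequestParts` interface are hypothetical names that do not exist in the codebase, and the chars/4 heuristic is kept purely as a placeholder for the tokenizer-backed helper.

```typescript
// Hypothetical shared estimator: one number that both the trip-wire and
// Guard 2 could consult, so they cannot disagree about overhead.
interface RequestParts {
  systemPrompt: string;
  toolDefinitions: unknown[];
  transcript: unknown[];
}

const CHARS_PER_TOKEN = 4; // same heuristic as the v1 placeholder

function estimateRequestTokens(parts: RequestParts): number {
  const chars =
    parts.systemPrompt.length +
    JSON.stringify(parts.toolDefinitions).length +
    JSON.stringify(parts.transcript).length;
  return Math.ceil(chars / CHARS_PER_TOKEN);
}

// Guard 2 rewritten against the overhead-inclusive estimate.
function shouldSkipCompaction(
  priorChars: number,
  estimatedRequestTokens: number,
  pressureThreshold: number,
  minPriorChars: number = 20_000,
): boolean {
  return priorChars < minPriorChars && estimatedRequestTokens <= pressureThreshold;
}

// Example: overhead alone pushes the request over a small assumed threshold,
// so the skip-check no longer skips.
const estimate = estimateRequestTokens({
  systemPrompt: "s".repeat(8_000),
  toolDefinitions: [],
  transcript: [],
});
console.log(estimate, shouldSkipCompaction(1_000, estimate, 2_000));
```

The trip-wire could keep using `lastInputTokens` when a billed figure is available and fall back to this estimate otherwise; the point of the sketch is only that the skip-check sees the same overhead the trip-wire does.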

Found while stress-testing the compaction path (2026-07-01); happy path + injection resistance validated live (see #138).

Compaction trip-wire and Guard 2 skip-check use divergent token estimators #369
