feat(critic): Definition-of-Done gate chain + Finalize Critic (M3.5 /… by wusijian007 · Pull Request #21 · wusijian007/mini-claude-code

wusijian007 · 2026-06-17T01:04:44Z

… §4)

Generalize the single structural-verify gate (M3.3) into a DoneGate chain run at the completion path: when the model stops calling tools, each gate must pass; the first failure injects a reflective user turn and the loop continues, bounded by ONE shared bounce budget. With a single gate this is byte-for-byte the old verify behavior (zero behavior change; M3.3 tests stay green).
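A minimal sketch of the chain-plus-budget idea described above, not the code from this PR. The `DoneGate` shape, `runGateChain`, `completeWithGates`, and `maxBounces` are hypothetical names for illustration, and gates are synchronous here only to keep the example short. What happens once the budget is exhausted is also an assumption: the sketch returns the last answer flagged as not passed.

```typescript
// Hypothetical gate shape: return null to pass, or reflection text on failure.
type DoneGate = {
  name: string;
  check: (finalAnswer: string) => string | null;
};

type GateFailure = { gate: string; reflection: string };

// Run gates in order and stop at the first failure (fail-fast).
function runGateChain(gates: DoneGate[], finalAnswer: string): GateFailure | null {
  for (const gate of gates) {
    const reflection = gate.check(finalAnswer);
    if (reflection !== null) return { gate: gate.name, reflection };
  }
  return null;
}

type Outcome = { answer: string; bounces: number; passed: boolean };

// Toy completion loop: produceAnswer stands in for the model turn that ends
// without tool calls. All gates draw on one shared bounce budget.
function completeWithGates(
  produceAnswer: (reflections: string[]) => string,
  gates: DoneGate[],
  maxBounces: number,
): Outcome {
  const reflections: string[] = [];
  let bounces = 0;
  for (;;) {
    const answer = produceAnswer(reflections);
    const failure = runGateChain(gates, answer);
    if (failure === null) return { answer, bounces, passed: true };
    // Assumption: on budget exhaustion, give up and report the last answer.
    if (bounces >= maxBounces) return { answer, bounces, passed: false };
    bounces += 1;
    // The reflection plays the role of the injected user turn.
    reflections.push(`[${failure.gate}] ${failure.reflection}`);
  }
}
```

With a one-element `gates` array the loop reduces to a single-gate check, which is the shape the description says must match the old verify behavior.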

Add a second gate, the Finalize Critic: a read-only model call (no tools given, so read-only by construction; core cannot import the Agent tool, so a tool-less critic call is the layer-safe realization of the read-only verifier idea) that judges whether the final answer satisfies the root task. APPROVE completes; REJECT injects a reflective revise turn. It runs in its own child context (a single synthesized message), so it does not pollute the parent prefix until the reflection is appended (invariant #3). Gate order is verify-then-critic (fail-fast: no model call if it does not even compile).
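The critic call can be pictured roughly as follows. This is a sketch under stated assumptions: the PR names `runCritic` and `parseCriticVerdict` but does not show their signatures, so the signatures, the prompt wording, the `ToolLessModel` type, the `buildCriticMessage` helper, and the `UNKNOWN` fallback for unparseable replies are all hypothetical.

```typescript
type CriticVerdict = {
  verdict: "APPROVE" | "REJECT" | "UNKNOWN";
  reason: string;
};

// Assumed model shape: messages in, text out, and no tools parameter at all,
// which is what makes the call read-only by construction.
type ToolLessModel = (messages: string[]) => string;

// Build the single synthesized message that forms the critic's child context.
function buildCriticMessage(rootTask: string, finalAnswer: string, instructions = ""): string {
  const parts = [
    "Reply with APPROVE or REJECT on the first line, then a short reason.",
    `Task:\n${rootTask}`,
    `Final answer:\n${finalAnswer}`,
  ];
  if (instructions.trim() !== "") parts.push(`Extra instructions:\n${instructions}`);
  return parts.join("\n\n");
}

// Assumed format: verdict keyword on the first line, reason on the rest.
function parseCriticVerdict(reply: string): CriticVerdict {
  const [first = "", ...rest] = reply.trim().split("\n");
  const keyword = first.trim().toUpperCase();
  const reason = rest.join("\n").trim();
  if (keyword.startsWith("APPROVE")) return { verdict: "APPROVE", reason };
  if (keyword.startsWith("REJECT")) return { verdict: "REJECT", reason };
  // Policy for unparseable replies is left to the caller in this sketch.
  return { verdict: "UNKNOWN", reason: reply.trim() };
}

// The parent conversation is never passed in, so nothing reaches the parent
// prefix unless the caller later appends a reflection built from the verdict.
function runCritic(
  model: ToolLessModel,
  rootTask: string,
  finalAnswer: string,
  instructions = "",
): CriticVerdict {
  const childContext = [buildCriticMessage(rootTask, finalAnswer, instructions)];
  return parseCriticVerdict(model(childContext));
}
```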

- core: QueryOptions.critic (CriticConfig), CriticEvent, gate-chain runner, createVerifyGate/createCriticGate, runCritic/parseCriticVerdict.
- cli: myagent agent --critic [--critic-instructions "<text>"], [critic] event printing, help/usage text.
- eval: new finalize-critic task (reject -> revise -> approve) on a separate scripted critic FakeModel; gate fingerprint updated to tasks=9 turns=19 in=13150 out=765.
- docs: CLAUDE.md query-loop step 8 rewritten as the gate chain; roadmap M3.5 section + decisions anchored.
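The reject -> revise -> approve eval flow on a separate scripted critic can be illustrated with a toy scripted model. This is only a sketch: `ScriptedModel` and `runScenario` are hypothetical stand-ins, not the repository's FakeModel or eval harness, and the trace format is invented for the example.

```typescript
// Replays a fixed list of replies, one per call, and fails loudly when the
// script runs out so a test cannot silently make extra model calls.
class ScriptedModel {
  private replies: string[];
  private cursor = 0;

  constructor(replies: string[]) {
    this.replies = replies;
  }

  next(): string {
    const reply = this.replies[this.cursor];
    if (reply === undefined) throw new Error("script exhausted");
    this.cursor += 1;
    return reply;
  }
}

// The agent and the critic each get their own script, so the critic's
// replies never consume entries from the agent's script.
function runScenario(agent: ScriptedModel, critic: ScriptedModel, maxBounces: number): string[] {
  const trace: string[] = [];
  for (let bounce = 0; bounce <= maxBounces; bounce++) {
    trace.push(`answer: ${agent.next()}`);
    const verdict = critic.next();
    trace.push(`critic: ${verdict}`);
    if (verdict === "APPROVE") break;
  }
  return trace;
}

const trace = runScenario(
  new ScriptedModel(["v1", "v2"]),
  new ScriptedModel(["REJECT", "APPROVE"]),
  2,
);
console.log(trace.join("\n"));
```

Keeping the two scripts separate makes the expected number of agent turns independent of how many times the critic is consulted.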

210 tests pass (+6 critic query tests).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

wusijian007 merged commit f1080fa into main Jun 17, 2026
3 checks passed

wusijian007 deleted the feat/m3.5-finalize-critic branch June 17, 2026 01:06

wusijian007 merged 1 commit into main from feat/m3.5-finalize-critic
