feat(verify): structural verification gate with edit->test->fix loop … by wusijian007 · Pull Request #19 · wusijian007/mini-claude-code

wusijian007 · 2026-06-16T13:45:32Z

…(M3.3 / §4)

Third v3 milestone (§4 Self-Correction). Upgrades verification from model-opt-in (the verifier sub-agent, which the model may or may not spawn) to a structural loop gate. Includes the written §4 design section in docs/v3-kernel-roadmap.md (design-before-code per the roadmap's blast-radius tiering).

M3.3a -- verification gate (query.ts):
QueryOptions.verify = { command, args, when?, maxBounces? }. On the
completion path (the model emits no tool_uses -- "I'm done"), the loop
no longer returns immediately. It runs the verify command via
ToolContext.executor (the M2.1 seam, NOT the whitelisted Bash tool, so
arbitrary commands such as `npm test` or `tsc --noEmit` are allowed).
Exit 0 -> the loop completes. Non-zero -> the failure is injected as a
reflective user turn (reflectiveVerifyFailure: command + exit code +
truncated output + "locate and fix; I'll re-run") and the loop continues
the edit->test->fix cycle. The cycle is bounded by maxBounces (default 2);
exceeding it (or running out of turns) ends with a `verification_failed`
terminal state rather than a silent `completed`. Each check yields a
`verification` LoopEvent; bounces also emit a `query.verify_bounce`
profile mark.
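
The gate described above can be sketched roughly as follows. This is a minimal illustration, not the code in query.ts: the `ExecResult` shape, the synchronous `CommandExecutor` signature, the `FixTurn` callback standing in for a model turn, the `runVerifyGate` name, and the 2000-character truncation limit are all assumptions made for the example. It also assumes that "exceeding maxBounces" means a failed check after `maxBounces` bounces have already been spent.

```typescript
// Hypothetical shapes: the PR names the fields but does not show their types.
interface VerifyOptions {
  command: string;
  args: string[];
  maxBounces?: number; // the PR states the default is 2
}

interface ExecResult {
  exitCode: number;
  output: string;
}

// Stand-in for ToolContext.executor; modeled as synchronous for brevity (assumption).
type CommandExecutor = (command: string, args: string[]) => ExecResult;

// Stand-in for the model reacting to the injected reflective user turn.
type FixTurn = (reflectiveMessage: string) => void;

type TerminalStatus = "completed" | "verification_failed";

// Builds the reflective message: command + exit code + truncated output + instruction.
function reflectiveVerifyFailure(v: VerifyOptions, r: ExecResult, maxChars = 2000): string {
  const cmd = [v.command, ...v.args].join(" ");
  const out = r.output.length > maxChars ? r.output.slice(0, maxChars) + "\n[truncated]" : r.output;
  return `Verification failed: ${cmd} exited with ${r.exitCode}.\n${out}\nLocate and fix; I'll re-run.`;
}

function runVerifyGate(
  v: VerifyOptions,
  exec: CommandExecutor,
  fix: FixTurn,
): { status: TerminalStatus; checks: number; bounces: number } {
  const maxBounces = v.maxBounces ?? 2;
  let checks = 0;
  let bounces = 0;
  while (true) {
    const result = exec(v.command, v.args);
    checks++;
    // Exit 0: the loop is allowed to complete.
    if (result.exitCode === 0) return { status: "completed", checks, bounces };
    // Bounce budget spent: surface the failure instead of a silent "completed".
    if (bounces >= maxBounces) return { status: "verification_failed", checks, bounces };
    // Otherwise inject the failure and go around the edit->test->fix cycle again.
    bounces++;
    fix(reflectiveVerifyFailure(v, result));
  }
}
```

Under these assumptions, a run whose checks exit with 1 and then 0 ends as `completed` after two checks and one bounce, while a command that never passes ends as `verification_failed` after three checks with the default budget.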

M3.3b -- CLI + eval:
`myagent agent --verify "<command>"` parses into QueryOptions.verify
(whitespace split). The CLI prints a `[verify]` line per check. A 7th
eval task, "self-correction", drives edit -> verify-fail -> fix ->
verify-pass through an injected scripted mock executor (exit codes
[1, 0]), so it is fully deterministic and runs offline. The eval gate
fingerprint is updated (tasks 7, turns 15, in 10800, out 625).

Determinism (invariant #2): the gate runs through ToolContext.executor, so the eval and the 4 new query-loop tests inject a mock CommandExecutor instead of spawning real processes. This PR also adds a new TerminalState status, "verification_failed", and a VerificationEvent in the LoopEvent union.
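
A scripted mock of the kind described here might look like the sketch below. The `ScriptedExecutor` class, its synchronous `run` method, and the `ExecResult` shape are assumptions for illustration; the actual CommandExecutor interface is not shown in this PR.

```typescript
// Hypothetical result shape; the real CommandExecutor contract is not shown in the PR.
interface ExecResult {
  exitCode: number;
  output: string;
}

// Replays a fixed list of exit codes instead of spawning processes,
// so a test sees the same sequence on every run.
class ScriptedExecutor {
  readonly calls: string[] = [];
  private readonly exitCodes: number[];
  private index = 0;

  constructor(exitCodes: number[]) {
    this.exitCodes = [...exitCodes]; // copy so callers cannot mutate the script
  }

  run(command: string, args: string[]): ExecResult {
    const exitCode = this.exitCodes[this.index];
    if (exitCode === undefined) {
      // Fail loudly if the loop checks more often than the script expects.
      throw new Error("ScriptedExecutor: script exhausted");
    }
    this.index++;
    this.calls.push([command, ...args].join(" "));
    return { exitCode, output: exitCode === 0 ? "ok" : "scripted failure" };
  }
}
```

With the script `[1, 0]`, the first check fails and the second passes, which matches the verify-fail -> fix -> verify-pass sequence of the "self-correction" eval task. Throwing on an exhausted script is a design choice of this sketch: it turns an unexpected extra check into a test failure rather than a silent pass.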

The finalize critic pass stays deferred to a §4 follow-up.

Local: 201 tests, 3/3 green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

wusijian007 merged commit 5df909d into main Jun 16, 2026
3 checks passed

wusijian007 deleted the feat/m3.3-self-correction branch June 16, 2026 13:47

wusijian007 merged 1 commit into main from feat/m3.3-self-correction
