From 878124b5e175ab6e0cfcd7125a4cff4c896a221f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Fran=C3=A7ois-Julien=20Alcaraz?= Date: Fri, 3 Jul 2026 16:02:14 +0100 Subject: [PATCH 1/4] Add review-workflow skill (SKILL.md, bundled review-loop workflow, tile.json) --- review-workflow/SKILL.md | 115 +++++++++ review-workflow/references/review-loop.mjs | 256 +++++++++++++++++++++ review-workflow/tile.json | 11 + 3 files changed, 382 insertions(+) create mode 100644 review-workflow/SKILL.md create mode 100644 review-workflow/references/review-loop.mjs create mode 100644 review-workflow/tile.json diff --git a/review-workflow/SKILL.md b/review-workflow/SKILL.md new file mode 100644 index 0000000..0c679f7 --- /dev/null +++ b/review-workflow/SKILL.md @@ -0,0 +1,115 @@ +--- +name: review-workflow +description: > + Run an iterative code-review loop: a review workflow scores the diff by severity, + Claude fixes every critical/major/minor issue (nits opportunistically), commits, then + re-reviews — looping until the review comes back clean. Project-agnostic; any git repo. + Use when the user asks to review-and-fix in a loop, do a feedback pass, or "keep + reviewing until clean". +invocations: + - /review-workflow +tags: + - code-review + - workflow + - git + - quality +version: 1.0.0 +--- + +# Review Workflow Skill + +Run an iterative **review → fix → commit → re-review** loop until a code review comes back clean. + +## How it works (two actors) + +A Claude Code **workflow** orchestrates subagents but **cannot edit the working tree or commit** — its +subagents run in isolated contexts. So the loop is split: + +| Step | Who | +|------|-----| +| Review the diff → verified, severity-graded findings | **the bundled workflow** (`references/review-loop.mjs`) | +| Apply the fixes | **you (the main agent)** | +| Commit round N | **you (the main agent)** | +| Re-review (round N+1) | **the bundled workflow** | + +You drive the outer loop; the workflow is the per-round review engine you call. The workflow is invoked +directly from this skill's bundled path — **nothing is written into the user's repo.** + +## Instructions + +When invoked with `/review-workflow $ARGUMENTS`: + +### 1. Prerequisites + +- Confirm this is a git repo (`git rev-parse --git-dir`). If not, tell the user and STOP. +- Determine the current branch. If on `main`/`master`, create a working branch first (do not commit review + fixes onto the default branch). +- Note this skill's directory — you need the absolute path to `references/review-loop.mjs` for the + `Workflow` call below. It sits next to this `SKILL.md`. + +### 2. Ask scope (once) + +Use **one** `AskUserQuestion` to pick what to review, then run the rest autonomously: + +| Choice | `scopeMode` | What it reviews | +|--------|-------------|-----------------| +| Branch diff vs base *(default)* | `branch` | The whole branch: `merge-base(HEAD, main/master)..HEAD`. Best for a feature-branch feedback pass. | +| Working-tree changes only | `working` | Staged + unstaged changes vs `HEAD`. | +| Specific paths | `paths` | Only the paths the user names (ask for them). | + +If the user already stated the scope in `$ARGUMENTS`, skip the question and use it. + +### 3. The loop (round N = 1, 2, …, up to 8) + +For each round N: + +1. **Review** — call the bundled workflow: + ``` + Workflow({ + scriptPath: "/references/review-loop.mjs", + args: { scopeMode, round: N, paths, focus } // include base only if the user pinned one + }) + ``` + It returns `{ round, base, scopeMode, files, confirmed, counts }` where `confirmed` is the + adversarially-verified findings (each `{ severity, file, location, title, detail, suggestedFix }`) + and `counts` is `{ critical, major, minor, nit }`. + +2. **Stop check** — if `counts.critical + counts.major + counts.minor === 0`, the review is clean → exit + the loop (still fix any nits opportunistically if trivial, then finish). + +3. **No-progress guard** — build the identity set of the current unresolved (critical/major/minor) + findings as `` `${file}::${title.trim().toLowerCase()}` ``. If it **equals** the previous round's set + (the same issues keep coming back), stop and report — you're not making progress. Healthy churn (a fix + surfacing *new* issues) changes the set and does not trip this. + +4. **Iteration guard** — if N === 8, stop and report the remaining findings. + +5. **Fix** — apply fixes for every `confirmed` critical/major/minor finding, using its `suggestedFix` as a + starting point (verify it's correct against the actual code — don't apply blindly). Also fix `nit` + findings when the change is low-risk and quick. + +6. **Commit** — stage and commit the round's fixes: + ```bash + git add -A && git commit -m "Address code review feedback (round N)" + ``` + **No Claude attribution** in the message or trailers. If the round produced no file changes, skip the + commit (and treat as no progress for the guard). + +7. Increment N and repeat from step 1. + +### 4. Completion report + +When the loop ends, summarize: +- Rounds run and why it stopped (**clean** / no-progress / hit 8 rounds). +- Issues fixed per severity across all rounds; commits made (one per round). +- If stopped by a guard: list the remaining findings (file, severity, title) so the user can decide. + +## Notes + +- **Project-agnostic** — no repo-specific rules baked in; the workflow reads the nearest `CLAUDE.md` and + treats its conventions as review criteria. +- **Reviewer agent** — the workflow prefers the `code-reviewer` agent (shipped by Claude Code's official + review plugins) and **falls back** to the default workflow subagent when it isn't available, so the loop + runs anywhere. +- **No attribution** — never add `Co-Authored-By: Claude` or "Generated with Claude Code" to commits. +- **Nothing is copied into the repo** — the workflow runs from this skill's bundled `references/` path. diff --git a/review-workflow/references/review-loop.mjs b/review-workflow/references/review-loop.mjs new file mode 100644 index 0000000..8314783 --- /dev/null +++ b/review-workflow/references/review-loop.mjs @@ -0,0 +1,256 @@ +export const meta = { + name: 'review-loop', + description: 'Adversarial multi-dimension review of a diff; verify findings, return them by severity. One round — invoke again after fixes (bump args.round) to continue the loop.', + whenToUse: + 'Reviewing a branch/PR diff or working-tree changes for bugs and quality. Fans out reviewers by dimension, adversarially verifies each finding to drop false positives, and returns findings by severity plus counts. Runs one round; invoke again after applying fixes (bump args.round) to continue. Project-agnostic — no repo-specific rules.', + phases: [ + { title: 'Scope', detail: 'derive base + changed files from git' }, + { title: 'Review', detail: 'parallel reviewers, one per dimension' }, + { title: 'Verify', detail: 'adversarially confirm each finding is real' }, + ], +} + +// --------------------------------------------------------------------------- +// Inputs (all optional — sensible git-derived defaults). Pass via Workflow args: +// { +// scopeMode, // 'branch' (default) | 'working' | 'paths' +// base, // explicit base ref (branch mode) +// files, // explicit file list (skips discovery) +// paths, // restrict discovery to these paths (paths mode) +// round, // review round number, for logging/labels +// focus, // free-text focus to weight the review +// excludeGlobs, // substring excludes overriding DEFAULT_EXCLUDES +// dimensions, // [{ key, prompt }] overriding DEFAULT_DIMENSIONS +// reviewerAgentType, // override the review-phase agent type +// } +// --------------------------------------------------------------------------- +const round = typeof args?.round === 'number' ? args.round : 1 +const scopeMode = args?.scopeMode ?? 'branch' + +// Generated / vendored paths that shouldn't be hand-reviewed. Override via +// args.excludeGlobs (substring match on the path). +const DEFAULT_EXCLUDES = [ + 'pnpm-lock.yaml', + 'package-lock.json', + 'yarn.lock', + 'Cargo.lock', + 'go.sum', + 'poetry.lock', + '.snap', + 'dist/', + 'build/', + '/generated/', + '.generated.', +] +const excludes = Array.isArray(args?.excludeGlobs) ? args.excludeGlobs : DEFAULT_EXCLUDES + +phase('Scope') + +// The review phase prefers the `code-reviewer` agent (shipped by Claude Code's +// official review plugins). It may be absent in some environments; probe once +// and fall back to the default workflow subagent so the loop runs anywhere. +const reviewerAgentType = + args?.reviewerAgentType ?? + (await agent( + `A workflow wants to spawn a subagent of type "code-reviewer". Is that agent type available in THIS environment? +Check for an agent named "code-reviewer": look under ~/.claude/agents, ~/.claude/plugins (enabled plugins' agents/ dirs), and any project .claude/agents directory. +Return "code-reviewer" if such an agent is available, otherwise return "default". Return ONLY that one word.`, + { label: 'scope:reviewer-agent', phase: 'Scope', agentType: 'Explore' }, + ).then((s) => { + const v = (s ?? '').trim().toLowerCase() + return v.includes('code-reviewer') ? 'code-reviewer' : 'default' + })) + +// Build the review target as a shell diff spec + the file-discovery command, +// depending on scope mode. `branch` diffs base..HEAD; `working` diffs the +// working tree (staged + unstaged) against HEAD; `paths` restricts to args.paths. +const paths = Array.isArray(args?.paths) ? args.paths : [] +const pathArgs = paths.length ? ' -- ' + paths.map((p) => `'${p}'`).join(' ') : '' + +// Resolve the base ref (branch mode only): explicit arg > merge-base with the +// first existing default branch > HEAD~1 fallback. +let base = args?.base ?? null +if (scopeMode === 'branch' && !base) { + base = await agent( + `Determine the git base ref to diff this branch against, for a code review of "the work on this branch". +Run these and reason about the output: + git symbolic-ref refs/remotes/origin/HEAD 2>/dev/null # often points at the default branch + git merge-base HEAD origin/main 2>/dev/null || git merge-base HEAD main 2>/dev/null || git merge-base HEAD origin/master 2>/dev/null || git merge-base HEAD master 2>/dev/null + git log --oneline -1 +Prefer the merge-base of HEAD with the repo's default branch (main/master, remote or local). If none resolve, use HEAD~1. +Return ONLY the resolved base commit SHA or ref — no prose.`, + { label: 'scope:base', phase: 'Scope', agentType: 'Explore' }, + ).then((s) => (s ?? '').trim().split(/\s+/).pop()) +} + +// The diff spec used in every reviewer/verifier prompt. +const diffSpec = + scopeMode === 'working' + ? `git diff HEAD${pathArgs} # working-tree changes (staged + unstaged)` + : scopeMode === 'paths' + ? `git diff HEAD${pathArgs} # changes restricted to the requested paths` + : `git diff ${base}..HEAD${pathArgs}` + +const nameOnlyCmd = + scopeMode === 'working' || scopeMode === 'paths' + ? `git diff --name-only HEAD${pathArgs}` + : `git diff --name-only ${base}..HEAD${pathArgs}` + +// Resolve the changed-file list: explicit arg > name-only diff minus excludes. +const files = + Array.isArray(args?.files) && args.files.length > 0 + ? args.files + : await agent( + `List the source files changed for code review. +Run: ${nameOnlyCmd} +Then DROP any path containing any of these substrings (generated/vendored, not hand-reviewed): +${excludes.map((e) => ' - ' + e).join('\n')} +Return the remaining repo-relative paths. If none remain, return an empty list.`, + { + label: 'scope:files', + phase: 'Scope', + agentType: 'Explore', + // StructuredOutput requires a top-level object schema, so wrap the list. + schema: { + type: 'object', + additionalProperties: false, + required: ['files'], + properties: { files: { type: 'array', items: { type: 'string' } } }, + }, + }, + ).then((r) => r?.files ?? []) + +if (!files || files.length === 0) { + log('No reviewable (non-generated) files changed — nothing to review.') + return { round, base, scopeMode, files: [], confirmed: [], counts: { critical: 0, major: 0, minor: 0, nit: 0 } } +} + +log(`Round ${round}: reviewing ${files.length} file(s) [${scopeMode}] with ${reviewerAgentType} agent`) + +const FINDING_SCHEMA = { + type: 'object', + additionalProperties: false, + required: ['findings'], + properties: { + findings: { + type: 'array', + items: { + type: 'object', + additionalProperties: false, + required: ['severity', 'file', 'title', 'detail', 'suggestedFix'], + properties: { + severity: { type: 'string', enum: ['critical', 'major', 'minor', 'nit'] }, + file: { type: 'string' }, + location: { type: 'string', description: 'symbol or line hint' }, + title: { type: 'string' }, + detail: { type: 'string', description: 'why it is a problem' }, + suggestedFix: { type: 'string' }, + }, + }, + }, + }, +} + +const VERDICT_SCHEMA = { + type: 'object', + additionalProperties: false, + required: ['isReal', 'severity', 'reason'], + properties: { + isReal: { type: 'boolean' }, + severity: { type: 'string', enum: ['critical', 'major', 'minor', 'nit'] }, + reason: { type: 'string' }, + }, +} + +// Shared context every reviewer/verifier gets. Reviewers read the diff AND the +// full current file (rewritten files aren't captured by hunks alone), plus the +// nearest CLAUDE.md for project conventions — so no conventions need to be +// pasted in here. +const focusBlock = args?.focus ? `\nFOCUS FROM THE REQUESTER (weight these):\n${args.focus}\n` : '' +const CONTEXT = `Code review, round ${round}. + +SCOPE — restricted to these files: +${files.map((f) => ' - ' + f).join('\n')} + +To review, run and read: + git log --oneline -5 + ${diffSpec} +Then READ THE FULL CURRENT CONTENT of each file (a rewritten file's hunks hide context). +Ignore generated/vendored files. + +PROJECT CONVENTIONS: read the nearest CLAUDE.md (repo root and any closer to the +changed files) and treat its rules as review criteria — flag violations. If no +CLAUDE.md exists, apply general best practices for the language/framework in use. +${focusBlock} +SEVERITY: critical = data loss/crash/security/wrong results shipped; major = real +bug or convention violation with user impact; minor = correctness-neutral quality +(naming, dead code, weak tests, a11y polish); nit = trivial. Do NOT invent work — +only report what you can point to in the actual code.` + +// Default review lenses. Override with args.dimensions: [{ key, prompt }]. +const DEFAULT_DIMENSIONS = [ + { + key: 'correctness', + prompt: + 'You are a CORRECTNESS reviewer. Hunt for logic bugs, off-by-one and boundary errors, timezone/locale issues in date math, null/undefined handling, incorrect async/await, race conditions, and whether tests actually assert the intended invariants (vs. tautological or overly-permissive assertions). Return findings by severity.', + }, + { + key: 'conventions', + prompt: + 'You are a CONVENTIONS/quality reviewer. Enforce the CLAUDE.md rules; flag DRY violations, reimplementations of existing shared utilities, dead code, unsafe type casts, poor naming, and framework anti-patterns. Return findings by severity.', + }, + { + key: 'robustness', + prompt: + 'You are a ROBUSTNESS/perf/security reviewer. Consider unbounded queries/memory, N+1 and hot-loop costs, missing error/loading/empty states, input validation at boundaries, accessibility, and any injection/authz/secret-handling risks. Return findings by severity.', + }, +] +const dimensions = Array.isArray(args?.dimensions) && args.dimensions.length > 0 ? args.dimensions : DEFAULT_DIMENSIONS + +// The reviewer opts: use the resolved agent type, but only pass agentType when +// it's a real custom agent — 'default' means "let the workflow pick the default +// subagent" (omit the field entirely). +const reviewerOpts = (key) => { + const o = { label: `review:${key}`, phase: 'Review', schema: FINDING_SCHEMA } + if (reviewerAgentType !== 'default') o.agentType = reviewerAgentType + return o +} + +phase('Review') +const reviews = await parallel(dimensions.map((d) => () => agent(`${CONTEXT}\n\n${d.prompt}`, reviewerOpts(d.key)))) + +const allFindings = reviews.filter(Boolean).flatMap((r) => r.findings ?? []) +log(`Round ${round}: ${allFindings.length} raw findings from ${dimensions.length} reviewers`) + +if (allFindings.length === 0) { + return { round, base, scopeMode, files, confirmed: [], counts: { critical: 0, major: 0, minor: 0, nit: 0 } } +} + +phase('Verify') +const verified = await parallel( + allFindings.map((f) => () => + agent( + `${CONTEXT}\n\nAdversarially VERIFY this single finding. Read the actual code and decide whether it is a REAL issue worth fixing. Default to isReal=false if it is speculative, already handled elsewhere, a false positive, an intentional/documented tradeoff, or contradicts the project conventions. Re-grade severity honestly (a "critical" that's really cosmetic should come back minor/nit).\n\nFINDING:\nseverity=${f.severity}\nfile=${f.file}\nlocation=${f.location ?? ''}\ntitle=${f.title}\ndetail=${f.detail}\nsuggestedFix=${f.suggestedFix}`, + { + label: `verify:${f.severity}:${String(f.file).split('/').pop()}`, + phase: 'Verify', + schema: VERDICT_SCHEMA, + agentType: 'Explore', + }, + ).then((v) => (v && v.isReal ? { ...f, severity: v.severity, verifyReason: v.reason } : null)), + ), +) + +const confirmed = verified.filter(Boolean) +const order = { critical: 0, major: 1, minor: 2, nit: 3 } +confirmed.sort((a, b) => order[a.severity] - order[b.severity]) +log(`Round ${round}: ${confirmed.length} confirmed after adversarial verify`) + +const counts = { + critical: confirmed.filter((f) => f.severity === 'critical').length, + major: confirmed.filter((f) => f.severity === 'major').length, + minor: confirmed.filter((f) => f.severity === 'minor').length, + nit: confirmed.filter((f) => f.severity === 'nit').length, +} + +return { round, base, scopeMode, files, confirmed, counts } diff --git a/review-workflow/tile.json b/review-workflow/tile.json new file mode 100644 index 0000000..dd42ef9 --- /dev/null +++ b/review-workflow/tile.json @@ -0,0 +1,11 @@ +{ + "name": "hefgi/review-workflow", + "version": "1.0.0", + "private": false, + "summary": "Iterative code-review loop — a review workflow scores the diff by severity; Claude fixes critical/major/minor issues, commits each round, and re-reviews until clean. Project-agnostic, any git repo.", + "skills": { + "review-workflow": { + "path": "SKILL.md" + } + } +} From b8c3fd03d7ae17cc64a1379379872518b5bc25d5 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Fran=C3=A7ois-Julien=20Alcaraz?= Date: Fri, 3 Jul 2026 16:02:19 +0100 Subject: [PATCH 2/4] Register review-workflow in marketplace, README, and publish workflow --- .claude-plugin/marketplace.json | 9 +++++++++ .github/workflows/publish-tile.yml | 9 +++++++++ README.md | 1 + 3 files changed, 19 insertions(+) diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json index 42d8f92..2b3a71a 100644 --- a/.claude-plugin/marketplace.json +++ b/.claude-plugin/marketplace.json @@ -26,6 +26,15 @@ "skills": [ "./ponder" ] + }, + { + "name": "review-workflow", + "description": "Iterative code-review loop — review the diff, fix critical/major/minor issues, commit each round, re-review until clean. Project-agnostic.", + "source": "./", + "strict": false, + "skills": [ + "./review-workflow" + ] } ] } diff --git a/.github/workflows/publish-tile.yml b/.github/workflows/publish-tile.yml index d129edf..64fa485 100644 --- a/.github/workflows/publish-tile.yml +++ b/.github/workflows/publish-tile.yml @@ -27,3 +27,12 @@ jobs: with: token: ${{ secrets.TESSL_API_TOKEN }} path: './ponder' + + publish-review-workflow: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v6 + - uses: tesslio/publish@main + with: + token: ${{ secrets.TESSL_API_TOKEN }} + path: './review-workflow' diff --git a/README.md b/README.md index c64e415..16b414b 100644 --- a/README.md +++ b/README.md @@ -8,6 +8,7 @@ A collection of AI coding agent skills for various tools and services. |-------|-------------|------------|-------| | [shortcut](./shortcut/) | Interact with Shortcut stories and epics via the `short` CLI | `/shortcut` | [![tessl](https://img.shields.io/endpoint?url=https%3A%2F%2Fapi.tessl.io%2Fv1%2Fbadges%2Fhefgi%2Fshortcut)](https://tessl.io/registry/hefgi/shortcut) | | [ponder](./ponder/) | Build EVM blockchain data indexers using Ponder (ponder.sh) | `/ponder` | [![tessl](https://img.shields.io/endpoint?url=https%3A%2F%2Fapi.tessl.io%2Fv1%2Fbadges%2Fhefgi%2Fponder)](https://tessl.io/registry/hefgi/ponder) | +| [review-workflow](./review-workflow/) | Iterative code-review loop until the review comes back clean | `/review-workflow` | [![tessl](https://img.shields.io/endpoint?url=https%3A%2F%2Fapi.tessl.io%2Fv1%2Fbadges%2Fhefgi%2Freview-workflow)](https://tessl.io/registry/hefgi/review-workflow) | ## Installation From f31eed9ec71cb4bbc5ba796a5757cd99906fc79f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Fran=C3=A7ois-Julien=20Alcaraz?= Date: Fri, 3 Jul 2026 16:23:03 +0100 Subject: [PATCH 3/4] Add cheap-to-smart model escalation ladder to review-workflow Mechanical phases (scope, file-listing) run on Haiku; the review ladder converges on Sonnet then re-converges on Opus for the final sign-off, with each round's verify matching its review tier. The no-progress guard resets per tier so a smarter model surfacing new findings counts as progress. --- review-workflow/SKILL.md | 52 +++++++++++++++------- review-workflow/references/review-loop.mjs | 44 +++++++++++++----- 2 files changed, 71 insertions(+), 25 deletions(-) diff --git a/review-workflow/SKILL.md b/review-workflow/SKILL.md index 0c679f7..c123f60 100644 --- a/review-workflow/SKILL.md +++ b/review-workflow/SKILL.md @@ -31,6 +31,7 @@ subagents run in isolated contexts. So the loop is split: | Apply the fixes | **you (the main agent)** | | Commit round N | **you (the main agent)** | | Re-review (round N+1) | **the bundled workflow** | +| Escalate models cheap → smart | **you (the main agent)** — Sonnet loop, then Opus loop | You drive the outer loop; the workflow is the per-round review engine you call. The workflow is invoked directly from this skill's bundled path — **nothing is written into the user's repo.** @@ -59,30 +60,46 @@ Use **one** `AskUserQuestion` to pick what to review, then run the rest autonomo If the user already stated the scope in `$ARGUMENTS`, skip the question and use it. -### 3. The loop (round N = 1, 2, …, up to 8) +### 3. The escalation ladder (cheap → smart) -For each round N: +Run the review loop on progressively smarter models: converge on **Sonnet**, then re-converge on **Opus**. +The cheap mechanical work (scope, file-listing) always runs on **Haiku** inside the workflow. The idea: +Sonnet clears the obvious issues cheaply; Opus does the final, smartest sign-off. Opus clean = done. + +``` +tiers = ["sonnet", "opus"] +globalRound = 0 +for tier in tiers: + run the per-tier loop below with reviewModel = tier # reset the no-progress guard at each tier +``` + +### 4. The per-tier loop (round-by-round, shared cap of 8 rounds across all tiers) + +For each round in the current `tier`: 1. **Review** — call the bundled workflow: ``` Workflow({ scriptPath: "/references/review-loop.mjs", - args: { scopeMode, round: N, paths, focus } // include base only if the user pinned one + args: { scopeMode, round: ++globalRound, paths, focus, reviewModel: tier } + // include base only if the user pinned one; mechanicsModel defaults to haiku }) ``` It returns `{ round, base, scopeMode, files, confirmed, counts }` where `confirmed` is the adversarially-verified findings (each `{ severity, file, location, title, detail, suggestedFix }`) - and `counts` is `{ critical, major, minor, nit }`. + and `counts` is `{ critical, major, minor, nit }`. The review **and** its per-finding verify both run on + `tier`. -2. **Stop check** — if `counts.critical + counts.major + counts.minor === 0`, the review is clean → exit - the loop (still fix any nits opportunistically if trivial, then finish). +2. **Tier-clean check** — if `counts.critical + counts.major + counts.minor === 0`, this tier is clean → + **break** to the next tier (or, if this was the last tier, finish). Fix trivial nits opportunistically + before moving on. -3. **No-progress guard** — build the identity set of the current unresolved (critical/major/minor) - findings as `` `${file}::${title.trim().toLowerCase()}` ``. If it **equals** the previous round's set - (the same issues keep coming back), stop and report — you're not making progress. Healthy churn (a fix - surfacing *new* issues) changes the set and does not trip this. +3. **No-progress guard** — build the identity set of the current unresolved (critical/major/minor) findings + as `` `${file}::${title.trim().toLowerCase()}` ``. If it **equals the previous round's set within this + tier**, stop the whole loop and report — the same issues keep coming back. **Reset this set when a new + tier starts** (a smarter model is expected to surface different findings; that is progress, not a stall). -4. **Iteration guard** — if N === 8, stop and report the remaining findings. +4. **Iteration guard** — if `globalRound === 8`, stop and report the remaining findings. 5. **Fix** — apply fixes for every `confirmed` critical/major/minor finding, using its `suggestedFix` as a starting point (verify it's correct against the actual code — don't apply blindly). Also fix `nit` @@ -90,17 +107,17 @@ For each round N: 6. **Commit** — stage and commit the round's fixes: ```bash - git add -A && git commit -m "Address code review feedback (round N)" + git add -A && git commit -m "Address code review feedback (round , )" ``` **No Claude attribution** in the message or trailers. If the round produced no file changes, skip the commit (and treat as no progress for the guard). -7. Increment N and repeat from step 1. +7. Repeat from step 1. -### 4. Completion report +### 5. Completion report When the loop ends, summarize: -- Rounds run and why it stopped (**clean** / no-progress / hit 8 rounds). +- Tiers run (sonnet → opus), total rounds, and why it stopped (**Opus clean** / no-progress / hit 8 rounds). - Issues fixed per severity across all rounds; commits made (one per round). - If stopped by a guard: list the remaining findings (file, severity, title) so the user can decide. @@ -111,5 +128,10 @@ When the loop ends, summarize: - **Reviewer agent** — the workflow prefers the `code-reviewer` agent (shipped by Claude Code's official review plugins) and **falls back** to the default workflow subagent when it isn't available, so the loop runs anywhere. +- **Model tiering (cost-efficient)** — mechanical work (scope, file-listing) runs on **Haiku**; the review + ladder converges on **Sonnet** first, then **Opus** for the final sign-off. Each review round's verify + phase runs on the same tier as its review. Pass `reviewModel`/`mechanicsModel` in `args` to override. + Note: passing `reviewModel` overrides the `code-reviewer` agent's own model for that call, so the tier is + applied uniformly. - **No attribution** — never add `Co-Authored-By: Claude` or "Generated with Claude Code" to commits. - **Nothing is copied into the repo** — the workflow runs from this skill's bundled `references/` path. diff --git a/review-workflow/references/review-loop.mjs b/review-workflow/references/review-loop.mjs index 8314783..3378946 100644 --- a/review-workflow/references/review-loop.mjs +++ b/review-workflow/references/review-loop.mjs @@ -22,11 +22,26 @@ export const meta = { // excludeGlobs, // substring excludes overriding DEFAULT_EXCLUDES // dimensions, // [{ key, prompt }] overriding DEFAULT_DIMENSIONS // reviewerAgentType, // override the review-phase agent type +// mechanicsModel, // model for scope/file-listing/probe (default 'haiku') +// reviewModel, // model for the review + verify phases (this tier); e.g. +// // 'sonnet' then 'opus' across the skill's escalation ladder // } // --------------------------------------------------------------------------- const round = typeof args?.round === 'number' ? args.round : 1 const scopeMode = args?.scopeMode ?? 'branch' +// Model tiering (set by the driving skill's escalation ladder): +// mechanicsModel — scope resolution, file listing, reviewer-agent probe. Cheap +// mechanical work; defaults to haiku. +// reviewModel — the review AND verify phases for this tier. The skill runs +// the loop on sonnet until clean, then re-runs on opus. Defaults to the +// inherited session model when unset (omit the field so agents inherit). +const mechanicsModel = args?.mechanicsModel ?? 'haiku' +const reviewModel = args?.reviewModel ?? null +// Only attach a `model` opt when we actually have one — otherwise the agent +// inherits the session model (the Workflow contract's default). +const withModel = (opts, model) => (model ? { ...opts, model } : opts) + // Generated / vendored paths that shouldn't be hand-reviewed. Override via // args.excludeGlobs (substring match on the path). const DEFAULT_EXCLUDES = [ @@ -55,7 +70,7 @@ const reviewerAgentType = `A workflow wants to spawn a subagent of type "code-reviewer". Is that agent type available in THIS environment? Check for an agent named "code-reviewer": look under ~/.claude/agents, ~/.claude/plugins (enabled plugins' agents/ dirs), and any project .claude/agents directory. Return "code-reviewer" if such an agent is available, otherwise return "default". Return ONLY that one word.`, - { label: 'scope:reviewer-agent', phase: 'Scope', agentType: 'Explore' }, + withModel({ label: 'scope:reviewer-agent', phase: 'Scope', agentType: 'Explore' }, mechanicsModel), ).then((s) => { const v = (s ?? '').trim().toLowerCase() return v.includes('code-reviewer') ? 'code-reviewer' : 'default' @@ -79,7 +94,7 @@ Run these and reason about the output: git log --oneline -1 Prefer the merge-base of HEAD with the repo's default branch (main/master, remote or local). If none resolve, use HEAD~1. Return ONLY the resolved base commit SHA or ref — no prose.`, - { label: 'scope:base', phase: 'Scope', agentType: 'Explore' }, + withModel({ label: 'scope:base', phase: 'Scope', agentType: 'Explore' }, mechanicsModel), ).then((s) => (s ?? '').trim().split(/\s+/).pop()) } @@ -110,6 +125,7 @@ Return the remaining repo-relative paths. If none remain, return an empty list.` label: 'scope:files', phase: 'Scope', agentType: 'Explore', + model: mechanicsModel, // StructuredOutput requires a top-level object schema, so wrap the list. schema: { type: 'object', @@ -125,7 +141,9 @@ if (!files || files.length === 0) { return { round, base, scopeMode, files: [], confirmed: [], counts: { critical: 0, major: 0, minor: 0, nit: 0 } } } -log(`Round ${round}: reviewing ${files.length} file(s) [${scopeMode}] with ${reviewerAgentType} agent`) +log( + `Round ${round}: reviewing ${files.length} file(s) [${scopeMode}] with ${reviewerAgentType} agent on ${reviewModel ?? 'session'} model`, +) const FINDING_SCHEMA = { type: 'object', @@ -213,7 +231,10 @@ const dimensions = Array.isArray(args?.dimensions) && args.dimensions.length > 0 const reviewerOpts = (key) => { const o = { label: `review:${key}`, phase: 'Review', schema: FINDING_SCHEMA } if (reviewerAgentType !== 'default') o.agentType = reviewerAgentType - return o + // The tier's review model. Note: a custom agent (code-reviewer) has its own + // model in its definition; passing model here overrides it for this call so + // the escalation ladder controls the tier uniformly. + return withModel(o, reviewModel) } phase('Review') @@ -231,12 +252,15 @@ const verified = await parallel( allFindings.map((f) => () => agent( `${CONTEXT}\n\nAdversarially VERIFY this single finding. Read the actual code and decide whether it is a REAL issue worth fixing. Default to isReal=false if it is speculative, already handled elsewhere, a false positive, an intentional/documented tradeoff, or contradicts the project conventions. Re-grade severity honestly (a "critical" that's really cosmetic should come back minor/nit).\n\nFINDING:\nseverity=${f.severity}\nfile=${f.file}\nlocation=${f.location ?? ''}\ntitle=${f.title}\ndetail=${f.detail}\nsuggestedFix=${f.suggestedFix}`, - { - label: `verify:${f.severity}:${String(f.file).split('/').pop()}`, - phase: 'Verify', - schema: VERDICT_SCHEMA, - agentType: 'Explore', - }, + withModel( + { + label: `verify:${f.severity}:${String(f.file).split('/').pop()}`, + phase: 'Verify', + schema: VERDICT_SCHEMA, + agentType: 'Explore', + }, + reviewModel, + ), ).then((v) => (v && v.isReal ? { ...f, severity: v.severity, verifyReason: v.reason } : null)), ), ) From f5d4efe4ff3ac51a19017d6154473e80cb67ca4f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Fran=C3=A7ois-Julien=20Alcaraz?= Date: Fri, 3 Jul 2026 16:29:36 +0100 Subject: [PATCH 4/4] Make review-workflow model ladder configurable via invocation args Accept a preset (sonnet-opus, sonnet, opus) or an explicit models= list; when none is given, ask the user with Sonnet -> Opus as the recommended default, plus Sonnet-only and Opus-only. The chosen list becomes the review tiers. --- review-workflow/SKILL.md | 50 ++++++++++++++++++++++++++++------------ 1 file changed, 35 insertions(+), 15 deletions(-) diff --git a/review-workflow/SKILL.md b/review-workflow/SKILL.md index c123f60..4874730 100644 --- a/review-workflow/SKILL.md +++ b/review-workflow/SKILL.md @@ -3,9 +3,10 @@ name: review-workflow description: > Run an iterative code-review loop: a review workflow scores the diff by severity, Claude fixes every critical/major/minor issue (nits opportunistically), commits, then - re-reviews — looping until the review comes back clean. Project-agnostic; any git repo. - Use when the user asks to review-and-fix in a loop, do a feedback pass, or "keep - reviewing until clean". + re-reviews — looping until the review comes back clean. Runs a configurable model + ladder (default Sonnet → Opus; also sonnet, opus, or a custom models= list). + Project-agnostic; any git repo. Use when the user asks to review-and-fix in a loop, + do a feedback pass, or "keep reviewing until clean". invocations: - /review-workflow tags: @@ -31,7 +32,7 @@ subagents run in isolated contexts. So the loop is split: | Apply the fixes | **you (the main agent)** | | Commit round N | **you (the main agent)** | | Re-review (round N+1) | **the bundled workflow** | -| Escalate models cheap → smart | **you (the main agent)** — Sonnet loop, then Opus loop | +| Escalate models cheap → smart | **you (the main agent)** — run each tier of the chosen ladder to clean | You drive the outer loop; the workflow is the per-round review engine you call. The workflow is invoked directly from this skill's bundled path — **nothing is written into the user's repo.** @@ -60,14 +61,31 @@ Use **one** `AskUserQuestion` to pick what to review, then run the rest autonomo If the user already stated the scope in `$ARGUMENTS`, skip the question and use it. -### 3. The escalation ladder (cheap → smart) +### 3. Choose the model suite (the escalation ladder) -Run the review loop on progressively smarter models: converge on **Sonnet**, then re-converge on **Opus**. -The cheap mechanical work (scope, file-listing) always runs on **Haiku** inside the workflow. The idea: -Sonnet clears the obvious issues cheaply; Opus does the final, smartest sign-off. Opus clean = done. +The review runs on a **ladder** of models, cheap → smart: it converges on the first tier, then re-converges +on the next, and so on. Mechanical work (scope, file-listing) always runs on **Haiku** inside the workflow +regardless of the ladder. +Resolve `tiers` (an ordered list of review models) from `$ARGUMENTS`: + +| In `$ARGUMENTS` | Resolved `tiers` | +|-----------------|------------------| +| `sonnet-opus` *(preset)* | `["sonnet", "opus"]` | +| `sonnet` *(preset)* | `["sonnet"]` | +| `opus` *(preset)* | `["opus"]` | +| `models=` (explicit list) | that list, e.g. `models=haiku,sonnet,opus` → `["haiku","sonnet","opus"]` | + +If no model suite is present in `$ARGUMENTS`, ask with **one** `AskUserQuestion` offering exactly these +options (recommended one first): + +- **Sonnet → Opus** *(Recommended)* — Sonnet clears the obvious issues cheaply, then Opus does the final, + smartest sign-off. → `["sonnet", "opus"]` +- **Sonnet only** — single cheaper/faster pass. → `["sonnet"]` +- **Opus only** — max quality from round one, higher cost. → `["opus"]` + +Then run the ladder: ``` -tiers = ["sonnet", "opus"] globalRound = 0 for tier in tiers: run the per-tier loop below with reviewModel = tier # reset the no-progress guard at each tier @@ -117,7 +135,8 @@ For each round in the current `tier`: ### 5. Completion report When the loop ends, summarize: -- Tiers run (sonnet → opus), total rounds, and why it stopped (**Opus clean** / no-progress / hit 8 rounds). +- The ladder run (e.g. sonnet → opus), total rounds, and why it stopped (**final tier clean** / no-progress + / hit 8 rounds). - Issues fixed per severity across all rounds; commits made (one per round). - If stopped by a guard: list the remaining findings (file, severity, title) so the user can decide. @@ -128,10 +147,11 @@ When the loop ends, summarize: - **Reviewer agent** — the workflow prefers the `code-reviewer` agent (shipped by Claude Code's official review plugins) and **falls back** to the default workflow subagent when it isn't available, so the loop runs anywhere. -- **Model tiering (cost-efficient)** — mechanical work (scope, file-listing) runs on **Haiku**; the review - ladder converges on **Sonnet** first, then **Opus** for the final sign-off. Each review round's verify - phase runs on the same tier as its review. Pass `reviewModel`/`mechanicsModel` in `args` to override. - Note: passing `reviewModel` overrides the `code-reviewer` agent's own model for that call, so the tier is - applied uniformly. +- **Model tiering (cost-efficient, configurable)** — mechanical work (scope, file-listing) runs on + **Haiku**; the review ladder is chosen at invocation (default **Sonnet → Opus**; also `sonnet`, `opus`, + or an explicit `models=…` list — see step 3). Each review round's verify phase runs on the same tier as + its review. The skill passes `reviewModel`/`mechanicsModel` into the workflow `args`. Note: passing + `reviewModel` overrides the `code-reviewer` agent's own model for that call, so the tier is applied + uniformly. - **No attribution** — never add `Co-Authored-By: Claude` or "Generated with Claude Code" to commits. - **Nothing is copied into the repo** — the workflow runs from this skill's bundled `references/` path.