From caeb8071a606ffd2b2df57eb47861216641be53d Mon Sep 17 00:00:00 2001 From: Baptiste LAFOURCADE Date: Wed, 1 Jul 2026 17:30:27 +0200 Subject: [PATCH 1/3] fix(dev): close review report contract gaps Address four gaps an automated review flagged on the anti-slop redesign: - 03-review-relevancy regains its mandatory `## Output` (R8 action anatomy). - Single-axis runs: an axis not run marks the sections it owns "Not run" (functional owns Phases and Verification), never a placeholder or invented data, so review-code alone stays valid. - Unmet acceptance criteria tagged `fix` now also append a `functional` row to Findings, so the header count matches the verdict instead of reading 0. - Verdict: a `not-applicable` or `fixed` criterion no longer blocks `approve`; only an unchecked `fix` criterion does. Co-Authored-By: Claude Opus 4.8 --- plugins/aidd-dev/skills/05-review/SKILL.md | 1 + .../skills/05-review/actions/02-review-functional.md | 8 +++++--- .../skills/05-review/actions/03-review-relevancy.md | 4 ++++ .../aidd-dev/skills/05-review/assets/review-template.md | 6 ++++-- .../aidd-dev/skills/05-review/references/review-rubric.md | 4 ++-- 5 files changed, 16 insertions(+), 7 deletions(-) diff --git a/plugins/aidd-dev/skills/05-review/SKILL.md b/plugins/aidd-dev/skills/05-review/SKILL.md index f653f9be..69c5e89b 100644 --- a/plugins/aidd-dev/skills/05-review/SKILL.md +++ b/plugins/aidd-dev/skills/05-review/SKILL.md @@ -26,6 +26,7 @@ Run all three by default, composing one report. Run a single axis only when the - Folder: write into the reviewed work's feature folder (`aidd_docs/tasks//_/`, beside `plan.md`), or one resolved from the change when it has none. - Report: fill `review.md` from `assets/review-template.md`. One shared `Findings` table for every axis: functional writes the `Phases` boxes and the `Verification` table, code and relevancy append rows to `Findings` under their `Kind`. Tables and boxes, no prose, no per-axis sections. The Phase column ties a finding to the plan when one is in scope, `-` otherwise. - Sections: the report has exactly the sections in `assets/review-validator.yml`. Before returning, verify against it and remove any section not listed. +- Not run: every required section always exists. An axis that did not run marks the sections it owns "Not run" (functional owns `Phases` and `Verification`); it never leaves a placeholder or invents data. - Re-run: overwrite `review.md` with the current review. It is a snapshot of the current diff, not a history; a later review of the same work replaces the earlier one. - Verdict: one overall verdict, the strictest across the axes run, per `references/review-rubric.md`. diff --git a/plugins/aidd-dev/skills/05-review/actions/02-review-functional.md b/plugins/aidd-dev/skills/05-review/actions/02-review-functional.md index 62e133ae..67628ffe 100644 --- a/plugins/aidd-dev/skills/05-review/actions/02-review-functional.md +++ b/plugins/aidd-dev/skills/05-review/actions/02-review-functional.md @@ -8,17 +8,19 @@ The plan path holding the phases and their acceptance criteria, from the argumen ## Output -The `Phases` and `Verification` sections of the feature folder's `review.md`: one block per plan phase with a box per acceptance criterion, and the verification summary. +The `Phases` and `Verification` sections of the feature folder's `review.md` (one block per plan phase with a box per acceptance criterion, and the verification summary), plus a `functional` row in `Findings` for each unmet criterion tagged `fix`. ## Process -1. **Read.** Take the plan from the arguments; if absent, ask for the acceptance criteria, and mark the `Phases` section "Not run" when none are available. Static review only, no app execution or browser. +1. **Read.** Take the plan from the arguments; if absent, ask for the acceptance criteria, and mark the `Phases` and `Verification` sections "Not run" when none are available. Static review only, no app execution or browser. 2. **Trace.** Fetch the diff, then walk each phase in plan order. For every acceptance criterion, check the box `[x]` only when the diff shows evidence, citing the `file:line`; leave it `[ ]` when unmet or partial, naming the gap. 3. **Summarize.** Fill `Verification`: the percent and count of checked criteria, the files checked, every unchecked criterion tagged `fix`, `not-applicable`, or `fixed`, and any unplanned change in the diff that traces to no criterion. Write "none" for an empty list. -4. **Record.** Write the phase blocks and the verification summary into `review.md`. No prose paragraphs, boxes and the summary only. +4. **Count.** For each unmet criterion tagged `fix`, append a `functional` row to the `Findings` table (its Phase, the gap, a severity) so the header count matches the verdict. A `not-applicable` or `fixed` criterion adds no row. +5. **Record.** Write the phase blocks, the functional findings, and the verification summary into `review.md`. No prose paragraphs, boxes and the summary only. ## Test - The `Phases` section holds one block per plan phase, every acceptance criterion a checked or unchecked box citing evidence or a gap. +- Every unmet criterion tagged `fix` has a matching `functional` row in `Findings`; `not-applicable` criteria have none. - The `Verification` section reports the verified percent, the files checked, a tag on each unchecked criterion, and the unplanned changes (or "none"). - No code is patched. diff --git a/plugins/aidd-dev/skills/05-review/actions/03-review-relevancy.md b/plugins/aidd-dev/skills/05-review/actions/03-review-relevancy.md index 97789901..88108dba 100644 --- a/plugins/aidd-dev/skills/05-review/actions/03-review-relevancy.md +++ b/plugins/aidd-dev/skills/05-review/actions/03-review-relevancy.md @@ -6,6 +6,10 @@ Judge whether the diff belongs (serves the need, conforms to the rules, no rot) The diff to review (a git ref range or path; defaults to the diff against the repository default branch), the need it serves (the plan objective or the ticket), the project's declared rules discovered at runtime, and the plan when in scope. +## Output + +Misfit rows in the `Findings` table of the feature folder's `review.md`, each under a lens (`fit`, `conform`, `rot`), tied to evidence. + ## Process 1. **Gather.** Resolve the diff, otherwise the diff against the repository default branch. Capture the need from the plan objective or the ticket. Discover the declared rules at runtime, never hardcoded. Fall back cleanly when a source is absent. diff --git a/plugins/aidd-dev/skills/05-review/assets/review-template.md b/plugins/aidd-dev/skills/05-review/assets/review-template.md index 4a2a7fb6..6c5b674b 100644 --- a/plugins/aidd-dev/skills/05-review/assets/review-template.md +++ b/plugins/aidd-dev/skills/05-review/assets/review-template.md @@ -9,7 +9,7 @@ ## Phases - + ### Phase {{n}} — {{phase-name}} @@ -18,13 +18,15 @@ ## Findings - + | Sev | Kind | Phase | Location | Issue | Fix | | --- | ---- | ----- | -------- | ----- | --- | ## Verification + + | Metric | Value | | ------------- | ------------------------------------------------- | | Verified | {{pct}}% ({{n_checked}}/{{n_total}}) | diff --git a/plugins/aidd-dev/skills/05-review/references/review-rubric.md b/plugins/aidd-dev/skills/05-review/references/review-rubric.md index dae68188..126eb829 100644 --- a/plugins/aidd-dev/skills/05-review/references/review-rubric.md +++ b/plugins/aidd-dev/skills/05-review/references/review-rubric.md @@ -12,11 +12,11 @@ Shared definitions for the three review axes. The actions and the report templat One overall verdict, the strictest across the axes run: -- `approve`: no critical finding, and every acceptance criterion checked, ship it. +- `approve`: no critical finding, and no criterion left unchecked and tagged `fix`, ship it. A `not-applicable` or `fixed` criterion does not block. - `changes-requested`: warnings, a fixable critical, or any unchecked criterion tagged `fix`. - `blocked`: a critical that must not merge, or an unchecked critical criterion. -An unchecked criterion in the `Phases` section is a functional finding: one tagged `fix` cannot yield `approve`. +An unchecked criterion tagged `fix` is a functional finding: it appears as a `functional` row in `Findings` (so it counts) and cannot yield `approve`. A `not-applicable` criterion is neither a finding nor a blocker. ## Code categories From c09aaaa3013e00e625bcc761cca14fc0563b4873 Mon Sep 17 00:00:00 2001 From: Baptiste LAFOURCADE Date: Wed, 1 Jul 2026 18:20:18 +0200 Subject: [PATCH 2/3] fix(dev): make functional findings and verdict rules consistent Follow-up to the review-contract fixes, closing four consistency gaps: - Functional findings fill every Findings column (Location, Issue, Fix), not just phase and gap, so no row is blank or invented. - A functional `fix` row uses a blocking severity (critical or warning), never minor, so the header count always explains the verdict. - The router rule now states functional also appends its `fix` rows to Findings, matching the action. - The `blocked` rule carries the same tag exception as `approve`: a `not-applicable` or `fixed` criterion never blocks, whatever its severity. Co-Authored-By: Claude Opus 4.8 --- plugins/aidd-dev/skills/05-review/SKILL.md | 2 +- .../aidd-dev/skills/05-review/actions/02-review-functional.md | 2 +- plugins/aidd-dev/skills/05-review/references/review-rubric.md | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/plugins/aidd-dev/skills/05-review/SKILL.md b/plugins/aidd-dev/skills/05-review/SKILL.md index 69c5e89b..fd638425 100644 --- a/plugins/aidd-dev/skills/05-review/SKILL.md +++ b/plugins/aidd-dev/skills/05-review/SKILL.md @@ -24,7 +24,7 @@ Run all three by default, composing one report. Run a single axis only when the - Read-only: surface each finding with its fix described, never patch. - Output: always write `review.md` to disk; the file is the deliverable, never an inline-only verdict. - Folder: write into the reviewed work's feature folder (`aidd_docs/tasks//_/`, beside `plan.md`), or one resolved from the change when it has none. -- Report: fill `review.md` from `assets/review-template.md`. One shared `Findings` table for every axis: functional writes the `Phases` boxes and the `Verification` table, code and relevancy append rows to `Findings` under their `Kind`. Tables and boxes, no prose, no per-axis sections. The Phase column ties a finding to the plan when one is in scope, `-` otherwise. +- Report: fill `review.md` from `assets/review-template.md`. One shared `Findings` table for every axis: functional writes the `Phases` boxes and the `Verification` table, and appends a `functional` row to `Findings` for each unmet criterion tagged `fix`; code and relevancy append rows to `Findings` under their `Kind`. Tables and boxes, no prose, no per-axis sections. The Phase column ties a finding to the plan when one is in scope, `-` otherwise. - Sections: the report has exactly the sections in `assets/review-validator.yml`. Before returning, verify against it and remove any section not listed. - Not run: every required section always exists. An axis that did not run marks the sections it owns "Not run" (functional owns `Phases` and `Verification`); it never leaves a placeholder or invents data. - Re-run: overwrite `review.md` with the current review. It is a snapshot of the current diff, not a history; a later review of the same work replaces the earlier one. diff --git a/plugins/aidd-dev/skills/05-review/actions/02-review-functional.md b/plugins/aidd-dev/skills/05-review/actions/02-review-functional.md index 67628ffe..b48983a6 100644 --- a/plugins/aidd-dev/skills/05-review/actions/02-review-functional.md +++ b/plugins/aidd-dev/skills/05-review/actions/02-review-functional.md @@ -15,7 +15,7 @@ The `Phases` and `Verification` sections of the feature folder's `review.md` (on 1. **Read.** Take the plan from the arguments; if absent, ask for the acceptance criteria, and mark the `Phases` and `Verification` sections "Not run" when none are available. Static review only, no app execution or browser. 2. **Trace.** Fetch the diff, then walk each phase in plan order. For every acceptance criterion, check the box `[x]` only when the diff shows evidence, citing the `file:line`; leave it `[ ]` when unmet or partial, naming the gap. 3. **Summarize.** Fill `Verification`: the percent and count of checked criteria, the files checked, every unchecked criterion tagged `fix`, `not-applicable`, or `fixed`, and any unplanned change in the diff that traces to no criterion. Write "none" for an empty list. -4. **Count.** For each unmet criterion tagged `fix`, append a `functional` row to the `Findings` table (its Phase, the gap, a severity) so the header count matches the verdict. A `not-applicable` or `fixed` criterion adds no row. +4. **Count.** For each unmet criterion tagged `fix`, append a `functional` row to the `Findings` table with every column filled: Phase, Location (the phase's target file, or `-`), Issue (the unmet criterion and its gap), Fix (what to implement), and a blocking severity (🔴 critical or 🟡 warning, never 🟢 minor), so the header count matches the verdict. A `not-applicable` or `fixed` criterion adds no row. 5. **Record.** Write the phase blocks, the functional findings, and the verification summary into `review.md`. No prose paragraphs, boxes and the summary only. ## Test diff --git a/plugins/aidd-dev/skills/05-review/references/review-rubric.md b/plugins/aidd-dev/skills/05-review/references/review-rubric.md index 126eb829..400f6259 100644 --- a/plugins/aidd-dev/skills/05-review/references/review-rubric.md +++ b/plugins/aidd-dev/skills/05-review/references/review-rubric.md @@ -14,7 +14,7 @@ One overall verdict, the strictest across the axes run: - `approve`: no critical finding, and no criterion left unchecked and tagged `fix`, ship it. A `not-applicable` or `fixed` criterion does not block. - `changes-requested`: warnings, a fixable critical, or any unchecked criterion tagged `fix`. -- `blocked`: a critical that must not merge, or an unchecked critical criterion. +- `blocked`: a critical that must not merge, or an unchecked critical criterion tagged `fix`. A `not-applicable` or `fixed` criterion never blocks, whatever its severity. An unchecked criterion tagged `fix` is a functional finding: it appears as a `functional` row in `Findings` (so it counts) and cannot yield `approve`. A `not-applicable` criterion is neither a finding nor a blocker. From b157e082316923e976c513e6bfd7cc8145ac92df Mon Sep 17 00:00:00 2001 From: Baptiste LAFOURCADE Date: Wed, 1 Jul 2026 18:43:46 +0200 Subject: [PATCH 3/3] fix(dev): critical functional severity and skipped-axis visibility MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Two more consistency gaps from review feedback: - A `fix` functional row uses 🔴 critical when the criterion itself is critical, otherwise 🟡 warning, so a blocked verdict never shows 0 critical. - The header carries `Axes run`, so a skipped code or relevancy axis is visible even though Findings is shared; a functional-only report no longer reads like a clean full review. Co-Authored-By: Claude Opus 4.8 --- plugins/aidd-dev/skills/05-review/SKILL.md | 2 +- .../aidd-dev/skills/05-review/actions/02-review-functional.md | 2 +- plugins/aidd-dev/skills/05-review/assets/review-template.md | 1 + 3 files changed, 3 insertions(+), 2 deletions(-) diff --git a/plugins/aidd-dev/skills/05-review/SKILL.md b/plugins/aidd-dev/skills/05-review/SKILL.md index fd638425..3c8f0fcb 100644 --- a/plugins/aidd-dev/skills/05-review/SKILL.md +++ b/plugins/aidd-dev/skills/05-review/SKILL.md @@ -26,7 +26,7 @@ Run all three by default, composing one report. Run a single axis only when the - Folder: write into the reviewed work's feature folder (`aidd_docs/tasks//_/`, beside `plan.md`), or one resolved from the change when it has none. - Report: fill `review.md` from `assets/review-template.md`. One shared `Findings` table for every axis: functional writes the `Phases` boxes and the `Verification` table, and appends a `functional` row to `Findings` for each unmet criterion tagged `fix`; code and relevancy append rows to `Findings` under their `Kind`. Tables and boxes, no prose, no per-axis sections. The Phase column ties a finding to the plan when one is in scope, `-` otherwise. - Sections: the report has exactly the sections in `assets/review-validator.yml`. Before returning, verify against it and remove any section not listed. -- Not run: every required section always exists. An axis that did not run marks the sections it owns "Not run" (functional owns `Phases` and `Verification`); it never leaves a placeholder or invents data. +- Not run: every required section always exists. The header `Axes run` lists which axes ran, so a skipped code or relevancy axis is visible even though `Findings` is shared. An axis that did not run marks the sections it owns "Not run" (functional owns `Phases` and `Verification`); it never leaves a placeholder or invents data. - Re-run: overwrite `review.md` with the current review. It is a snapshot of the current diff, not a history; a later review of the same work replaces the earlier one. - Verdict: one overall verdict, the strictest across the axes run, per `references/review-rubric.md`. diff --git a/plugins/aidd-dev/skills/05-review/actions/02-review-functional.md b/plugins/aidd-dev/skills/05-review/actions/02-review-functional.md index b48983a6..3c7ab48e 100644 --- a/plugins/aidd-dev/skills/05-review/actions/02-review-functional.md +++ b/plugins/aidd-dev/skills/05-review/actions/02-review-functional.md @@ -15,7 +15,7 @@ The `Phases` and `Verification` sections of the feature folder's `review.md` (on 1. **Read.** Take the plan from the arguments; if absent, ask for the acceptance criteria, and mark the `Phases` and `Verification` sections "Not run" when none are available. Static review only, no app execution or browser. 2. **Trace.** Fetch the diff, then walk each phase in plan order. For every acceptance criterion, check the box `[x]` only when the diff shows evidence, citing the `file:line`; leave it `[ ]` when unmet or partial, naming the gap. 3. **Summarize.** Fill `Verification`: the percent and count of checked criteria, the files checked, every unchecked criterion tagged `fix`, `not-applicable`, or `fixed`, and any unplanned change in the diff that traces to no criterion. Write "none" for an empty list. -4. **Count.** For each unmet criterion tagged `fix`, append a `functional` row to the `Findings` table with every column filled: Phase, Location (the phase's target file, or `-`), Issue (the unmet criterion and its gap), Fix (what to implement), and a blocking severity (🔴 critical or 🟡 warning, never 🟢 minor), so the header count matches the verdict. A `not-applicable` or `fixed` criterion adds no row. +4. **Count.** For each unmet criterion tagged `fix`, append a `functional` row to the `Findings` table with every column filled: Phase, Location (the phase's target file, or `-`), Issue (the unmet criterion and its gap), Fix (what to implement), and a severity that matches the criterion (🔴 critical when the criterion itself is critical, otherwise 🟡 warning, never 🟢 minor), so the header count matches the verdict. A `not-applicable` or `fixed` criterion adds no row. 5. **Record.** Write the phase blocks, the functional findings, and the verification summary into `review.md`. No prose paragraphs, boxes and the summary only. ## Test diff --git a/plugins/aidd-dev/skills/05-review/assets/review-template.md b/plugins/aidd-dev/skills/05-review/assets/review-template.md index 6c5b674b..133e31f1 100644 --- a/plugins/aidd-dev/skills/05-review/assets/review-template.md +++ b/plugins/aidd-dev/skills/05-review/assets/review-template.md @@ -4,6 +4,7 @@ - **Verdict**: {{approve | changes-requested | blocked}} - **Diff**: `{{base}}...{{head}}` +- **Axes run**: {{the axes that ran, from code, functional, relevancy}} - **Date**: {{yyyy_mm_dd}} - **Findings**: {{n_critical}} critical, {{n_warning}} warning, {{n_minor}} minor