refactor(framework): SDLC anti-slop templates + strict review contract#376
Conversation
Templates now state a qualify-if bar per section and are a closed set: an instance fills or omits the defined sections and never invents, adds, renames, or reorders one (R9). - spec: done-when is observable conditions, not commands or gates. - plan: decisions are architecture-magnitude only, resources are external sources only, phase wireframe is UI-only, the phase->plan redirect is dropped, and the architecture-projection emoji legend is visible. - review: phases as checkboxes, one shared Findings table (Kind column), and a Verification table; drop per-axis sections, prose, and Follow-up. - review verdict: an unchecked criterion tagged fix cannot yield approve; the Unplanned (scope-creep) line is restored. - decouple the checker and 00-sdlc review from 05-review's report file and section names; they name only the skill, which owns its report. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
review.md sections are now fixed by assets/review-validator.yml (Phases, Findings, Verification). 05-review verifies its output against it before returning and removes any section the model added, replacing the passive template comment with a deterministic contract. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e51df8123f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| | Criterion | Met | Evidence / gap | | ||
| | --------- | --- | -------------- | | ||
| ### Phase {{n}} — {{phase-name}} |
There was a problem hiding this comment.
Allow phase headings in the validator
For reviews with a plan, the template now creates a ### Phase ... heading for each phase, but the new validator allows only Phases, Findings, and Verification, and SKILL.md tells the reviewer to remove any section not listed. If that final validation is followed literally, the required per-phase blocks are treated as extra sections and can be stripped from the report, losing the functional trace. Either whitelist phase subsections or make phase labels non-heading content under ## Phases.
Useful? React with 👍 / 👎.
| - Phases # required: one block per plan phase, a box per criterion | ||
| - Findings # required: one table, all axes, by Kind | ||
| - Verification # required: the summary table |
There was a problem hiding this comment.
Define required sections for single-axis reviews
The skill still supports running a single named axis, but this validator makes Phases and Verification required for every review.md. In a review-code or review-relevancy invocation the functional action never runs, so those sections have no owner and the report can be returned with stale placeholders or misleading functional data while still satisfying the closed section set. Add an explicit Not run contract for skipped axes or make these sections conditional on the functional axis running.
Useful? React with 👍 / 👎.
| @@ -1,3 +1,5 @@ | |||
| <!-- Fill or omit these sections; never add, rename, or reorder one. --> | |||
There was a problem hiding this comment.
Keep required spec sections from being omitted
This template-level instruction applies to all sections, including Hard constraints, Non-goals, and Done-when, which the validator still marks as required. For requests that do not provide one of those fields, the builder is supposed to keep the section and write a TBD question; the broad “fill or omit” guidance can instead produce a spec that omits a required section and immediately fails its own validator. Limit the omit rule here to optional sections or call out the required sections explicitly.
Useful? React with 👍 / 👎.
| | 01 | `spec` | Consolidate sources into the contract | a spec capability | | ||
| | 02 | `plan` | Produce the plan file | self, via `aidd-dev:01-plan` | | ||
| | 03 | `implement` | Build the plan's code | `executor`, via `aidd-dev:02-implement` | | ||
| | 03 | `implement` | Build the plan's code, gating on the assertions | `executor`, via `aidd-dev:02-implement` | |
There was a problem hiding this comment.
Add the explicit assert step before review
When plans now omit build/test/typecheck gates, this still delegates step 03 to aidd-dev:02-implement and the run order remains 01 → 02 → 03 → 04 → 05, so aidd-dev:03-assert and its coding-assertions.md sweep are never invoked as the standalone pre-review gate. The changed role text does not alter the delegate; add an explicit assert action/delegate between implement and review, including the failed-assert loop back to implement.
Useful? React with 👍 / 👎.
| | ------------- | ------------------------------------------------- | | ||
| | Verified | {{pct}}% ({{n_checked}}/{{n_total}}) | | ||
| | Files checked | {{files, comma-separated}} | | ||
| | Unchecked | {{criterion — fix / not-applicable / fixed}}, or none | |
There was a problem hiding this comment.
Preserve severity for unchecked criteria
For an unmet acceptance criterion that must not merge, the rubric says the verdict should be blocked for an unchecked critical criterion, but the new verification row only allows fix, not-applicable, or fixed for unchecked items. Because no severity can be recorded in Phases or Verification, the reviewer has no reportable way to distinguish a blocking functional gap from an ordinary fixable one, so such reviews can be downgraded to changes-requested.
Useful? React with 👍 / 👎.
| ## Findings | ||
|
|
||
| Does the change belong (or "Not run"). Every finding ties to evidence, never an opinion. | ||
| <!-- One table for every axis. Kind is `code`, `fit`, `conform`, or `rot`. Phase ties the row to the plan, or `-` with no plan. Each axis appends its own rows. Write "None." when a run found nothing. --> |
There was a problem hiding this comment.
Avoid per-axis None markers in shared findings
With the default all-axis run, one clean axis can write None. into the shared Findings table before another axis appends real rows, producing a report that simultaneously says there are no findings and lists findings. Since Findings is now global rather than per-axis, only emit a single empty-state after all axes complete, or encode clean axes as table rows that cannot be mistaken for the whole report state.
Useful? React with 👍 / 👎.
| The `Relevancy` section of the feature folder's `review.md`, filled with misfit findings under the lenses `fit`, `conform`, and `rot`, each tied to evidence. | ||
| The diff to review (a git ref range or path; defaults to the diff against the repository default branch), the need it serves (the plan objective or the ticket), the project's declared rules discovered at runtime, and the plan when in scope. | ||
|
|
||
| ## Process |
There was a problem hiding this comment.
Restore the relevancy output contract
This action is now the only action without the mandatory ## Output section required by the repository's skill-authoring action anatomy. In tools or generators that rely on that structure, the relevancy axis no longer has an explicit deliverable shape before ## Process, making it easier for the report composition to omit or misplace these findings. Add the output section back with the shared Findings row contract.
Useful? React with 👍 / 👎.
| | Verified | {{pct}}% ({{n_checked}}/{{n_total}}) | | ||
| | Files checked | {{files, comma-separated}} | | ||
| | Unchecked | {{criterion — fix / not-applicable / fixed}}, or none | | ||
| | Unplanned | {{change tracing to no criterion}}, or none | |
There was a problem hiding this comment.
Make unplanned changes affect the verdict
When a diff adds behavior outside the plan, this row records the scope creep, but the verdict rubric only prevents approve for critical findings or unchecked criteria and gives Unplanned no severity or fix status. A review can therefore approve a change where every listed criterion is checked but extra unplanned behavior is present. Give unplanned items a disposition/severity or record them as findings so they participate in the verdict.
Useful? React with 👍 / 👎.
| --- | ||
|
|
||
| # Instruction: {title} | ||
| <!-- Fill or omit these sections; never add, rename, or reorder one. --> |
There was a problem hiding this comment.
Keep phase criteria from being omitted
This broad fill-or-omit instruction covers Test acceptance criteria, even though downstream implementation and functional review both consume every phase's acceptance criteria as their gate. If a planner omits that section because no criterion appears to meet the new qualify bar, 02-implement has nothing to assert for the phase and 05-review has nothing to trace. Limit omission to optional sections such as Wireframe, or require criteria/TBD entries for each phase.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e51df8123f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| | Criterion | Met | Evidence / gap | | ||
| | --------- | --- | -------------- | | ||
| ### Phase {{n}} — {{phase-name}} |
There was a problem hiding this comment.
Allow phase headings in the validator
For reviews with a plan, the template now creates a ### Phase ... heading for each phase, but the new validator allows only Phases, Findings, and Verification, and SKILL.md tells the reviewer to remove any section not listed. If that final validation is followed literally, the required per-phase blocks are treated as extra sections and can be stripped from the report, losing the functional trace. Either whitelist phase subsections or make phase labels non-heading content under ## Phases.
Useful? React with 👍 / 👎.
| - Phases # required: one block per plan phase, a box per criterion | ||
| - Findings # required: one table, all axes, by Kind | ||
| - Verification # required: the summary table |
There was a problem hiding this comment.
Define required sections for single-axis reviews
The skill still supports running a single named axis, but this validator makes Phases and Verification required for every review.md. In a review-code or review-relevancy invocation the functional action never runs, so those sections have no owner and the report can be returned with stale placeholders or misleading functional data while still satisfying the closed section set. Add an explicit Not run contract for skipped axes or make these sections conditional on the functional axis running.
Useful? React with 👍 / 👎.
| @@ -1,3 +1,5 @@ | |||
| <!-- Fill or omit these sections; never add, rename, or reorder one. --> | |||
There was a problem hiding this comment.
Keep required spec sections from being omitted
This template-level instruction applies to all sections, including Hard constraints, Non-goals, and Done-when, which the validator still marks as required. For requests that do not provide one of those fields, the builder is supposed to keep the section and write a TBD question; the broad “fill or omit” guidance can instead produce a spec that omits a required section and immediately fails its own validator. Limit the omit rule here to optional sections or call out the required sections explicitly.
Useful? React with 👍 / 👎.
| | 01 | `spec` | Consolidate sources into the contract | a spec capability | | ||
| | 02 | `plan` | Produce the plan file | self, via `aidd-dev:01-plan` | | ||
| | 03 | `implement` | Build the plan's code | `executor`, via `aidd-dev:02-implement` | | ||
| | 03 | `implement` | Build the plan's code, gating on the assertions | `executor`, via `aidd-dev:02-implement` | |
There was a problem hiding this comment.
Add the explicit assert step before review
When plans now omit build/test/typecheck gates, this still delegates step 03 to aidd-dev:02-implement and the run order remains 01 → 02 → 03 → 04 → 05, so aidd-dev:03-assert and its coding-assertions.md sweep are never invoked as the standalone pre-review gate. The changed role text does not alter the delegate; add an explicit assert action/delegate between implement and review, including the failed-assert loop back to implement.
Useful? React with 👍 / 👎.
| | ------------- | ------------------------------------------------- | | ||
| | Verified | {{pct}}% ({{n_checked}}/{{n_total}}) | | ||
| | Files checked | {{files, comma-separated}} | | ||
| | Unchecked | {{criterion — fix / not-applicable / fixed}}, or none | |
There was a problem hiding this comment.
Preserve severity for unchecked criteria
For an unmet acceptance criterion that must not merge, the rubric says the verdict should be blocked for an unchecked critical criterion, but the new verification row only allows fix, not-applicable, or fixed for unchecked items. Because no severity can be recorded in Phases or Verification, the reviewer has no reportable way to distinguish a blocking functional gap from an ordinary fixable one, so such reviews can be downgraded to changes-requested.
Useful? React with 👍 / 👎.
| ## Findings | ||
|
|
||
| Does the change belong (or "Not run"). Every finding ties to evidence, never an opinion. | ||
| <!-- One table for every axis. Kind is `code`, `fit`, `conform`, or `rot`. Phase ties the row to the plan, or `-` with no plan. Each axis appends its own rows. Write "None." when a run found nothing. --> |
There was a problem hiding this comment.
Avoid per-axis None markers in shared findings
With the default all-axis run, one clean axis can write None. into the shared Findings table before another axis appends real rows, producing a report that simultaneously says there are no findings and lists findings. Since Findings is now global rather than per-axis, only emit a single empty-state after all axes complete, or encode clean axes as table rows that cannot be mistaken for the whole report state.
Useful? React with 👍 / 👎.
| The `Relevancy` section of the feature folder's `review.md`, filled with misfit findings under the lenses `fit`, `conform`, and `rot`, each tied to evidence. | ||
| The diff to review (a git ref range or path; defaults to the diff against the repository default branch), the need it serves (the plan objective or the ticket), the project's declared rules discovered at runtime, and the plan when in scope. | ||
|
|
||
| ## Process |
There was a problem hiding this comment.
Restore the relevancy output contract
This action is now the only action without the mandatory ## Output section required by the repository's skill-authoring action anatomy. In tools or generators that rely on that structure, the relevancy axis no longer has an explicit deliverable shape before ## Process, making it easier for the report composition to omit or misplace these findings. Add the output section back with the shared Findings row contract.
Useful? React with 👍 / 👎.
| | Verified | {{pct}}% ({{n_checked}}/{{n_total}}) | | ||
| | Files checked | {{files, comma-separated}} | | ||
| | Unchecked | {{criterion — fix / not-applicable / fixed}}, or none | | ||
| | Unplanned | {{change tracing to no criterion}}, or none | |
There was a problem hiding this comment.
Make unplanned changes affect the verdict
When a diff adds behavior outside the plan, this row records the scope creep, but the verdict rubric only prevents approve for critical findings or unchecked criteria and gives Unplanned no severity or fix status. A review can therefore approve a change where every listed criterion is checked but extra unplanned behavior is present. Give unplanned items a disposition/severity or record them as findings so they participate in the verdict.
Useful? React with 👍 / 👎.
| --- | ||
|
|
||
| # Instruction: {title} | ||
| <!-- Fill or omit these sections; never add, rename, or reorder one. --> |
There was a problem hiding this comment.
Keep phase criteria from being omitted
This broad fill-or-omit instruction covers Test acceptance criteria, even though downstream implementation and functional review both consume every phase's acceptance criteria as their gate. If a planner omits that section because no criterion appears to meet the new qualify bar, 02-implement has nothing to assert for the phase and 05-review has nothing to trace. Limit omission to optional sections such as Wireframe, or require criteria/TBD entries for each phase.
Useful? React with 👍 / 👎.
What
Reshape the SDLC artifacts (spec, plan, phase, review) so they stop emitting slop and hold a strict, skimmable output contract.
skill-authoring.mdR9.Findingstable (Kind column) + aVerificationtable. No per-axis sections, no prose, no Follow-up.fixcannot yieldapprove; the Unplanned (scope-creep) line is restored.review-validator.ymlfixes the exact section set; 05-review verifies its output and removes any section the model added.Test evidence
Dogfooded on the aidd CLI through headless SDLC runs:
## Notessection was rewritten without it, the checker citing the validator as the reason.00-sdlcauto completes end to end: 6 commits per phase, plan status reachesreviewed, tree clean, no push.🤖 Generated with Claude Code