feat(spec): raise completeness and quality of /awos:spec output#140
feat(spec): raise completeness and quality of /awos:spec output#140FlySpot wants to merge 2 commits into
Conversation
Applies all spec-command-optimization changes validated in the awos-qa fork (PRs #1–#3 on feat/spec-command-optimization): - Step 3: add boundary/error elicitation rule (item 2) and require at least one failure-path AC for boundary requirements (item 3) - Step 4: extend self-review to catch vague/unmeasurable wording in requirements and acceptance criteria, with concrete weasel-word examples and the user-perceivable-terms resolution rule - Step 5 (new): Specification Definition of Done — pre-write checklist that gates on no vague wording, every requirement having ≥1 AC, and no unresolved [NEEDS CLARIFICATION] markers; unattended runs replace markers with **Assumption:** lines and still save the file - Steps 5→6, 6→7: renumber Final Review and File Generation to accommodate the new DoD step - tests: add three lint checks pinning the DoD checklist, ambiguity self-review, and boundary/error rules so contracts cannot be silently dropped Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Records the problem statement, fix direction, placement, and scope boundary for each of the seven optimization points (Points 3 and 6 explicitly carved out as requiring coordinated changes across other commands). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughUpdates ChangesSpec Prompt Guidance
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
AlexanderMakarov
left a comment
There was a problem hiding this comment.
This tightens /awos:spec in sensible places: it makes boundary/error behavior an explicit elicitation rule rather than a buried example, requires a failure-path acceptance criterion for those requirements, adds a weasel-word pass to the self-review, and closes the loop with a pre-write Definition of Done that sweeps [NEEDS CLARIFICATION] markers. The prompt edits read clean — the Step 4↔Step 5 cross-references line up, the marker-sweep "runs last so it also catches markers from items 1 and 2" ordering is thought through, the old-Step-5/6 → 6/7 renumbering is applied consistently, and the wording stays plain-declarative (one bold emphasis per rule) per the repo's emphasis convention. Nothing in the actual command blocks merge.
What is docs/optimization/spec-command-optimization.md for? It's a new pattern (no docs/optimization/ today) and it isn't shipped by the installer, so this is purely about the repo's own docs. After merge, five of its seven points (1, 2, 4, 5, 7) are implemented — the doc then describes problems that no longer exist, in the present tense. The repo's own rule is "docs describe the present, not the history." The genuinely live content is Points 3 and 6 (the deferred cross-command work). Worth deciding: keep it as a durable record (then it wants section/step references instead of line numbers, and past-tense problem statements), or trim it to the deferred tasks and let the PR description carry the rest? Your call — I just don't want it to rot silently.
| 'commands/spec.md Definition of Done must tell unattended runs to replace each unresolved [NEEDS CLARIFICATION] marker with a visible **Assumption:** and still save the file' | ||
| ); | ||
| assert.ok( | ||
| /at least one acceptance criterion/i.test(body), |
There was a problem hiding this comment.
This assertion is meant to pin DoD item 2 ("Every requirement has at least one acceptance criterion"), but /at least one acceptance criterion/i also matches the Step 3 failure-path rule you added at spec.md:99 ("include at least one acceptance criterion covering the failure path"). So if someone deleted the DoD completeness gate but left the Step 3 rule, this stays green — the item it names isn't actually covered. The other three assertions in this test (Definition of Done, No vague wording remains, **Assumption:**) are unique, so the section as a whole is pinned; it's just this one item that slips through. Anchoring on the unique bolded lead-in would fix it:
assert.ok(
body.includes('Every requirement has at least one acceptance criterion'),
'...'
);| 'commands/spec.md self-review must name concrete weasel-word examples (e.g. "user-friendly") so the model knows what to hunt' | ||
| ); | ||
| assert.ok( | ||
| /concrete in user-perceivable terms/i.test(body), |
There was a problem hiding this comment.
Minor, optional — same flavor as the DoD one but not load-bearing: /concrete in user-perceivable terms/i matches both the Step 4 pass (spec.md:111) and DoD item 1 (spec.md:117), so on its own it wouldn't catch Step 4 being removed. The sibling assertion /vague or unmeasurable wording/i is unique to Step 4, so the test still does its job. Leave it or tighten to a Step-4-only phrase (e.g. the search results appear within 2 seconds example) — your preference.
|
|
||
| ## Point 1 — Unresolved `[NEEDS CLARIFICATION]` markers reaching the saved file | ||
|
|
||
| - **Problem:** The marker is only required to be closed on the parent requirement before acceptance criteria are written — `commands/spec.md:96`: _"If any `[NEEDS CLARIFICATION: …]` markers remain on the parent requirement in §Functional Requirements, ask clarifying questions and resolve the markers before writing acceptance criteria."_ Neither the self-review (Step 4, line 108) nor file generation (Step 6, lines 114–119) blocks saving a file with markers still open in other sections. The marker is introduced at `commands/spec.md:90`: _"mark every unresolved detail with `[NEEDS CLARIFICATION: your specific question]`"_. |
There was a problem hiding this comment.
The commands/spec.md:NN citations here are accurate against main as it stands, but this PR's own edits shift every one of them down (~+2 lines) and renumber the steps — so on merge they point at the wrong lines, and "file generation (Step 6, lines 114–119)" names a step that's now Step 7. Same root shows in Point 4 (line 30): "the self-review catches only leakage … weasel words [are] not present in spec.md today" reads in the present tense, but this PR is exactly what makes that false. If the doc stays, citing by section/step name and writing the problem statements in past tense ("before this change…") keeps it from contradicting the file it ships with.
| ### Step 4: Self-Review (Language and Ambiguity Check) | ||
|
|
||
| - Before presenting to the user, re-read the entire draft end-to-end. For every sentence, ask: "Would this make sense to someone who has never seen the codebase?" Replace any developer-facing language with plain, non-technical wording in the same language the user is using. Remove any references to internal system behavior, code, or architecture that slipped in. | ||
| - Then re-read §Functional Requirements — both the requirement statements and their acceptance criteria — for vague or unmeasurable wording: words like "fast", "user-friendly", or "as appropriate" that a tester could not verify. Make each one concrete in user-perceivable terms (e.g., "the search feels fast" becomes "search results appear within 2 seconds"), or, if the user has not decided the specific value yet, replace it with a `[NEEDS CLARIFICATION: …]` marker so it is resolved in Step 5. Quantify only where the user would notice the difference — do not force a number onto every sentence; narrative sections like §Overview and Rationale may stay qualitative. |
There was a problem hiding this comment.
like §Overview and Rationale may stay qualitative.
Weak reference - I don't understand where name of section ends. The same for re-read §Functional Requirements — both.
Please use [Link Text](#section-name) formatting. Here and below in this file.
|
|
||
| Before the final review and saving the file, check the draft against this list. It is a self-review, not a new approval gate — the file is still written at the end of the process. | ||
|
|
||
| 1. **No vague wording remains in requirements or acceptance criteria.** Each requirement and criterion is concrete enough that a non-developer can tell unambiguously whether it passes or fails. If a vague term slipped through, resolve it as described in Step 4: make it concrete in user-perceivable terms, or convert it to a `[NEEDS CLARIFICATION: …]` marker — the marker is a temporary hand-off, not a final state; item 3 closes every marker before the file is written. |
There was a problem hiding this comment.
No vague wording remains in requirements or acceptance criteria.
I guess it is excessive - big part of step 4 was dedicated to this check. If we make LLM to check itself in the same session (the same as asking person "Are you done?") it rarely works - LLM reads from it's own context that everything is completed already and do nothing (as the person answers "Yes" without checking, from memory). So this usually just a waste of tokens.
We may ask it to Grep in the specific file specific word instead - item 3 below.
|
|
||
| 1. **No vague wording remains in requirements or acceptance criteria.** Each requirement and criterion is concrete enough that a non-developer can tell unambiguously whether it passes or fails. If a vague term slipped through, resolve it as described in Step 4: make it concrete in user-perceivable terms, or convert it to a `[NEEDS CLARIFICATION: …]` marker — the marker is a temporary hand-off, not a final state; item 3 closes every marker before the file is written. | ||
| 2. **Every requirement has at least one acceptance criterion.** Confirm that each functional requirement in §Functional Requirements carries at least one acceptance criterion in the When/Then shape (Given optional). If a requirement has none, write one for it before saving. | ||
| 3. **No unresolved clarification markers.** Scan the whole document for `[NEEDS CLARIFICATION: …]` markers, not only the section you edited last — this check runs last so it also catches markers created by items 1 and 2. In an interactive session, resolve each one with the user and fold the answer into the relevant requirement and its acceptance criteria. In an unattended session where no one can answer, **do not stop**: replace each marker with a visible `**Assumption:** …` line in plain language and save the file anyway, so the open decision is recorded rather than hidden. Example: `The user sees an error message. [NEEDS CLARIFICATION: exact wording?]` becomes `The user sees an error message. **Assumption:** the message reads "Something went wrong — please try again."` |
There was a problem hiding this comment.
In an interactive session, resolve each one with the user
Please be more precise - like using AskUserQuestion tool with reasonable defaults - it would make LLM understand (and, therefore, implement) task better and be proactive for user (don't invent units for metrics).
| @@ -0,0 +1,51 @@ | |||
| # Spec Command Optimization | |||
There was a problem hiding this comment.
TL;DR: better to move this file into separate PR. I don't see how it is used here.
What
Applies a set of targeted improvements to `/awos:spec` that close five quality gaps identified in the optimization plan. All changes were validated end-to-end in the awos-qa fork before being ported here.
Changes to `commands/spec.md`
Step 3 — Functional Requirements & Acceptance Criteria (Points 2)
Step 4 — Self-Review (Point 4)
[NEEDS CLARIFICATION: …]marker.Step 5 — Specification Definition of Done (Points 1, 5, 7) — new step
[NEEDS CLARIFICATION: …]markers. In unattended runs, each marker is replaced with a visible**Assumption:** …line so open decisions are recorded rather than hidden.New file: `docs/optimization/spec-command-optimization.md`
Documents the problem statement, fix direction, and placement for all seven optimization points, including the rationale for carving out Points 3 (non-functional requirements) and 6 (relative-importance signal) as separate cross-command tasks.
Why
Before this change:
[NEEDS CLARIFICATION]markers were resolved.Out of scope (separate tasks)
Tests
Three new lint checks in `tests/lint-prompts.test.js` pin the contracts so they cannot be silently dropped:
Definition of Done,**Assumption:**,at least one acceptance criterion,No vague wording remains)vague or unmeasurable wording,"user-friendly",concrete in user-perceivable terms)boundary and error behavior the user sees,at least one acceptance criterion covering the failure path)Behavioral coverage (no residual markers in produced `functional-spec.md`, every requirement carries a criterion, no unverifiable wording) is in the awos-qa repository.
Summary by CodeRabbit
Documentation
Tests