Draft: feat: add /awos:regression command by FlySpot · Pull Request #114 · provectus/awos

FlySpot · 2026-05-07T18:25:05Z

Summary

- `/awos:regression`: new command for managing the long-term regression suite. After a feature's Testing & Regression slice is complete, it extracts test candidates from annotated test files (`@spec`, `@regression`), deduplicates them against the existing suite, asks the user to confirm the selection, updates `context/qa/regression-suite.md`, optionally runs the suite, and generates a dated report at `context/qa/regression-reports/regression-YYYY-MM-DD-[spec].md`.
- `templates/regression-suite-template.md`: starter template for `context/qa/regression-suite.md`, used on first run if the file doesn't exist.
- `commands/tasks.md`: uncomments the `/awos:regression` sub-task inside the Feature Testing & Regression slice (it was commented out in PR #109, "feat: Feature Testing & Regression slice, QA audit command, and verification hardening", pending this merge).

Merge order

Merge after PR #109. This PR's commands/tasks.md includes all changes from #109 plus the enabled regression sub-task.

File map

commands/regression.md          ← /awos:regression command
claude/commands/regression.md   ← thin Claude Code wrapper
templates/
  regression-suite-template.md  ← regression suite scaffold
commands/tasks.md               ← /awos:regression sub-task enabled

Test plan

- Complete all implementation slices plus the Feature Testing & Regression slice on a spec
- Run `/awos:regression [spec-name]` and confirm candidates are extracted from annotated test files
- Confirm deduplication works against an existing `regression-suite.md`
- Confirm `regression-suite.md` is updated after user approval
- Confirm a dated report is generated in `context/qa/regression-reports/`
- Run without an argument and confirm auto-detection finds the completed spec
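
For checking the dated-report step, the expected file location can be derived from the naming pattern given in the summary. The helper below is only a sketch; `report_path` is a hypothetical name, and the pattern is taken from the PR description rather than from the command source.

```python
from datetime import date
from pathlib import PurePosixPath

# Directory and file-name pattern as stated in the PR summary.
REPORT_DIR = PurePosixPath("context/qa/regression-reports")


def report_path(spec_name: str, run_date: date) -> PurePosixPath:
    """Build regression-YYYY-MM-DD-[spec].md under the reports directory."""
    return REPORT_DIR / f"regression-{run_date.isoformat()}-{spec_name}.md"
```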

🤖 Generated with Claude Code

…on-suite template Introduces the Regression Suite Manager command that promotes feature tests to the long-term regression suite, deduplicates entries, optionally runs the suite, and generates a dated report. Note: commands/tasks.md reference to /awos:regression is commented out in feat/qa-pyramid-agent — uncomment after that branch merges. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

… Regression slice Now that the regression command is in this branch, the sub-task calling /awos:regression is active. Merges after feat/qa-pyramid-agent. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai · 2026-05-07T18:25:19Z

📝 Walkthrough


Adds a regression-suite manager: new /awos:regression command, a regression-suite template, candidate extraction/deduplication with user confirmation, optional test execution and dated reporting, and task workflow updates enforcing artifact cleanup and the regression slice template.

Changes

Regression Suite Manager Feature

| Layer / File(s) | Summary |
| --------------- | ------- |
| Command Declaration `claude/commands/regression.md` | `/awos:regression` command metadata with directives to use `AskUserQuestion` for user interactions and reference canonical instructions. |
| Regression Suite Template `templates/regression-suite-template.md` | Template defines header metadata (`Last updated`, total test count), spec subsection placeholders, fixed table schemas for Unit/Integration/E2E/Contract, and allowed `Status`/`Polarity` values. |
| Regression Suite Manager Procedure `commands/regression.md` | Procedure: inputs/outputs, spec selection (explicit or auto-detect), candidate extraction (prefer `@spec`+`@regression`, fallback to `tasks.md`), duplicate detection/classification (DUPLICATE/EXTEND/NEW), user confirmation, suite update rules (add/extend/skip, refresh totals/Last updated), optional test-run with runner detection, result collection, and dated report generation; includes operational constraints (never write tests, never auto-delete entries, require confirmation, mark missing paths as pending discovery). |
| Vertical Slice Integration `commands/tasks.md` | Require artifact deletion after verification for slices (except Feature Testing & Regression), add per-slice cleanup sub-task, enforce exact Feature Testing & Regression slice template (including `testing-expert` and `/awos:regression [spec-directory-name]` usage), and update example slices accordingly. |
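
To make the DUPLICATE/EXTEND/NEW classification concrete, here is a small sketch. The walkthrough names the three outcomes but not the matching criteria, so the rule used here is an assumption: entries are keyed on file and test name, and the behavior text decides between DUPLICATE and EXTEND. `Entry` and `classify` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    file: str
    test_name: str
    behavior: str


def classify(candidate: Entry, suite: list[Entry]) -> str:
    """Assumed rule: same (file, test_name) and same behavior is a DUPLICATE,
    same key with a different behavior is an EXTEND, anything else is NEW."""
    for existing in suite:
        if (existing.file, existing.test_name) == (candidate.file, candidate.test_name):
            if existing.behavior.strip().lower() == candidate.behavior.strip().lower():
                return "DUPLICATE"
            return "EXTEND"
    return "NEW"
```

Under this rule a DUPLICATE would be skipped, an EXTEND would update the existing row, and a NEW candidate would be added, all only after the user confirms.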

Sequence Diagram

sequenceDiagram
  participant User
  participant Command as /awos:regression
  participant TaskFile as tasks.md
  participant SuiteFile as regression-suite.md
  participant Runner as Test Runner
  participant Report as Regression Report

  User->>Command: Invoke /awos:regression [spec]
  Command->>TaskFile: Extract test candidates (annotations or fallback)
  Command->>SuiteFile: Detect existing entries (duplicates/extensions)
  Command->>User: Confirm changes (proceed/review/cancel)
  User->>Command: Approve updates
  Command->>SuiteFile: Update suite (add/merge/skip entries)
  Command->>SuiteFile: Update metadata (Last updated, totals)
  Command->>User: Ask to run tests
  User->>Command: Confirm or skip execution
  Command->>Runner: Execute tests (full or new-only)
  Runner->>Report: Generate dated report (results + recommendations)
  Report->>User: Return report

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A regression suite is born today,
with @spec tags showing the way,
duplicates caught, extensions penned,
reports dated, fixes to append,
cleanup tidy — the meadow hops along!

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| ---------- | ------ | ----------- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Title check | ✅ Passed | The title accurately describes the main change: introducing a new /awos:regression command with supporting documentation and templates. |



coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@commands/regression.md`:
- Around line 55-66: The guidance in Step 2 and the later promotion step
contradict each other: ensure the command prioritizes annotated test files first
and only uses context/spec/[target-spec]/tasks.md when no `@spec`/`@regression`
annotations are found; specifically modify the promotion logic referenced at
"Line 230" to conditionally promote entries from tasks.md only when the initial
annotated-file search returns zero candidates, and update the related text to
state this explicit fallback behavior and that promoted entries are marked
"pending discovery."
- Around line 69-75: The fenced code blocks in the document (notably the
Markdown table and other sample blocks referenced in the review) are missing
language identifiers; update each opening fence to include a language tag (e.g.,
change ``` to ```markdown) for the table block and the other fenced blocks
starting around the sample sections so the linter MD040 is satisfied, ensuring
every triple-backtick has a language specifier.

In `@commands/tasks.md`:
- Line 130: The bold/escape is malformed in the regression example text: replace
the accidental escaped closing bold marker `agent.\*\*` with a proper closing
bold marker so the sentence reads using normal Markdown bolding, e.g.
`> **Requires \`testing-expert\` agent.**`. If necessary, ensure the surrounding
backticks around testing-expert remain and both opening and closing `**` are
present to render the emphasis correctly (locate the string that starts with
`> **Requires` in the tasks example).
- Around line 72-74: The markdown fenced code blocks that contain checklist
items (e.g., the block starting with "- [ ] Cleanup: Delete any screenshots,
videos, or e2e scripts generated during this slice's verification. **[Agent:
general-purpose]**" and the subsequent fenced block covering lines 83–103) lack
language identifiers and trigger MD040; update the opening fences to include an
explicit language (for example, use ```markdown) for both fenced blocks so
linting passes.


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: bb87aaa5-ee6f-4fba-93c9-c2cc1f8eb8cc

📥 Commits

Reviewing files that changed from the base of the PR and between b749393 and 544797a.

📒 Files selected for processing (2)

commands/regression.md
commands/tasks.md

- Add language tags to fenced code blocks in regression.md and tasks.md (MD040)
- Fix malformed bold escape in tasks.md example (agent.\*\* → agent.**)
- Clarify Step 2 fallback priority in regression.md: annotated files first, tasks.md only when zero annotations found

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

commands/regression.md (1)

230-230: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Unify candidate-source rule with Step 2 fallback logic

Line 230 contradicts Step 2: the flow says “annotations first, tasks.md only as fallback,” but this line says promotion is only from tasks.md. This can cause incorrect command behavior.

Suggested wording fix

-- Never write new tests — only promote existing ones from tasks.md.
+- Never write new tests — only promote existing tests discovered from annotated test files (`@spec` + `@regression`), or from `tasks.md` only when annotation discovery returns zero candidates.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@commands/regression.md` at line 230, Update the candidate-source rule text so
it matches Step 2's fallback logic: replace the sentence "Never write new tests
— only promote existing ones from tasks.md." with a unified rule that prefers
annotations as the primary source for test candidates and only uses tasks.md as
a fallback; ensure references to "candidate-source", "Step 2", "annotations",
and "tasks.md" are consistent so the doc clearly states "use annotations first;
if no annotations exist, promote existing tests from tasks.md."

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@commands/regression.md`:
- Line 27: Update the wording so the "Empty = auto-detect the most recently
completed spec (all tasks ✅, Status Completed)" promise matches Step 1 behavior:
either add a deterministic recency rule to Step 1 (e.g., "When multiple
completed specs exist, choose the one with the latest completed_at timestamp")
or change the Line 27 phrase to a relaxed form like "auto-detect a recently
completed spec" and apply the same change to the similar block covering lines
44-51; reference the "Empty = auto-detect..." line and the Step 1 selection
description when making the edit.



ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: a1ae317f-75c3-4d05-842e-97517e69e569

📥 Commits

Reviewing files that changed from the base of the PR and between 544797a and 285bdba.

📒 Files selected for processing (2)

commands/regression.md
commands/tasks.md

🚧 Files skipped from review as they are similar to previous changes (1)

commands/tasks.md

coderabbitai · 2026-05-11T08:22:56Z

+# INPUTS & OUTPUTS
+
+- **User Prompt (Optional):** <user_prompt>$ARGUMENTS</user_prompt>
+  - Empty = auto-detect the most recently completed spec (all tasks ✅, Status Completed)


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Align “most recently completed spec” promise with Step 1 selection behavior

Line 27 promises automatic selection of the most recently completed spec, but Step 1 currently asks the user to choose when multiple candidates exist and doesn’t define recency sorting. Please either define a deterministic recency rule in Step 1 or relax the wording in Line 27.

Suggested wording fix (minimal)

-- Empty = auto-detect the most recently completed spec (all tasks ✅, Status Completed)
+- Empty = auto-detect a completed spec candidate (all tasks ✅, Status Completed); if multiple are found, ask the user to choose

Also applies to: 44-51

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@commands/regression.md` at line 27, Update the wording so the "Empty = auto-detect the most recently completed spec (all tasks ✅, Status Completed)" promise matches Step 1 behavior: either add a deterministic recency rule to Step 1 (e.g., "When multiple completed specs exist, choose the one with the latest completed_at timestamp") or change the Line 27 phrase to a relaxed form like "auto-detect a recently completed spec" and apply the same change to the similar block covering lines 44-51; reference the "Empty = auto-detect..." line and the Step 1 selection description when making the edit.
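
One way to picture the "deterministic recency rule" option is the sketch below. It assumes each spec exposes a `completed_at` timestamp, which the review comment only offers as an example and the PR does not define; `SpecStatus` and `pick_target_spec` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class SpecStatus:
    name: str
    all_tasks_done: bool
    status: str
    completed_at: Optional[datetime]  # hypothetical field, not defined in the PR


def pick_target_spec(specs: list[SpecStatus]) -> Optional[SpecStatus]:
    """Pick the completed spec with the latest completed_at; ties break by name."""
    completed = [
        s for s in specs
        if s.all_tasks_done and s.status == "Completed" and s.completed_at is not None
    ]
    if not completed:
        return None
    return max(completed, key=lambda s: (s.completed_at, s.name))
```

If no such timestamp exists in the project, the relaxed wording (ask the user when several completed specs are found) avoids promising an ordering that cannot be computed.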

AndreyNenashev · 2026-05-11T12:45:52Z

+    - `[ ] **Slice 3: Feature Testing & Regression**`
+      - `> Verifies the complete feature works end-to-end as described in functional-spec.md.`
+      - `> Run AFTER all implementation slices are complete.`
+      - `> **Requires \`testing-expert\` agent.\*\* If it is not present in \`.claude/agents/\`, stop and run \`/awos:hire\` before executing this slice.`


A slice is something that covers the implementation of some functionality, something that can be committed. It feels like the "regression" command should be similar to "verify". What do you think?

dustyo-O · 2026-05-21T14:12:50Z

+# INPUTS & OUTPUTS
+
+- **User Prompt (Optional):** <user_prompt>$ARGUMENTS</user_prompt>
+  - Empty = auto-detect the most recently completed spec (all tasks ✅, Status Completed)


Get the most recent spec from the diff, or look up the commits to see which feature was added most recently.

dustyo-O · 2026-05-21T14:17:07Z

+## Step 1: Identify target spec
+
+1. Read `<user_prompt>`. If it names a spec, use that directory.
+2. If empty, scan `context/spec/*/tasks.md` files. Find the spec where:


I also suggest looking up the log for context/spec, finding the most recent commit, and checking exactly which feature was added there.

dustyo-O · 2026-05-21T14:21:45Z

+- **File** — the test file path (already known from the search)
+- **Test Name** — the test function name
+
+**Fallback (only if primary source returns zero results):** Read `context/spec/[target-spec]/tasks.md`. Find the "Feature Testing & Regression" slice and list each `**[Agent: testing-expert]**` sub-task as a single candidate entry, marking Layer/Behavior/Polarity as "pending discovery". Inform the user that annotations were not found and entries are marked for future discovery.


I'm a bit confused that we tie the command to a single agent. What if we use this command on a brownfield project with existing tests? Or what if, in some particular project, the tests are written by client-side specialists? With this pipeline, we provide no value in those cases. Wouldn't it be more reliable to map existing feature specs onto the tests to detect which ones are primary for regression?

dustyo-O · 2026-05-21T14:28:02Z

+
+### Integration
+
+...


That could confuse the LLM about what kind of information is expected here; it's better to provide a template even if it's copy-paste.

dustyo-O · 2026-05-21T14:28:27Z

+
+| File               | Test Name          | Behavior                       | Polarity | Status | Notes |
+| ------------------ | ------------------ | ------------------------------ | -------- | ------ | ----- |
+| tests/test_auth.py | test_token_payload | token payload, expiry, signing | positive | OK     | —     |


What exactly can go into Notes? What is the value of this field?

dustyo-O · 2026-05-21T14:33:25Z

+3. If user chooses to run:
+   - Detect test runner: check for `docker-compose.yml`, `Makefile` (with `test` target), `package.json` (`test` script), `pytest.ini` / `pyproject.toml`, `justfile`.
+   - If runner found: spin up infrastructure if needed, run the selected tests, capture output.
+   - If NO runner found: inform user — "No test runner detected. Tests are saved in regression-suite.md. Run them manually using your project's test command." Proceed to Step 7.


Doesn't it make sense to ask the user how the tests are supposed to be run and, if the user provides that information, store it somewhere in the MD? If the user explicitly answers that the tests are run manually, store that as well, to avoid repeating the detection loop later.
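
The detection order in the diff, combined with the suggestion to remember the user's answer, might look like the following sketch. The marker files come from the quoted step; the returned labels, the function names, and the idea of passing in a previously saved choice are assumptions, and where that choice would be stored is left open.

```python
import json
import re
from pathlib import Path
from typing import Optional


def detect_runner(root: Path) -> Optional[str]:
    """Return the first runner marker found under root, or None."""
    if (root / "docker-compose.yml").is_file():
        return "docker-compose.yml"
    makefile = root / "Makefile"
    # Only count a Makefile that actually declares a `test` target.
    if makefile.is_file() and re.search(r"^test\s*:", makefile.read_text(), re.MULTILINE):
        return "Makefile"
    package = root / "package.json"
    if package.is_file():
        scripts = json.loads(package.read_text()).get("scripts", {})
        if "test" in scripts:
            return "package.json"
    # Note: pyproject.toml alone does not prove pytest is configured.
    if (root / "pytest.ini").is_file() or (root / "pyproject.toml").is_file():
        return "pytest"
    if (root / "justfile").is_file():
        return "justfile"
    return None


def resolve_runner(root: Path, saved_choice: Optional[str]) -> Optional[str]:
    """Prefer a previously saved user answer (including "manual") over detection."""
    if saved_choice is not None:
        return saved_choice
    return detect_runner(root)
```

With a saved answer of "manual", detection is skipped entirely on later runs, which is the loop the comment is trying to avoid.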

dustyo-O · 2026-05-21T14:37:03Z

-  8.  For each slice's verification sub-task, identify required MCPs/services (browser MCP, curl, database access, etc.) and note any that may be missing.
+  5.  After the verification sub-task, add a cleanup sub-task as the last item of the slice:
+      ```md
+      - [ ] Cleanup: Delete any screenshots, videos, or e2e scripts generated during this slice's verification. **[Agent: general-purpose]**


On the one hand, this instruction is likely to be lost after compaction; on the other, the artifacts have some value for debugging. Consider creating a gitignored folder for the artifacts on the first run.

"e2e scripts" is a worryingly broad definition.

- hire.md: restructure complementary pairs to search-first pattern with testing-expert as fallback; remove duplicate Python entry; replace playwright MCP with playwright CLI
- qa.md: update description to "QA health check"; add user confirmation for full-scope audit; make architecture.md required with warning; rewrite Step 3 to reflect list-of-tests.md is maintained by testing-expert, not created by /awos:qa; implement risk-based gap analysis in Step 5 (two-pass: project-level + per-AC); implement coordinator pattern in Step 6 (specialist agent writes, testing-expert validates and updates registry); add guard for missing functional-spec; remove regression suite steps (moved to PR #114); add staleness and delta-coverage notes to TODO

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The command had no active Claude Code wrapper and its functionality is covered by two dedicated commands:

- testing-expert agent (proactive test writing via /awos:tasks)
- /awos:regression (regression suite management, PR #114)

Retroactive staleness/gap auditing can be revisited as a plugin if needed in the future.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Two cleanups to the emitted slice, from testing the branch on a real spec:

- Collapse the three blockquote note lines into one description. The "QA agent for this slice: `{agent}` (selected from .claude/agents/ and the Agent tool's description block)" line leaked the internal selection rationale into the generated artifact — the `**[Agent: ...]**` markers already say who runs the slice.
- Drop the `` block. Its trigger lives in a different repo (the feat/regression PR), so it can't be tracked from here and just sits as dead commentary in every user's tasks.md. The `/awos:regression` wiring belongs in PR #114, which owns that command.

…ication hardening (#109)

* feat: add testing-expert agent for test pyramid generation
* fix: address code quality issues in testing-expert agent
* feat: extend /awos:tasks to generate test pyramid tasks per vertical slice
* fix: split positive/negative test examples into separate tasks in tasks.md
* feat: add /awos:qa optional full-audit command
* fix: restore testing-expert cross-reference and improve qa.md clarity
* feat: add QA context templates for test registry and regression suite
* docs: add QA pyramid agent implementation plan
* chore: remove unnecessary testing-expert command wrapper — agent is internal-only
* docs: remove testing-expert wrapper entry from plan file table
* fix: resolve contradictions in testing-expert and qa commands
* docs(qa): add TODO section with known limitations
  Documents three open gaps discovered during end-to-end testing: ephemeral E2E artifacts, coverage-by-inspection vs measurement, and no regression baseline/enforcement mechanism.
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Delete docs/superpowers/plans/2026-04-07-qa-pyramid-agent.md
* chore: fix prettier formatting across markdown files
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(tasks): sequential numbering, opt-out tests, ignore docs/superpowers
  - Renumber thought process steps sequentially (3b → 5, shift 5-8 → 6-9)
  - Make test generation opt-out instead of REQUIRED
  - Add docs/superpowers/ to .gitignore and untrack pushed files
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(tasks): sequential numbering and opt-out test generation
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(testing-expert): move from commands/ to plugins/awos/agents/
* fix(tasks): replace vague 'planning mode' with explicit Agent tool invocation
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(tasks): clarify Agent tool context-passing and move opt-out check first
* fix(tasks): rebalance examples to show layer judgment and consistent test coverage
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(testing-expert): replace caller-based mode detection with condition-based
* fix(tasks): add missing verify sub-tasks to JWT and avatar Slice 1 examples
* fix(qa): remove stale testing-expert cross-reference, clarify e2e-tester, add regression-suite template init
* fix: clean up garbled e2e-tester TODO and rename mode headings to match condition-based detection
* fix: update stale 'execution mode' reference and restore step 4 formatting
* fix(testing-expert): remove remaining caller-specific references from role description and output format
* fix(tasks): use Task tool instead of Agent tool for testing-expert invocation
* fix(installer): deploy testing-expert agent to .claude/agents/ during setup
* fix(tasks): move Verify step after test sub-tasks in Slice 2 avatar example
* fix(testing-expert): normalize positive/negative suffix format across all pyramid layers
* fix(template): clarify test registry maintenance attribution
* fix: delegate qa gap tests to testing-expert, fix code fence, update installer docs
  - commands/qa.md: Step 6 now delegates to testing-expert via Task tool instead of writing tests inline; update TODO to reflect current architecture
  - commands/tasks.md: fix malformed single-backtick example block → proper code fence
  - plugins/awos/agents/testing-expert.md: Step 7 uses generic "caller" instead of hardcoded /awos:implement; add explicit completion signal on no-gap path
  - src/CLAUDE.md: add plugins/awos/agents/ → .claude/agents/ row to copy table
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(tasks): replace per-slice testing with Feature Testing & Regression final slice
* feat(qa): remove /awos:qa slash command — preserved as future plugin reference
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(regression): add /awos:regression command with dedup, confirmation, run, and report
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(regression): fix extraction logic, fully-processed check, and constraints wording
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(regression): add claude/commands wrapper for /awos:regression
* fix(regression): strip wrapper to standard pattern — remove prose and double-tagged $ARGUMENTS
* fix(installer): remove testing-expert auto-deploy — now hired via awos-recruitment
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(verify): enforce mandatory Step 3 with fallback chain — prevent silent skips
* feat(template): update regression-suite with layered format; fix stale qa-context attribution
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(testing-expert): remove from awos core — now lives in awos-recruitment registry
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(hire): add QA complement rule — auto-suggest testing-expert alongside tech agents
* fix(audit): update SDD-07 to recognize Feature Testing & Regression slice model
  Check item 7 now handles both AWOS 2.x (single final QA slice) and the legacy per-slice QA assignment model, preventing false WARN/FAIL on projects that use the new tasks format.
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore(tasks): remove regression files, comment out /awos:regression sub-task
  Regression command moves to feat/regression branch. /awos:regression sub-task in Feature Testing & Regression slice is commented out with TODO — uncomment when feat/regression merges.
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(tasks): add artifact cleanup sub-task after each implementation slice
  After a slice's verification, agents now delete temporary artifacts (screenshots, videos, e2e scripts) generated by e2e-tester or browser MCP. The Feature Testing & Regression slice is explicitly excluded — its artifacts are retained for the regression suite.
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore: fix prettier formatting
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address CodeRabbit review comments
  - Add language tags to fenced code blocks in tasks.md and qa.md (MD040)
  - Add plugins/awos/agents → .claude/agents copy operation to setup-config.js to match src/CLAUDE.md documentation
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore: fix prettier formatting
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address review comments on hire.md and qa.md
  - hire.md: restructure complementary pairs to search-first pattern with testing-expert as fallback; remove duplicate Python entry; replace playwright MCP with playwright CLI
  - qa.md: update description to "QA health check"; add user confirmation for full-scope audit; make architecture.md required with warning; rewrite Step 3 to reflect list-of-tests.md is maintained by testing-expert, not created by /awos:qa; implement risk-based gap analysis in Step 5 (two-pass: project-level + per-AC); implement coordinator pattern in Step 6 (specialist agent writes, testing-expert validates and updates registry); add guard for missing functional-spec; remove regression suite steps (moved to PR #114); add staleness and delta-coverage notes to TODO
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* remove: delete commands/qa.md
  The command had no active Claude Code wrapper and its functionality is covered by two dedicated commands:
  - testing-expert agent (proactive test writing via /awos:tasks)
  - /awos:regression (regression suite management, PR #114)
  Retroactive staleness/gap auditing can be revisited as a plugin if needed in the future.
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(tasks): add --no-tests / skip tests flag to suppress verification and testing slice
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore: fix prettier formatting in tasks.md
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: resolve 3 consistency issues found in PR review
  - hire.md line 109: playwright MCP → playwright CLI (matches line 116 and global CLAUDE.md rule; MCP vs CLI was a contradiction)
  - hire.md line 120: complete the Terraform/IaC entry — was truncated with no agent reference
  - tasks.md example: replace "chrome MCP" with playwright-cli phrasing (legacy example text, inconsistent with playwright-cli convention)
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(tasks): trim Feature Testing & Regression slice noise
  Two cleanups to the emitted slice, from testing the branch on a real spec:
  - Collapse the three blockquote note lines into one description. The "QA agent for this slice: `{agent}` (selected from .claude/agents/ and the Agent tool's description block)" line leaked the internal selection rationale into the generated artifact — the `**[Agent: ...]**` markers already say who runs the slice.
  - Drop the `` block. Its trigger lives in a different repo (the feat/regression PR), so it can't be tracked from here and just sits as dead commentary in every user's tasks.md. The `/awos:regression` wiring belongs in PR #114, which owns that command.
* fix(verify): require real UI rendering + screenshots in docs/screenshots/
  Reframing /awos:verify as "look-and-feel" wasn't enough — on a UI-heavy spec the agent satisfied it with in-process component tests (NiceGUI test-client / pytest) and never rendered the UI, so no visual evidence was produced.
  - Visual/UI acceptance criteria now MUST be verified by driving the actual running UI through the project's browser-automation tool (Playwright MCP/CLI, Cypress, chrome MCP, …). A passing component or test-client test confirms logic, not look-and-feel, and no longer counts as evidence for a visual criterion. Non-visual criteria keep the pick-by-fit freedom.
  - Screenshots are saved to `docs/screenshots/` — the same evidence folder the testing-expert agent (awos-recruitment) writes E2E captures to — named `<spec-directory>-<state>.png` so they sort by spec. The browser tool creates the folder on first write; verify does NOT edit .gitignore (git-ignoring docs/screenshots/ is one-time project setup), matching testing-expert's scope guarantee. The report lists the paths.
  - skip-tests still suppresses test suites but not look-and-feel.

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Aleksandr Makarov <amakarov@provectus.com>

FlySpot and others added 2 commits May 7, 2026 20:15

FlySpot requested a review from kmakarychev-dev May 7, 2026 18:25

chore: fix prettier formatting

544797a

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai Bot reviewed May 7, 2026


FlySpot and others added 2 commits May 11, 2026 11:19

chore: fix prettier formatting

285bdba

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai Bot reviewed May 11, 2026


FlySpot mentioned this pull request May 11, 2026

feat: Feature Testing & Regression slice, QA audit command, and verification hardening #109

Merged

4 tasks

FlySpot changed the title ~~Feat/regression~~ feat: add /awos:regression command May 11, 2026

AndreyNenashev reviewed May 11, 2026


dustyo-O reviewed May 21, 2026


FlySpot changed the title ~~feat: add /awos:regression command~~ Draft: feat: add /awos:regression command Jun 8, 2026

