feat: /awos:flow — generates the project's /implement-feature delivery flow#134

Open: AlexanderMakarov wants to merge 27 commits into main from feat/flow-command

@AlexanderMakarov AlexanderMakarov commented Jun 10, 2026

Contributor

What

New run-once setup command /awos:flow that generates a project's end-to-end delivery automation. Every team's SDLC differs (ticket source, git flow, repo topology, review gates, delivery, trigger), so AWOS ships a generator, not a static flow: /awos:flow investigates the repo, reads team docs, interviews the user across six dimensions, and emits two artifacts:

  • context/product/delivery-flow.md — durable decision record, one section per dimension plus a per-service Tooling Inventory and a Local Customizations section. Single source of truth; flow-agnostic so a future /fix-bug generator reuses it as a second consumer.
  • .claude/commands/implement-ticket.md — project-specific command that drives one ticket through /awos:spec → tech → tasks → implement → verify plus the team's own delivery steps (branch/worktree prep, spec commits, review gates, delivery hand-off, ticket transition). Lives in the project's own command namespace, deliberately outside .claude/commands/awos/, so neither the installer nor framework updates touch it.

Design points

  • Transport preference: investigation inventories CLI tools on PATH, MCP servers, and plugins per external service; CLI wins when both cover the same service (faster, cheaper in tokens). Chosen transport + fallback are recorded per service.
  • Worktree sub-interview: worktrees are only offered after a shared-resource questionnaire (ports, DB, docker names/volumes, tunnels, devices, non-clonable services); otherwise main-repo-only with the blocking reason recorded.
  • Regeneration contract: every generated stage is fenced with <!-- awos:flow:stage=... --> markers. Re-runs reconcile per stage — semantically unchanged stages regenerate silently; manually edited stages ask keep/drop/merge, and keep promotes the edit into Local Customizations so future regenerations preserve it automatically.
  • Unattended-safe (per #132, "/awos:* prompts gate the deliverable Write on AskUserQuestion approval, breaking claude -p"): a skipped or unanswered question is never a stop signal. Documented defaults apply (keep wins reconciliation conflicts), and both deliverable writes are unconditional.
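
The regeneration contract can be pictured with a minimal sketch. It assumes each stage is opened by a `<!-- awos:flow:stage=NAME -->` marker and runs until the next marker. The closing convention, and how /awos:flow actually decides that a stage is "semantically unchanged", are not specified in this description, so the whitespace-normalized comparison below is only an illustrative stand-in.

```python
import re

# Opening marker as quoted in the PR description; everything else is an assumption.
MARKER = re.compile(r"<!-- awos:flow:stage=([a-z0-9-]+) -->")

def split_stages(text):
    """Map stage name -> body, where a body runs until the next marker."""
    parts = MARKER.split(text)
    # With one capture group, parts = [preamble, name1, body1, name2, body2, ...]
    return dict(zip(parts[1::2], parts[2::2]))

def normalize(body):
    """Collapse whitespace so reflowed but otherwise identical text compares equal."""
    return " ".join(body.split())

def classify(previous_output, on_disk):
    """Label each on-disk stage 'unchanged' or 'edited' relative to the last generation."""
    before = split_stages(previous_output)
    now = split_stages(on_disk)
    return {
        name: "unchanged" if normalize(before.get(name, "")) == normalize(body) else "edited"
        for name, body in now.items()
    }

generated = (
    "<!-- awos:flow:stage=verify -->\nRun /awos:verify.\n"
    "<!-- awos:flow:stage=merge -->\nAsk before merging.\n"
)
edited = generated.replace("Ask before merging.", "Ask twice before merging.")
result = classify(generated, edited)
```

Under these assumptions only the `merge` stage would raise the keep/drop/merge question; `verify` would regenerate silently.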

Chain integration

  • commands/architecture.md / commands/hire.md next-command pointers: hire → flow → spec.
  • New E2E-06 check in the ai-readiness-audit end-to-end-delivery dimension recommends /awos:flow when the artifacts are missing or stale; plugin bumped 2.1.0 → 2.2.0 in both manifests.

Tests

  • Five new Layer-1 lint tests pin the contracts: flow.md path wiring (templates + both artifacts + CLI preference + Explore delegation), stage markers + AWOS chain + orchestrator guard in the skeleton, Local Customizations + Tooling Inventory sections, chain pointers, E2E-06 presence. npm test: 83/83 pass; prettier clean.
  • Behavioral coverage: two new auto e2e scenarios in awos-qa (branch feat/flow-e2e-scenarios, companion PR) — fresh generation passed 13/13 checks, re-run manual-edit preservation passed 9/9 checks against this branch. Merge this PR first; the awos-qa scenarios install AWOS from the checkout under test.

Summary by CodeRabbit

Release Notes v2.2.0

  • New Features

    • Introduced the /awos:flow delivery-flow generator to guide repository discovery and multi-step interviews, producing durable delivery decisions and an executable implementation workflow.
  • Documentation

    • Updated plugin metadata and description to better reflect delivery-flow setup assistance.
    • Added/expanded delivery-flow and feature-implementation templates to standardize prompts, gating, and regeneration behavior.
  • Tests

    • Strengthened prompt/template contract validation, including new /awos:flow coverage.
  • Chores

    • Updated ignore rules for additional local workspace artifacts.

… flow

New run-once setup command that slots after /awos:hire. It investigates
the repo (CI, Makefiles, git topology, existing commands) and inventories
per-service transports (CLI on PATH preferred over MCP — faster and
cheaper in tokens), reads team docs, interviews across six dimensions
(ticket source, git flow incl. a worktree shared-resource sub-interview,
repo topology, review gates, delivery, trigger), then generates two
artifacts:

- context/product/delivery-flow.md — durable decision record (single
  source of truth; flow-agnostic so a future /fix-bug generator reuses it)
- .claude/commands/implement-ticket.md — project-specific command driving
  a ticket through spec → tech → tasks → implement → verify plus the
  team's own delivery steps, fenced with awos:flow stage markers
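
The transport preference described above can be sketched as a small selection function. This is an illustration under stated assumptions, not the command's implementation: /awos:flow is a prompt, and the service names, CLI names, and the `cli:`/`mcp:` labels below are hypothetical. Only the rule itself (a CLI on PATH wins over an MCP server covering the same service, with the other recorded as fallback) comes from the description.

```python
import shutil

def choose_transport(service, cli_names, mcp_servers, which=shutil.which):
    """Return (chosen, fallback) for one service; a CLI on PATH beats an MCP server."""
    cli = next((name for name in cli_names if which(name)), None)
    mcp = service if service in mcp_servers else None
    if cli and mcp:
        return (f"cli:{cli}", f"mcp:{mcp}")
    if cli:
        return (f"cli:{cli}", None)
    if mcp:
        return (f"mcp:{mcp}", None)
    return (None, None)

# Hypothetical environment: only `gh` is on PATH; MCP servers exist for both services.
fake_which = lambda name: "/usr/bin/gh" if name == "gh" else None
both = choose_transport("github", ["gh"], {"github", "jira"}, which=fake_which)
mcp_only = choose_transport("jira", ["jira"], {"github", "jira"}, which=fake_which)
```

Injecting `which` keeps the sketch testable without depending on what is installed on the machine running it.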

Re-runs reconcile per stage: untouched stages regenerate silently;
manually edited stages ask keep/drop/merge, where "keep" promotes the
edit into the decision record's Local Customizations section. Per the
#132 pattern, a skipped or unanswered question is never a stop signal —
documented defaults apply (keep wins reconciliation conflicts) and both
writes are unconditional, so unattended runs produce the deliverables.

Also: hire/architecture next-command pointers, E2E-06 audit check
recommending /awos:flow (plugin 2.1.0 → 2.2.0), and five lint tests
pinning the template/path/marker contracts.

Verified by two new auto e2e scenarios in awos-qa
(feat/flow-e2e-scenarios): fresh generation 13/13 checks, re-run
manual-edit preservation 9/9 checks.

coderabbitai Bot commented Jun 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Introduces the /awos:flow command for AWOS 2.2.0: a delivery-flow generator spec (flow.md), two new Markdown templates (delivery-flow-template.md, implement-feature-template.md), version bumps in both plugin manifests, prompt-contract tests covering all new artifacts, and minor .gitignore additions.

Changes

Delivery Flow Automation

| Layer | File(s) | Summary |
| --- | --- | --- |
| Plugin manifest and version bumps | .claude-plugin/marketplace.json, plugins/awos/.claude-plugin/plugin.json | Both manifests bump from 2.1.0 to 2.2.0, expand the description to reference /awos:flow, and add the delivery-flow keyword. |
| /awos:flow command spec | plugins/awos/commands/flow.md | Defines the complete delivery-flow generator: fresh vs re-run mode detection, repository investigation, seven-dimension interview model with AskUserQuestion rules, reuse/replace/compose reconciliation, decision-record and implement-feature command generation, and a required final summary. |
| delivery-flow-template.md decision record | plugins/awos/templates/delivery-flow-template.md | Populates the template with placeholder-driven sections for feature sourcing, branching/sync, repo topology, review gates, merge and post-merge policy, deployment/DoD, trigger options, tooling inventory with stage-automation tracking, context strategy, notifications, local customizations, and a generation log. |
| implement-feature-template.md staged pipeline | plugins/awos/templates/implement-feature-template.md | Defines the full per-feature pipeline: header/args, context-discipline rules, fetch/normalize + resume detection, workspace prep, sequential specs generation via AWOS chain, commit specs, /awos:implement delegation, verification, local review, commit/push, remote gate orchestration via Monitor, human-confirmed merge, delivery, and close-the-loop reporting with provenance footer. |
| Prompt-contract tests and gitignore | tests/lint-prompts.test.js, .gitignore | Adds plugin command discovery to the cross-reference resolver; adds three test blocks asserting contract constraints on flow.md, implement-feature-template.md (stage markers, guard text, Monitor usage, canonical stage order), and delivery-flow-template.md (required sections). Adds tmp/ and review/ to .gitignore. |

Sequence Diagram

sequenceDiagram
  participant Main as Main Context
  participant Sub as Subagent
  participant Gates as Remote Gates / Monitor
  participant Human as Human Confirmation

  Main->>Main: fetch & normalize ticket
  Main->>Main: detect resume point via flow-log
  Main->>Main: prepare workspace (branch, worktree)
  Main->>Sub: /awos:spec, /awos:tech, /awos:tasks
  Sub-->>Main: commit specs
  Main->>Sub: /awos:implement (all coding delegated)
  Sub-->>Main: verify & local review
  Main->>Main: commit & push (exclude .env/secrets)
  Main->>Gates: open change-request, Monitor all gates
  Gates-->>Main: all gates pass
  Main->>Main: re-check mergeability
  Main->>Human: confirm merge
  Human-->>Main: approved
  Main->>Main: merge, post-merge CI, deliver, close loop

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

enhancement

Suggested reviewers

  • workshur

Poem

🐇 A flow has been born, from ticket to merge,
With stages and guards and a Monitor's urge,
The spec and the templates now fill every slot,
Subagents dispatch every last coding thought,
No secret commits, no nested loops astray —
The rabbit delivers a feature today! 🎉

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: introducing /awos:flow, a new setup command that generates project-specific delivery flow automation. It is specific, concise, and clearly communicates the primary contribution. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


…ndent dimensions, neutral factual options

Manual-run feedback on the interview:

- The team-docs question now runs as a single standalone question before
  any dimension, and reachable docs are read before the interview is
  derived — dimensions the docs answer are settled, not asked.
- Independent open dimensions are batched into one AskUserQuestion call
  (up to four questions); sequencing is reserved for answers that feed
  other questions (worktree sub-interview, connector-specific details).
- "(Recommended)" only when investigation gives evidence to prefer an
  option; factual questions get neutral options. Options that all funnel
  into the same free-text follow-up collapse into one.
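
The batching rule above can be sketched as follows. This is a hedged illustration only: the dimension names follow the six listed in the PR description, while the `depends_on` mapping and the function names are hypothetical, introduced just to show how independent questions could be grouped into calls of at most four while dependent ones stay sequenced.

```python
def batch_questions(open_dimensions, max_per_call=4):
    """Group independent open dimensions into batches of at most four questions."""
    return [
        open_dimensions[i:i + max_per_call]
        for i in range(0, len(open_dimensions), max_per_call)
    ]

def plan_interview(dimensions, settled_by_docs, depends_on):
    """Split dimensions into batched independent questions and sequenced follow-ups."""
    still_open = [d for d in dimensions if d not in settled_by_docs]
    independent = [d for d in still_open if d not in depends_on]
    sequenced = [d for d in still_open if d in depends_on]
    return batch_questions(independent), sequenced

dimensions = [
    "ticket source", "git flow", "repo topology",
    "review gates", "delivery", "trigger", "worktree sub-interview",
]
batches, sequenced = plan_interview(
    dimensions,
    settled_by_docs={"repo topology"},                     # answered by team docs, so not asked
    depends_on={"worktree sub-interview": "git flow"},     # needs the git-flow answer first
)
```

Here five independent dimensions remain open, so they are asked in two calls (four questions, then one), and the worktree sub-interview waits for the git-flow answer.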

AlexanderMakarov commented Jun 10, 2026

Contributor Author

Example of installation in https://github.com/provectus/sde-automation (use https://jsonl.qent.io/ or a similar tool for CC sessions):

  1. First try:
  2. At af27d93: despite the bigger diff with more features, it consumed 48% fewer tokens (~359k), ran 41% faster (2h11m), used 21% less context (258k), and produced a more sophisticated flow (with a fallback; the settings UI fix is included):

Aleksandr Makarov added 6 commits June 10, 2026 19:30
…ersists pointers; broaden CLI probe

Second round of manual-run feedback:

- Step 3 now shows which process docs the investigation already found
  before asking whether more exist; the question is about what's beyond
  that list.
- Doc pointers are inherently free-form — one listed option ("No —
  that's everything") plus the built-in free-text input, which accepts
  any locator: URL, file path, or resource name + identifier (Slack
  channel/message, Google Doc title). No follow-up multiple-choice about
  where docs live, and never an option whose description says to pick
  "Other" instead.
- Every pointer (found or provided) is persisted in the decision
  record's Team Docs Consulted list so re-runs and future flow
  generators re-read instead of re-asking.
- Step 2 CLI probe extended with aws and playwright (cloud CLIs for the
  deployment target, browser automation for UI verification gates).
The flow ended at PR creation; a live run showed CI failing right after.
Extend the generated /implement-ticket past that point:

- implement-ticket-template: new ci-monitor stage (platform checks via the
  chosen transport, wait-and-fix loop or hand-off per recorded policy) and
  merge stage (human-merges vs. flow-merges, post-merge pipeline watch).
  The per-run merge confirmation is fixed prose: a skipped confirmation
  means do not merge — the deliberate inverse of the #132 skip-default,
  because merging is irreversible.
- delivery-flow-template: §4 gains the CI-on-change-request gate; §5 gains
  Merge policy and Post-merge CI fields.
- flow.md: Step 2 maps CI triggers (change request / push / merge-to-base),
  dimensions 4–5 interview the policies; all stages stay VCS-agnostic:
  capabilities bound to §7 transports, degrading to local git merge and
  the local test suite for repos without a code host.
- lint tests: assert both stage markers, the confirmation guard, and the
  §5 fields.
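
The two opposite skip behaviours can be contrasted in a minimal sketch. The answer strings and function names are hypothetical, and a skipped or unanswered question is modelled as `None`. The only rules taken from the commit message are that reconciliation falls back to "keep" while a skipped merge confirmation means "do not merge".

```python
def resolve_reconciliation(answer):
    """#132-style default: a skipped or unrecognized answer falls back to 'keep'."""
    return answer if answer in {"keep", "drop", "merge"} else "keep"

def may_merge(answer):
    """Deliberate inverse: only an explicit 'yes' allows the irreversible merge."""
    return answer == "yes"

skipped = None
reconciliation_choice = resolve_reconciliation(skipped)
merge_allowed = may_merge(skipped)
```

The asymmetry is the point: a default that preserves the user's edits is safe to apply unattended, whereas a default that merges is not.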
…view independence

A live end-to-end run reached 327K tokens in one window with zero
compactions: the whole AWOS chain expanded inline, 28 subagent reports
accumulated, and review findings were triaged by the same context that
implemented the change — at peak degradation. Make the generated command
manage its window by construction:

- delivery-flow-template: new §8 Context Strategy (mode, subagent/main
  stage split recorded with reasons, flow-log pointer); Local
  Customizations renumbered to §9. §6 trigger setup notes record the
  operator-owned prerequisites for unattended headless chaining.
- implement-ticket-template: fixed-prose Context Discipline — isolatable
  stages run in subagents (Skill tool, terse reports: paths, verdicts,
  counts), every stage appends to context/spec/{SPEC_NAME}/flow-log.md
  (the flow's memory outside the window; fresh sessions resume from it),
  nested headless sessions are forbidden inside the command. Stage
  assignments are criteria-phrased per §8, not hardcoded per command.
  Review stage gains three independence rules as fixed prose: the
  reviewer prompt is fixed at generation time (no run-time focus areas
  from the author), the reviewer writes the review file itself, the
  fix agent reads file + diff fresh.
- flow.md: context strategy is derived, not asked — two criteria decide
  the subagent/main split at generation time; unattended operation is a
  stated core goal, enabled by the flow log. Step 6 writes the reviewer
  prompt out in full from §4.
- lint tests: assert the Context Strategy section, the flow-log
  contract, the nested-headless prohibition, and the review-independence
  rule.
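
The flow-log contract can be illustrated with a short sketch. The log path `context/spec/{SPEC_NAME}/flow-log.md` comes from the commit message; the line format (`- stage: report`), the spec name, and the helper names are assumptions made for the example, and the stage list is limited to the AWOS chain named in the PR description.

```python
import tempfile
from pathlib import Path

# AWOS chain from the PR description; team-specific delivery stages are omitted here.
STAGES = ["spec", "tech", "tasks", "implement", "verify"]

def log_path(root, spec_name):
    return Path(root) / "context" / "spec" / spec_name / "flow-log.md"

def append_stage(root, spec_name, stage, report):
    """Append one terse line per completed stage (hypothetical line format)."""
    path = log_path(root, spec_name)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as log:
        log.write(f"- {stage}: {report}\n")

def completed_stages(path):
    lines = path.read_text(encoding="utf-8").splitlines()
    return [line[2:].split(":", 1)[0] for line in lines if line.startswith("- ")]

def next_stage(root, spec_name):
    """Return the first stage not yet logged, or None when the chain is complete."""
    path = log_path(root, spec_name)
    done = completed_stages(path) if path.exists() else []
    remaining = [stage for stage in STAGES if stage not in done]
    return remaining[0] if remaining else None

root = tempfile.mkdtemp()
fresh = next_stage(root, "001-demo")
append_stage(root, "001-demo", "spec", "functional-spec.md written")
append_stage(root, "001-demo", "tech", "technical-considerations.md written")
resumed = next_stage(root, "001-demo")
```

Because the log lives on disk rather than in the context window, a fresh session can call something like `next_stage` and continue without replaying earlier stages.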
…er question, settled vocabulary, review bots

Four failures observed in real /awos:flow sessions:

- A single-option AskUserQuestion call crashes (schema requires 2-4
  options): the docs question said 'one listed option suffices' and the
  retry invented a redundant 'Yes — more exists' filler. The question
  now pairs 'No — that's everything' with the genuinely different
  'Don't rely on the found docs — interview me from scratch', and the
  INTERACTION section states the 2-4 bound (confirmations pair the
  action with its refusal).
- Combinable choices were asked as forced single picks, and the
  post-merge options fused two axes (merge cut-point x local deploy),
  making combinations like 'fix forward + reinstall' unselectable. New
  rules: combinable answers are one multiSelect question; one question
  decides one axis — an orthogonal step (a local deploy) is its own
  question, asked as when/whether to run it.
- Options used concepts the decisions ruled out ('transition the
  ticket' on a ticketless project) and unexplained shorthand ('per-run
  confirm'); 'how far' questions mixed event names. New rules: settled
  project vocabulary only; cut-point options anchored to the same
  concrete events, earliest stop to full automation. The trigger
  option now says what the user gets ('get setup suggestions') and the
  summary must deliver the concrete configuration.
- One generated option moved /awos:verify after merge; the chain order
  is now stated as not-a-dimension (verify precedes commit/push/PR) and
  the template's canonical stage order is a lint-enforced contract.
- Automatic reviewers (CodeRabbit-style bots) were never asked about:
  Step 2 detects them (config files, bot reviews on recent change
  requests), dimension 4 gates on them, both templates carry the slot.
  Definition of Done de-ticketed for ticketless sources.
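
The first failure above (a single-option call crashing) is the kind of thing a pre-flight check could catch. The sketch below assumes a question is a plain dict with an `options` list; that shape and the option labels are hypothetical, and only the 2-4 bound comes from the commit message.

```python
def validate_question(question):
    """Return a list of problems; an empty list means the question passes the check."""
    problems = []
    options = question.get("options", [])
    if not 2 <= len(options) <= 4:
        problems.append(f"expected 2-4 options, got {len(options)}")
    if len(set(options)) != len(options):
        problems.append("duplicate options")
    return problems

# Hypothetical labels standing in for the docs question before and after the fix.
single_option = {"question": "Are there more process docs?", "options": ["no-more-docs"]}
paired = {
    "question": "Are there more process docs?",
    "options": ["no-more-docs", "interview-from-scratch"],
}
single_problems = validate_question(single_option)
paired_problems = validate_question(paired)
```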
… gates, model tiers, gate placement

Wall-time and cost levers for the generated /implement-ticket, from a
3h41m observed run and the code-review plugin's patterns:

- Stage restructure: verify → local-review → commit-push → remote-gates
  → merge. Local AI review runs before anything is pushed, so CI minutes
  are spent only on reviewed code; the onex-style ordering survives as
  the §4 change-request-timing decision (open the PR first → review runs
  concurrently with remote gates, one extra CI run on unreviewed code).
  remote-gates waits on CI, the bot reviewer, and human review
  concurrently and joins them before merge; polling intervals match the
  typical pipeline duration Step 2 now records.
- Approval gates are a §4 interview decision with the trade-off spelled
  out: gates after spec and after tech (most reliable — the functional
  spec is the hardest artifact to get right first-pass, and an error
  caught there never propagates) vs one gate after both (faster, risk of
  reworking two documents) vs none (unattended, pre-written specs).
  tasks.md gets no gate by default — commands/tasks.md writes without
  approval per #132, so /awos:implement starts right after /awos:tasks;
  the confirmation summary must state this default.
- Speed discipline: fast-tier preflight in resume-detection (ticket
  already delivered → stop), the no-exploratory-calls rule passed to
  every subagent, and a model tier per delegated stage recorded in §8 as
  tiers-with-reasons (fast for mechanical transport work, strongest for
  judgment — never model names, names drift).
- Lint: stage markers renamed (local-review, remote-gates), canonical
  order enforces that verify and local-review precede commit-push.
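
The canonical-order check can be restated as a small sketch. The project's real lint tests live in tests/lint-prompts.test.js and are not shown in this conversation, so the code below is an independent Python illustration that only borrows the marker syntax and the stage order quoted above.

```python
import re

MARKER = re.compile(r"<!-- awos:flow:stage=([a-z0-9-]+) -->")
CANONICAL = ["verify", "local-review", "commit-push", "remote-gates", "merge"]

def stage_order_ok(template_text):
    """True when the canonical stages that appear do so in canonical order."""
    found = [stage for stage in MARKER.findall(template_text) if stage in CANONICAL]
    expected = [stage for stage in CANONICAL if stage in found]
    return found == expected

def render(stages):
    return "\n".join(f"<!-- awos:flow:stage={stage} -->\n..." for stage in stages)

good = stage_order_ok(render(CANONICAL))
pushed_before_review = stage_order_ok(
    render(["verify", "commit-push", "local-review", "remote-gates", "merge"])
)
```

Stages outside the canonical list are ignored by this sketch, so team-specific stages could sit between the canonical ones without failing the check.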

Out of scope (own PR): parallel independent slices in /awos:tasks +
/awos:implement — the largest remaining wall-time lever.
…aiting

- Conflicts with the target branch are checked twice: remote-gates
  opens with a fetch + dry-run merge/rebase before the change request
  exists, and the merge stage re-checks mergeability after gates pass —
  other actors move the target while gates run. On conflict: a subagent
  resolves (non-trivial resolutions confirmed with the user), local
  gates re-run, push, and the remote gates run again on the new commit.
  Policy lives in §2 (rebase vs merge-in); when investigation finds an
  existing branch-sync/git-flow command or skill, the generated stage
  reuses it instead of inlining the recipe.
- Remote gates wait via the Monitor tool, never foreground sleep loops:
  the poll loop emits each gate's terminal result and exits when all
  settle, timeout sized to the typical pipeline duration recorded in
  §4, 30s+ intervals against remote APIs, and the filter covers every
  terminal state — a monitor that only greps the success marker stays
  silent through a failed run.
- Lint: assert the Monitor rule and both conflict checkpoints.
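
The waiting rule for remote gates can be sketched as a plain poll loop. This is not the Monitor tool itself, only an illustration of the logic the commit describes: emit each gate's terminal result, exit when all gates settle, respect a timeout, and treat every terminal state (not just success) as settling. The state names, gate names, and the 1800-second timeout are assumptions.

```python
import time

# Hypothetical terminal states; a filter matching only "success" would miss failures.
TERMINAL = {"success", "failure", "cancelled", "timed_out"}

def wait_for_gates(fetch_status, gates, interval=30, timeout=1800,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until every gate reaches a terminal state or the timeout elapses."""
    deadline = clock() + timeout
    settled = {}
    while True:
        for gate in gates:
            if gate not in settled:
                state = fetch_status(gate)
                if state in TERMINAL:
                    settled[gate] = state
                    print(f"{gate}: {state}")  # emit each terminal result once
        if len(settled) == len(gates):
            return settled
        if clock() >= deadline:
            pending = {gate: "timed_out" for gate in gates if gate not in settled}
            return {**settled, **pending}
        sleep(interval)

# Simulated gates: CI fails on the second poll, the bot review passes immediately.
responses = {"ci": iter(["pending", "failure"]), "bot-review": iter(["success"])}
outcome = wait_for_gates(
    lambda gate: next(responses[gate]),
    ["ci", "bot-review"],
    sleep=lambda seconds: None,  # skip real waiting in the example
)
```

A failed CI run ends the wait just as a passing one does, which is the behaviour the commit asks for when it says the filter must cover every terminal state.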
@AlexanderMakarov AlexanderMakarov marked this pull request as ready for review June 11, 2026 17:25
Aleksandr Makarov added 2 commits June 15, 2026 17:01
…llision detection

From the SDLC investigation of two AWOS-adopting repos (Barley, HOPS) —
both validated the design and surfaced three cross-project gaps. Kept
general; those repos are one direction, not the whole variety:

- Existing project automation is discovered (Step 2 now scans
  .claude/skills/ and commands overlapping a stage) and evaluated, not
  adopted on sight. New Step 4.5 compares each against the generated
  stage and reaches reuse/replace/compose — a team's command may be
  thin, stale, or miss what the generated stage handles (review
  independence, merge confirmation, resume-detection). Close calls go to
  the user as a data-supported question with the evidence in it; a
  skipped question keeps the team's automation untouched. Step 6 wires a
  reused command in by name; the Tooling Inventory records the per-stage
  decision so re-runs don't regenerate over it.
- Notifications is the seventh dimension (decision-record §9 + a
  generated-command standing rule): the flow announces transitions
  (spec ready, CR opened, merged, deployed, blocked) to a team channel
  so awareness survives gate removal — human on the loop, not in it.
- Collision detection: a command already driving a large span of the
  flow is surfaced before generating, not silently competed with.

Local Customizations renumbered §9→§10 (§1–§8 untouched). Five new lint
assertions; 83 pass.

@workshur workshur left a comment

Member

The idea is good, but I’m not sure we should pull this into the main AWOS flow right away.

Project flows can differ a lot, so I’d first test this properly. That said, I think we could add the command to the plugin after a small round of refinement.

  1. I don’t really like the command name /implement-ticket. I don’t like the mention of a ticket in general. Maybe this is more about implementing a feature, and it shouldn’t really matter where the requirements come from.

  2. I’d like to see this more as an assistant that helps set up the flow and points out things that could still be automated.

  3. Some steps, like merge, require special attention.

- **Warn:** One artifact exists without the other, or the decision record is visibly stale — recommend re-running `/awos:flow`
- **Fail:** Neither exists — recommend running `/awos:flow` to generate the delivery flow
- **Skip-When:** Spec-driven-development artifact shows the project does not use AWOS (no `context/` directory)
- **Severity:** medium

Member

I'm not convinced that enforcing such strict end-to-end flow checks is necessary here. Automation can boost the rating, but the rating shouldn't be calculated from AWOS artifacts. Instead, we should gather evidence from indirect sources such as hooks, review commands, and CI.

Contributor Author

Agreed — removed E2E-06. Scoring delivery automation off the presence of AWOS's own generated files is marking our own homework, and it unfairly fails projects that automate via CI / hooks / review commands. That scoring belongs in the sdlc-automation dimension (on feat/sdlc-automation-audit), whose SDLC-01/02 already measure it from indirect evidence; end-to-end-delivery stays focused on vertical-slice delivery.

Aleksandr Makarov added 2 commits June 16, 2026 19:49
…re rename, drop E2E-06

Addresses workshur's review on #134, which asked to keep /awos:flow out of the
main AWOS flow for now, deliver it via the plugin after refinement, drop the
"ticket" framing, and treat the command as a setup assistant rather than a
rigid end-to-end driver.

- Audit: remove the E2E-06 check from end-to-end-delivery.md (and its lint
  test). Scoring delivery automation by the presence of AWOS's own generated
  artifacts is circular; automation-coverage scoring belongs to the
  sdlc-automation dimension, which measures it from indirect evidence
  (CI, hooks, review commands).
- Setup chain: un-wire /awos:flow from commands/hire.md and
  commands/architecture.md so it is no longer part of the run-once setup chain.
- Rename: the generated command /implement-ticket -> /implement-feature, and
  reframe intake as "a feature, wherever the requirements come from." Rename
  templates/implement-ticket-template.md -> implement-feature-template.md.
  Stage-marker slugs (awos:flow:stage=...) are unchanged — they name the
  generator, not the generated command.
- Posture: reframe flow.md as a delivery-flow setup assistant that also reports
  which steps stay manual / could still be automated, instead of implying full
  automation.
- Plugin delivery: move the command from the core installer
  (commands/flow.md + claude/commands/flow.md wrapper) to a plugin command at
  plugins/awos/commands/flow.md, so it is opt-in via the marketplace. Its
  generation templates stay installer-delivered in .awos/templates/, consistent
  with the .awos/commands and context/ artifacts the command already operates
  on. No migration needed: /awos:flow only landed on this branch and was never
  released. Widen the plugin/marketplace descriptions accordingly.

Tests updated to match (plugin command path, renamed template, plugin-provided
command cross-reference). Full suite green (81 pass), prettier clean.
… self-contained

The prior commit delivered /awos:flow as a plugin command but left its two
generation templates on the installer channel (.awos/templates/). Because the
plugin and the AWOS installer are independent delivery channels, a project with
the updated plugin but an older (or un-refreshed) .awos/ install — e.g.
sde-automation, whose .awos/templates/ predates the flow feature — runs the
command and finds no scaffolds:

  "The .awos/templates/delivery-flow-template.md and
   implement-feature-template.md scaffolds aren't installed."

The templates are the command's own private scaffolds, not project artifacts it
operates on, so they belong with the command. Move them into the plugin
(plugins/awos/templates/) and reference them via ${CLAUDE_PLUGIN_ROOT}/templates/
instead of .awos/templates/. The plugin is now self-contained: installing or
updating it ships the command and its scaffolds together, with no installer
dependency. Generated outputs (context/product/delivery-flow.md,
.claude/commands/implement-feature.md) are unchanged.

Tests updated (plugin templates dir + required-ref paths). Suite green (81),
prettier clean.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/awos/templates/implement-feature-template.md (1)

43-48: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Make rerun precedence explicit when the log and artifacts disagree.

The resume step reads flow-log.md first, then scans context/spec/, but it never says which source wins after a partial/manual rerun. That can resume from the wrong stage or skip one entirely.

♻️ Proposed fix
-Start with a cheap preflight on the fast model tier (per §8): is this feature already delivered — a merged change request, a recorded Done? If so, report that and stop. Then: if `context/spec/{SPEC_NAME}/flow-log.md` exists, read it first — it names the last completed stage; resume from the next one. [Per §1: if a spec directory for this feature may already exist under `context/spec/`, inspect it and resume from the first missing artifact — skip `/awos:spec` if `functional-spec.md` exists, skip `/awos:tech` if `technical-considerations.md` exists, and so on. Omit the pre-written-spec handling if specs never arrive pre-written.]
+Start with a cheap preflight on the fast model tier (per §8): is this feature already delivered — a merged change request, a recorded Done? If so, report that and stop. Then: if `context/spec/{SPEC_NAME}/flow-log.md` exists, read it first, but treat the first missing on-disk artifact as the authoritative next stage when the log and tree disagree; repair the log to match before continuing. [Per §1: if a spec directory for this feature may already exist under `context/spec/`, inspect it and resume from the first missing artifact — skip `/awos:spec` if `functional-spec.md` exists, skip `/awos:tech` if `technical-considerations.md` exists, and so on. Omit the pre-written-spec handling if specs never arrive pre-written.]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/awos/templates/implement-feature-template.md` around lines 43 - 48,
The resume detection logic in Step 2 has ambiguous precedence when flow-log.md
and the artifact scan of context/spec/ disagree about the completion stage.
Clarify which source takes precedence: explicitly state whether flow-log.md
(read first) takes priority over the artifact scan, or establish a clear
conflict resolution rule. Add explicit language in the bracketed reference to §1
(the artifact inspection part) that defines what happens if the flow-log stage
and the artifact scan stage differ, ensuring there is no possibility of resuming
from the wrong stage or skipping stages during partial or manual reruns.
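
The precedence rule in the proposed fix (the first missing on-disk artifact wins when the log and the tree disagree) can be sketched like this. The artifact-to-stage mapping uses the file names mentioned in this conversation; the function name and the returned "repair" flag are hypothetical.

```python
# Stage -> artifact, using the file names mentioned in the PR conversation.
ARTIFACTS = [
    ("spec", "functional-spec.md"),
    ("tech", "technical-considerations.md"),
    ("tasks", "tasks.md"),
]

def resume_stage(log_next, existing_files):
    """Return (stage_to_resume, log_needs_repair); the on-disk tree wins on disagreement."""
    tree_next = next(
        (stage for stage, name in ARTIFACTS if name not in existing_files), None
    )
    if tree_next is None:
        # Every spec artifact exists, so the log is trusted for the later stages.
        return log_next, False
    return tree_next, log_next != tree_next

# The log claims the specs are done, but technical-considerations.md is missing.
disagreement = resume_stage("implement", {"functional-spec.md", "tasks.md"})
agreement = resume_stage(
    "implement", {"functional-spec.md", "technical-considerations.md", "tasks.md"}
)
```

In the first case the flow would resume from `tech` and repair the log before continuing, rather than skipping straight to implementation.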
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@plugins/awos/templates/delivery-flow-template.md`:
- Around line 56-60: In the "6. Trigger" section under "Supported: manual", the
`/implement-feature` command example currently only mentions ticket IDs or links
as input, but the command contract actually accepts file paths as well. Update
the manual trigger example in the supported line to reflect all accepted input
shapes, including file paths alongside ticket IDs and links, to accurately
represent what the command accepts and prevent confusion for users running
local-file or spec-driven scenarios.

📥 Commits

Reviewing files that changed from the base of the PR and between b86d9d4 and 6a99e14.

📒 Files selected for processing (7)
  • .claude-plugin/marketplace.json
  • commands/architecture.md
  • plugins/awos/.claude-plugin/plugin.json
  • plugins/awos/commands/flow.md
  • plugins/awos/templates/delivery-flow-template.md
  • plugins/awos/templates/implement-feature-template.md
  • tests/lint-prompts.test.js
💤 Files with no reviewable changes (1)
  • commands/architecture.md
✅ Files skipped from review due to trivial changes (1)
  • .claude-plugin/marketplace.json
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/lint-prompts.test.js

Comment thread plugins/awos/templates/delivery-flow-template.md
Aleksandr Makarov added 2 commits June 17, 2026 16:03
…ture, review-path

- architecture.md: revert the stray wording change so the file is net-zero vs
  main (W1 only needed to drop the /awos:flow bullet, not reword the next-step
  line; "after that" was vaguer than the original "after /awos:hire").
- Plugin/marketplace descriptions: explain that the plugin extends AWOS core
  with AI-powered delivery automation — it audits AI-readiness
  (/awos:ai-readiness-audit) and sets up an end-to-end delivery flow (/awos:flow).
- flow.md Step 4: break the two heaviest dimensions (Review requirements,
  Delivery requirements) from dense run-on paragraphs into a lead sentence plus
  labelled sub-bullets, so the interview reads as discrete decisions.
- implement-feature-template.md local-review stage: report the review file's
  PATH to the user (cheap, lets them open the full review) while still keeping
  the review body out of the context window. Closes the gap where a run
  surfaced verdict+count but never told the user where the review lived.
…precedence

Addresses CodeRabbit review on #134.

- delivery-flow-template.md §6 Trigger: the manual example narrowed
  /implement-feature to <ticket-id-or-link>, but the command accepts file paths
  and pre-written specs too. Widen both the supported and headless examples to
  "<feature — ticket ID, link, or file path>", matching the command contract.
- implement-feature-template.md Step 2 (resume detection): make the precedence
  between flow-log.md and the context/spec/ artifact scan explicit. For the
  spec-generation stages the on-disk artifacts are authoritative when they
  disagree with a stale log (manual/partial reruns), and the log is repaired to
  match; past spec generation there is no such artifact, so the log stays the
  only resume signal. (Scoped deliberately — the reviewer's blanket
  "first missing artifact wins" would break resume for implement/verify/merge,
  which have no context/spec artifact.)
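
The precedence rule above can be sketched roughly as follows. This is a minimal Python illustration, not the template's actual wording: the stage names follow the spec → tech → tasks → implement → verify → merge order mentioned in this PR, while the function name `resume_point` and the idea of representing on-disk artifacts as a set of stage names are assumptions made for the example.

```python
from __future__ import annotations

# Stages assumed to leave an artifact under context/spec/.
SPEC_STAGES = ["spec", "tech", "tasks"]
# Stages with no context/spec artifact; only flow-log.md records them.
LATER_STAGES = ["implement", "verify", "merge"]
ALL_STAGES = SPEC_STAGES + LATER_STAGES


def resume_point(log_last_done: str | None, artifacts: set[str]) -> tuple[str | None, bool]:
    """Return (stage to resume from, whether flow-log.md needs repair).

    log_last_done: last stage the log marks complete, or None for an empty log.
    artifacts: spec-generation stages whose artifact exists on disk.
    """
    log_next = 0 if log_last_done is None else ALL_STAGES.index(log_last_done) + 1
    first_missing = next((s for s in SPEC_STAGES if s not in artifacts), None)

    if first_missing is not None:
        # Inside spec generation the on-disk artifacts are authoritative.
        return first_missing, log_next != ALL_STAGES.index(first_missing)

    if log_next < len(SPEC_STAGES):
        # All spec artifacts exist but the log lags behind: repair it.
        return LATER_STAGES[0], True

    # Past spec generation the log is the only resume signal.
    if log_next >= len(ALL_STAGES):
        return None, False
    return ALL_STAGES[log_next], False
```

For example, `resume_point("tasks", {"spec"})` resumes from `tech` and flags the log for repair, while `resume_point("verify", {"spec", "tech", "tasks"})` trusts the log and resumes from `merge`.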
@AlexanderMakarov

Contributor Author

@workshur — thanks, agreed on the direction; reworked the PR to match.

  • Main flow / plugin (A0): un-wired /awos:flow from the /awos:hire / /awos:architecture setup chain and moved the command into the plugin (plugins/awos/commands/flow.md), opt-in via the marketplace rather than part of the run-once chain. Its templates ship bundled in the plugin too, so it's self-contained — no installer dependency, and no migration (it only landed on this branch).
  • Naming (A1): generated command renamed /implement-ticket → /implement-feature; intake reframed as "a feature, wherever the requirements come from."
  • Posture (A2): reframed from a rigid end-to-end driver to a setup assistant that flags which steps stay manual / could still be automated, instead of asserting full automation.
  • Merge (A3): kept the per-run merge confirmation and the double target-branch conflict check; merge is called out as a special-attention step.
  • E2E-06: removed (see the inline thread) — automation-coverage belongs in the sdlc-automation dimension.

Left the threads open for you to confirm.

@AlexanderMakarov AlexanderMakarov changed the title from "feat: /awos:flow — generates the project's /implement-ticket delivery flow" to "feat: /awos:flow — generates the project's /implement-feature delivery flow" Jun 17, 2026
@AlexanderMakarov

Contributor Author

Rewrote it as a plugin; it still runs the same way, via /awos:flow. It now generates the /implement-feature command. Also added a few minor corrections.

Comparison of the last session from #134 (comment) with the new ones (use https://jsonl.qent.io/ or a similar tool to inspect the JSONL CC sessions).

Metrics: token usage = input + output (non-cache, billable); context max = peak input + cache_creation + cache_read on a single request; agent working = wall-clock from the slash-command to last activity before the session idled.

| Command | CC sessions | Model | Token usage (in+out) | Agent working | Context max | Files | PR |
| --- | --- | --- | --- | --- | --- | --- | --- |
| /implement-ticket (last) | (no /awos:flow) /implement-ticket | claude-fable-5 | ~402k (in 199k + out 203k) | 2h11m | 258k | delivery-flow.md, implement-ticket.md | #9 +4,496 -163 |
| /implement-feature #1 | /awos:flow, /implement-ticket | claude-opus-4-8 | ~556k (in 140k + out 415k) | 1h25m | 319k | missed both files | #11 +3,632 -87 |
| /implement-feature #2 | /awos:flow, /implement-ticket | claude-opus-4-8 | ~672k (in 125k + out 548k) | 2h35m | 415k | implement-feature.md, delivery-flow.md | #12 +4,076 -112, includes few extra UI features and tried worktrees |

Notes:

  • /implement-feature runs on Opus 4.8 [1m]; the /implement-ticket baseline ran on Fable 5 — token/speed deltas are partly model-driven, not just flow changes.
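
For anyone reproducing the token and context numbers, the metric definitions above can be computed along these lines. This is a rough sketch, assuming each JSONL line is an object whose `message.usage` carries the usage field names used by the Anthropic API (`input_tokens`, `output_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`); the exact layout of CC session files is an assumption here, and `session_metrics` is a hypothetical helper name.

```python
import json


def session_metrics(jsonl_text):
    """Return (billable_tokens, context_max) for one session transcript."""
    billable = 0      # token usage = input + output (non-cache)
    context_max = 0   # peak input + cache_creation + cache_read on one request
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        message = record.get("message")
        # Skip records without a usage object (assumed: user turns, summaries).
        if not isinstance(message, dict) or not isinstance(message.get("usage"), dict):
            continue
        usage = message["usage"]
        inp = usage.get("input_tokens", 0)
        out = usage.get("output_tokens", 0)
        cache_new = usage.get("cache_creation_input_tokens", 0)
        cache_hit = usage.get("cache_read_input_tokens", 0)
        billable += inp + out
        context_max = max(context_max, inp + cache_new + cache_hit)
    return billable, context_max
```

Wall-clock "agent working" time is not covered by this sketch, since it depends on the timestamp fields of the session file.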

Aleksandr Makarov added 2 commits June 18, 2026 12:57
Runs kept presenting the review findings without the path to the review file,
even though the generated command already instructed the path three times — an
adherence problem, not a missing instruction: the model jumps to the findings
list and drops a trailing path mention.

Resolve it with determinism instead of more instruction:
- Name the reviewer's output path as a fixed convention,
  context/spec/{SPEC_NAME}/review.md (a reused review command may differ —
  capture whatever path it used), and have the subagent return it.
- Require the review presentation to LEAD with that path on its own line,
  before the verdict and findings. A fixed opening line is emitted reliably; a
  path appended after a long findings list gets dropped.
- Record the path in the stage's flow-log entry so it survives outside the chat.
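
The "lead with the path" rule could look something like this in practice. A minimal sketch only: the path convention context/spec/{SPEC_NAME}/review.md comes from the commit message above, while the function names and the exact label text of each line are assumptions made for illustration.

```python
def review_path_for(spec_name):
    # Fixed convention named in the commit message; a reused review
    # command may write elsewhere, in which case its path is used instead.
    return f"context/spec/{spec_name}/review.md"


def present_review(review_path, verdict, findings):
    """Build the review presentation with the path as the opening line."""
    lines = [
        f"Review file: {review_path}",  # always first, on its own line
        f"Verdict: {verdict}",
        f"Findings ({len(findings)}):",
    ]
    lines += [f"  - {finding}" for finding in findings]
    return "\n".join(lines)
```

Because the path is the fixed first line, a long findings list cannot push it out of the output.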
…idance

Address /awos:flow road-test friction (PR #134):
- Re-run now collects a granular per-dimension multiSelect (Step 1.3) and
  interviews only the chosen dimensions, bulk-confirming the rest — no more
  re-reviewing all seven with current-value defaults.
- De-steer the approval-gates question into an autonomy spectrum labeled by
  the pauses each option imposes; stop pre-marking the most-gated option.
- Make autonomy holistic: Step 4.5 treats a reused skill's per-run
  confirmations as a reuse decision (reuse-with-prompt vs. compose a
  non-interactive path that keeps validation), and Step 8 reports an
  Interaction budget enumerating every human-pause by source.
- Step 8 tells the user to commit the generated artifacts so the first run
  starts clean; the template's workspace stage flags uncommitted AWOS
  artifacts as an expected dirty-tree cause.
- Guard the generated header against nested stage-marker comments (the inner
  --> that closed an outer comment early).

Tests added for each behavior; existing flow contracts preserved.
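
The nested-comment guard from the last bullet can be illustrated with a small check like the one below. This is a sketch in Python rather than the project's actual lint test, and `find_nested_comment` is a hypothetical name; it reports the offset of any `<!--` that opens while an earlier comment is still open, which is the situation where the inner `-->` would close the outer comment early.

```python
def find_nested_comment(text):
    """Return the index of the first nested '<!--', or None if there is none."""
    inside = False
    i = 0
    while i < len(text):
        if text.startswith("<!--", i):
            if inside:
                return i  # a comment opened inside another comment
            inside = True
            i += 4
        elif text.startswith("-->", i):
            inside = False
            i += 3
        else:
            i += 1
    return None
```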
Aleksandr Makarov added 10 commits June 30, 2026 20:04
The Jira-URL class of bug: a reused skill hardcoded provectus.atlassian.net
while the instance was provectus-dev.atlassian.net; the dead link surfaced at
runtime and was mis-blamed on the generated command.

- Step 2 now captures canonical project config (Jira base URL, Slack channel
  + handles, code-host org/repo) and the format/lint gate scope.
- Step 4.5 scans a reused skill for hardcoded constants and reconciles them
  against the captured config, asking on a mismatch (fix the skill or record
  an override).
- delivery-flow-template.md gains a flow-agnostic Project Setup section so the
  sibling fix-bug flow reuses the same facts.
- Step 8 advises the project-side ignore fix for the repo-wide-lint case; no
  format pass is added inside the flow.
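
The Step 4.5 reconciliation can be pictured as the following check. It is only a sketch under stated assumptions: the two hostnames are the ones from the bug described above, but the regex, the function name `mismatched_jira_hosts`, and the restriction to Atlassian-hosted Jira URLs are illustrative choices, not what flow.md prescribes.

```python
import re
from urllib.parse import urlparse

# Matches hosts such as "provectus.atlassian.net", with or without a scheme.
ATLASSIAN_HOST = re.compile(r"\b([a-z0-9-]+\.atlassian\.net)\b", re.IGNORECASE)


def mismatched_jira_hosts(skill_text, canonical_base_url):
    """Return hardcoded Jira hosts in a skill that differ from the captured config."""
    canonical_host = (urlparse(canonical_base_url).hostname or "").lower()
    found = {host.lower() for host in ATLASSIAN_HOST.findall(skill_text)}
    return sorted(host for host in found if host != canonical_host)
```

A non-empty result is the point where the interview would ask whether to fix the skill or record an override.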
The generated /implement-feature runs a real local code-review gate, but the
close stage reported only the change-request link / merge commit / deploy
confirmation — the review verdict, finding count, and review-file path were
buried in the logs. Step 13 now reports them as part of the close evidence.
Teach /awos:flow to emit an optional second generated command, fix-bug.md,
beside implement-feature.md from the same flow-agnostic decision record. The
bug-fix flow is lighter — diagnose → fix → scoped re-verify → targeted spec
amendment — and closes the spec-drift loop a behavior-changing fix opens.

- commands/spec.md gains an Update Mode (mode detection + edit-in-place body),
  mirroring the Step 2A pattern in product/roadmap/architecture; it amends an
  existing spec in place, appends a dated ## Change Log entry, never allocates
  a new index, and leaves a Completed Status untouched.
- claude/commands/spec.md argument-hint signals the amend path.
- templates/functional-spec-template.md adds the canonical ## Change Log
  section — the amendment target.
- plugins/awos/templates/fix-bug-template.md (new): orchestrator-only skeleton
  with the canonical stage order fetch-bug → … → close-ticket, the classify
  gate (conformance vs. divergence), an amend-spec stage that invokes
  /awos:spec in update mode, skip-tests handling, and the local-review-in-
  report close treatment.
- flow.md adds the bug-fix opt-in interview decision, Step 6 generates
  fix-bug.md with the same reconcile-on-rerun behavior, and Step 8 reports it.
- delivery-flow-template.md gains a flow-agnostic Bug-fix Flow section.
- Bump plugin.json + marketplace.json to 2.3.0; README documents the flow.

Tests added for the spec Update Mode, the Change Log section, the fix-bug
template (stage order, classify gate, /awos:spec invocation, reuse), and the
flow wiring.
Generalize the bug-fix opt-in into a first-class "Command set & names"
decision: the team chooses which commands /awos:flow generates (feature flow,
bug-fix flow, or both) and the slash-name for each (defaults /implement-feature
and /fix-bug; renameable to e.g. /feature, /fix). The chosen names and
filenames are recorded in a new Generated Commands field of delivery-flow.md so
re-runs reconcile the right files. flow.md keeps the default paths so existing
contracts hold; Step 6 writes to the recorded filename.

From the Everclear road-test (Eugene Zaychenko named his commands
/everclear:workflow and /everclear:fix).
The generated fix-bug command's fetch-bug stage gains a crash-report path
(Crashlytics, Sentry, …): fetch the issue + recent events via the §7 transport,
map app-frames to local file:line, refuse to invent line numbers on an
unsymbolicated build, capture impact, and prefer a source-typed branch name.
The close stage may write an investigation note back to the issue without
auto-closing it. The bug source is interviewed as part of the bug-fix policy.

From Eugene Zaychenko's /everclear:fix.
C3 — record a ticket-state transition map (events → tracker states) across the
whole cycle, including the failure path (a failed gate/review sends the ticket
back to a needs-work state), not just the closing DoD transition. Broaden §4's
automatic-reviewer gate to cover a project-built CI AI reviewer (a GitHub
Action calling Claude), whose pass/fail can drive the transition. Both
generated templates transition the ticket on the mapped remote-gate events.
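
A ticket-state transition map of the kind C3 describes might be recorded like this. Every event name and tracker state below is hypothetical, since the real ones come from the team's interview answers; the sketch only shows that the failure path is mapped alongside the closing transition.

```python
# Hypothetical events -> tracker states; real values are per project.
TRANSITIONS = {
    "change_request_opened": "In Review",
    "ai_review_failed": "Needs Work",   # failure path: send the ticket back
    "ci_gate_failed": "Needs Work",
    "ai_review_passed": "Ready to Merge",
    "merged": "Done",                   # closing DoD transition
}


def target_state(event):
    """Return the tracker state for a remote-gate event, or None if unmapped."""
    return TRANSITIONS.get(event)
```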

C6 — settle a max-wait & escalation policy for remote-gate waits (auto-relaunch
on poll-window expiry, ask the human past a threshold); the generated Monitor
loop sizes its timeout to it instead of waiting forever.
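
The max-wait policy in C6 reduces to a small decision function. A minimal sketch, assuming two thresholds in seconds (a poll window and an ask-the-human limit); the parameter names and the three action labels are invented for the example.

```python
def wait_action(total_elapsed_s, window_elapsed_s, poll_window_s, ask_human_after_s):
    """Decide what a remote-gate wait should do next."""
    if total_elapsed_s >= ask_human_after_s:
        return "ask-human"      # past the threshold: stop waiting silently
    if window_elapsed_s >= poll_window_s:
        return "relaunch"       # poll window expired: start a fresh one
    return "keep-polling"
```

For example, with a 10-minute poll window and a 1-hour threshold, `wait_action(700, 610, 600, 3600)` relaunches the monitor, and `wait_action(3700, 100, 600, 3600)` escalates to the human.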

From Eugene Zaychenko's AI-code-review GitHub Action and CI-polling rules.
C4 — the Step 2 tooling inventory records the project's build/verify toolchain
as a transport (iOS XcodeBuildMCP/xcodebuild/simulator, Android Gradle/emulator,
…), not just web browser automation; the verify and CI-fix stages drive it.

C5 — the generated feature command resumes from the next incomplete roadmap
item when invoked with no input (like /awos:spec). Both generated templates'
resume-detection now check completion status across every source §1 records
(tickets can live in several places) and stop when the ticket is already
Done/closed or the owning spec is already Completed — no re-implementing
delivered work. §1 captures the tracker's done-state names for this.

From Eugene Zaychenko's no-arg roadmap pickup and build_sim verification.
…ser to run it

A road-test of the generated /fix-bug paused at verify-touched-criteria
and handed the user a manual `! make run` for the live check, because
the workspace shared-resource guardrail reserved the app's port and the
verify guidance treated the AskUserQuestion handoff as a normal path.

Running the app to verify is the flow's job, not the user's:
- verify.md: a reserved shared resource is grounds to reclaim it / use
  an alternate port / drive the deploy, not to hand off a run command;
  manual confirmation is a last resort for when no agent-driven render
  path exists at all.
- fix-bug and implement-feature templates: verify stage references the
  §2/§3 sanctioned verification path and never defers a drivable
  criterion to a later manual deploy.
- flow.md: record a sanctioned verification path whenever a shared
  resource blocks, so generated commands self-verify.
- delivery-flow-template.md: new §2 field capturing that path.
- lint: new contract locking the guidance into all four prompt files.
…ommittable leftover

A road-test left context/spec/006-settings-page/flow-log.md dirty after
the run: the close stage appended an entry after the last commit, and
the PR was already merged, so the entry could never land in it.

The flow-log is appended after every stage by design, including stages
that run after the final commit — so tracking it naively always leaves a
trailing delta that can't reach a merged or in-review change request.

Keep the log committed, but bound when it is written:
- commit-push finalizes it — the stage writes its entry before staging so
  the log rides in that commit (its last committed state).
- Once the change request is opened or merged, the flow stops writing to
  the tracked log; late-stage progress goes to the user and §9
  notifications, and the remote stages resume from remote state (open/
  merged change request, ticket status), which resume-detection already
  reads.
- The close stage guarantees a clean working tree and surfaces any
  uncommitted flow artifact rather than leaving it behind.
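
The write boundary described above can be sketched as a routing rule. This is an illustration only: the change-request states `none`, `open`, and `merged`, and the helper name `record_progress`, are assumptions, with plain lists standing in for the tracked flow-log.md and the §9 notification channel.

```python
def record_progress(entry, change_request_state, flow_log, notifications):
    """Route a stage entry to the tracked log or to notifications.

    change_request_state: "none" before the change request exists,
    otherwise "open" or "merged" (hypothetical labels).
    """
    if change_request_state == "none":
        flow_log.append(entry)        # still rides in the next commit
        return "flow-log"
    notifications.append(entry)       # the tracked log stays untouched
    return "notify"
```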

Baked into both flow templates, flow.md context-strategy, and the
decision record; new lint contract locks it in.
…eouts

The harness returns "No response after 60s" when an AskUserQuestion goes
unanswered — a guard so headless runs never hang. A road-test hit it in
an *interactive* session: the agent silently defaulted while the user was
still deciding. The 60s timer is the harness's and can't be changed by a
prompt, but the reaction is ours to shape — given a signal to branch on.

Add an explicit env-var contract instead of trying to auto-detect the
session type (both timeouts look identical otherwise):

- Unattended drivers (cron / /loop / claude -p) export AWOS_UNATTENDED=1;
  a no-answer is expected, so take the safe default and continue.
- Interactive runs leave it unset; a 60s timeout usually means the user
  is thinking or briefly away, so re-ask once, then proceed naming the
  default taken so it can be corrected.
- Either mode: a timeout never authorizes an irreversible step — the
  merge and divergence spec-amendment confirmations stay a no.
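
The three rules above can be condensed into one decision function. A minimal sketch: `AWOS_UNATTENDED=1` is the contract from this commit, while the function name, the action labels, and passing the environment as a plain mapping are assumptions made so the logic is easy to test.

```python
def on_question_timeout(env, already_reasked, irreversible):
    """Decide how to react when an AskUserQuestion goes unanswered."""
    if irreversible:
        # Merge and divergence spec-amendment confirmations stay a "no".
        return "decline"
    if env.get("AWOS_UNATTENDED") == "1":
        return "take-default"            # unattended: no answer is expected
    if not already_reasked:
        return "re-ask"                  # interactive: the user may be thinking
    return "take-default-and-name-it"    # proceed, naming the default taken
```

In a real run the mapping would be `os.environ`; a driver such as cron would export `AWOS_UNATTENDED=1` before launching the session.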

Baked into flow.md (INTERACTION + §6 trigger prerequisites), both flow
templates' Context Discipline, and the decision record's §6; new lint
contract locks it in.
@AlexanderMakarov AlexanderMakarov requested a review from dustyo-O July 2, 2026 14:44
# Conflicts:
#	tests/lint-prompts.test.js

2 participants