fix(session): harden teardown/restore safety + drop dead reaction flag#8
Merged
harshitsinghbhandari merged 1 commit intoMay 27, 2026
Conversation
Address PR #2 Copilot review comments on the merged LCM+SM lane: - session: validate runtime handle + workspace path before Kill/Cleanup teardown; refuse (ErrIncompleteTeardownMetadata) or skip rather than hand empty args to a real adapter's Destroy (unsafe delete). - session: reject Restore unless the session is terminal (ErrNotRestorable) so a live session can't spawn a duplicate runtime/workspace. - ports: document SpawnConfig.OpenTerminal as reserved/not yet honored. - lifecycle: remove the unread reactionConfig.auto field; note approved-and-green is notify-only (human decides to merge). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2 tasks
neversettle17-101
added a commit
that referenced
this pull request
Jun 12, 2026
…viewer to worker harness Per PR #197 review feedback: - Reviewer agent posts its review to the PR itself, so remove the ports.PRReviewPoster port, the GitHub review poster, the submit HTTP route + DTO, and the service Submit method (#1, #4, #7). - Trigger spawns the reviewer agent over the worker's worktree with its own review prompt, mirroring the session launch flow (resolve agent by harness -> argv -> runtime.Create) (#8, #9). - Default reviewer harness reuses the worker's harness when supported, falling back to claude-code; reviewer config stays independent of the worker override (#5, #6). - Drop the `ao review` CLI for this PR's scope (#2, #3). Regenerated OpenAPI + TS types. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Vaibhaav-Tiwari
pushed a commit
that referenced
this pull request
Jun 14, 2026
* feat(review): configurable AO code review backend (V1)
Add per-project configurable code review of a worker's PR. A reviewer
agent runs one-shot over the worker's own worktree and posts its result
to the PR; the worker picks the feedback up through the existing SCM
observer review-nudge path.
- domain: ProjectConfig.reviewers (+ default reviewer harness), Review /
ReviewRun types and verdict/status vocab.
- storage: review + review_run tables (0011), sqlc queries, store methods.
- service/review: rewrite the in-memory stub as a persisted ReviewService
(Trigger/Submit/List) with a reviewer Runner over agent resolver +
runtime; ports.PRReviewPoster implemented on the GitHub adapter.
- http: session-scoped routes POST /sessions/{id}/reviews/trigger,
POST .../submit, GET .../reviews; regenerated OpenAPI + TS types.
- cli: ao review trigger|submit|list.
- frontend: adapt ReviewDashboard to the per-worker reviews API.
Closes #192
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(review): address review — drop submit/poster/CLI, default reviewer to worker harness
Per PR #197 review feedback:
- Reviewer agent posts its review to the PR itself, so remove the
ports.PRReviewPoster port, the GitHub review poster, the submit HTTP
route + DTO, and the service Submit method (#1, #4, #7).
- Trigger spawns the reviewer agent over the worker's worktree with its
own review prompt, mirroring the session launch flow (resolve agent by
harness -> argv -> runtime.Create) (#8, #9).
- Default reviewer harness reuses the worker's harness when supported,
falling back to claude-code; reviewer config stays independent of the
worker override (#5, #6).
- Drop the `ao review` CLI for this PR's scope (#2, #3).
Regenerated OpenAPI + TS types.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(review): restore ao review submit (records verdict+body in AO)
Per maintainer request, bring back `ao review submit`. AO records the
reviewer's verdict and body on the review_run and marks the pass complete;
it does not post to GitHub — the reviewer agent posts its review to the PR
itself.
- storage: add review_run.body (0011), persist via Insert/UpdateReviewRunResult.
- service: restore Submit (no SCM poster) storing verdict + body.
- http: restore POST /sessions/{id}/reviews/submit + SubmitReviewInput.
- cli: ao review submit [worker] --verdict --body (worker from arg/--session/$AO_REVIEW_WORKER).
- runner: reviewer prompt instructs posting to GitHub and recording via ao review submit.
Regenerated OpenAPI + TS types.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(review): move reviewer runner to its own package; sharpen prompt
Per PR #197 review:
- Move the concrete reviewer runner out of the service layer into a new
internal/review_runner package (package reviewrunner), beside other
orchestration packages like session_manager. The service keeps only the
Runner interface + RunSpec it depends on; the agent-resolver + runtime
launch flow lives in review_runner.
- Sharpen the reviewer prompt: tell the agent to diff against the PR base,
focus on high-confidence findings, post via `gh pr review`, and record
the result with `ao review submit`; review-only (no commits/edits).
- Add unit tests for the runner.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(review): simplify review_run schema; provider-agnostic reviewer prompt
Per PR #197 review:
- review_run: status default 'running' (drop 'pending'), drop CHECK
constraints on status/verdict, drop the updated_at column and the
session/iteration index. Propagated through queries, domain, store,
service, and tests.
- Reviewer prompt no longer hardcodes GitHub/gh commands — it instructs the
agent to use whatever review tooling the provider offers, keeping the
flow extensible across SCM providers.
Regenerated sqlc + OpenAPI/TS.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(review): launch reviewer before persisting the run
Trigger now spawns the reviewer agent first and then writes the review_run
with a status derived from the launch outcome (running on success, failed
if it never started), instead of inserting a running row and correcting it
to failed afterwards.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(review): pluggable reviewer registry distinct from worker harnesses
Reviewers are now their own pluggable adapter set, separate from the worker
agent registry — adding a reviewer (claude-code today, greptile tomorrow) is
a one-line registration that does not widen the worker harness vocabulary,
and a worker harness does not automatically become a valid reviewer.
- domain.ReviewerHarness: a distinct vocabulary (AllReviewerHarnesses) with
its own IsKnown; ReviewerConfig/Review/ReviewRun use it. ResolveReviewerHarness
reuses the worker harness only when it is itself a supported reviewer, else
falls back to claude-code.
- ports.Reviewer: a reviewer-specific contract (ReviewCommand → argv + env)
that models one-shot / non-prompt CLIs natively instead of forcing every
reviewer through the worker's interactive GetLaunchCommand(Prompt:...).
- internal/adapters/reviewer: a separate registry + resolver (mirrors the
worker agent registry) with the claude-code reviewer adapter, which owns the
review prompt and reuses the worker claude-code launch construction.
- review_runner resolves via the reviewer registry (not the worker
AgentResolver) and merges AO_REVIEW_WORKER into the adapter's env.
- daemon wires the reviewer resolver. Registry/domain parity is test-enforced.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(review): cover run-scoped reviewer submit
* fix(api): update generated review submit schema
* refactor(review): split core engine (internal/review) from API service
Move the review orchestration (Trigger/Submit/List, run-id generation,
deps, RunSpec/Runner, sentinels) into a transport-independent core package
internal/review (Engine). internal/service/review is now a thin API-flow
boundary: the controller-facing Manager interface + a Service that delegates
to the engine + error re-exports.
This keeps the service layer to API concerns and lets the same engine back a
future in-process CLI trigger without going through HTTP. review_runner now
depends on the core package; daemon builds the engine and wraps it in the
service. No API/schema changes.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(review): commit-aware trigger, reviewer handle for UI, no env vars
Reworks the review trigger lifecycle and drops env-based coupling:
- review_run gains target_sha (the reviewed commit) and drops iteration.
A repeat trigger for the same PR head short-circuits to the existing run.
- review gains reviewer_handle_id: the live reviewer pane's runtime handle,
reused across passes and exposed in the reviews API so the UI can attach
its terminal over /mux.
- Trigger flow: if a live reviewer pane exists and a new commit arrived,
message it to re-review; otherwise spawn a fresh reviewer. The run is
recorded only after the reviewer is launched.
- No environment variables: the reviewer adapter embeds the explicit
`ao review submit --session <w> --run <id>` command in the spawn prompt
and the re-review message. CLI submit requires --run/--session (no env
fallbacks).
- Merge review_runner into internal/review as a Launcher (spawn/notify/alive).
- Trigger returns 201 for a new pass, 200 when reusing an existing run.
Regenerated sqlc + OpenAPI/TS.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(review): author the reviewer prompt centrally, not in the adapter
Mirror the worker model (session_manager builds the prompt; adapters just
place it via LaunchConfig.Prompt). The reviewer prompt now lives in
internal/review/prompt.go and is passed through ports.ReviewInvocation.Prompt;
the claude-code reviewer adapter just feeds inv.Prompt to its launch command
and returns it as the re-review message. One-shot CLI reviewers may ignore it.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(review): split reviewer prompt into system+task, mirroring buildSpawnTexts
Mirror session_manager.buildSpawnTexts for the reviewer: a standing role goes
in the system prompt, the per-pass task (PR/commit + exact `ao review submit`
command) goes in the user prompt. internal/review/prompt.go now returns
(prompt, systemPrompt); both flow through ports.ReviewInvocation and the
claude-code adapter places them via LaunchConfig{Prompt, SystemPrompt}. The
re-review message reuses the per-pass prompt (role already established in the
running pane).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Vaibhaav <user@example.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Addresses the 4 Copilot review comments on #2 (consolidated LCM+SM lane). Based on the integration branch
feat/lcm-sm-contracts— not main.Changes
Kill + Cleanup teardown validation (
internal/session/manager.go)Runtime.Destroy/Workspace.Destroy, validate the reconstructed handles. A corrupted/partially-seeded/orphaned record with an empty workspace path or runtime handle id no longer reaches a real adapter'sDestroy(which could be an unsafe delete on empty args).Killreturns the new typedErrIncompleteTeardownMetadataand records no intent / touches no adapter when handles are missing.Cleanupskips such records (surfaced inSkipped) instead of destroying with empty args; runtime best-effort destroy is gated on a valid handle.Restore guard (
internal/session/manager.go)Restorenow rejects a non-terminal session with the new typedErrNotRestorable, so a live session can't spin up a duplicate runtime/workspace for the same id and reopen its lifecycle.SpawnConfig.OpenTerminaldoc (internal/ports/inbound.go)Spawn(owned by a later lane). No terminal-opening wired (out of scope).Dead
autoflag removed (internal/lifecycle/reactions.go)reactionConfig.autowas never read, so approved-and-green (the onlyauto:falserow) already behaved correctly as a notify. Removed the field and all table usages; added a one-line note that approved-and-green is notify-only (human decides to merge).Tests
TestKill_IncompleteMetadata_RefusesTeardownTestCleanup_IncompleteMetadata_SkippedTestRestore_LiveSession_RejectedGates
gofmt -l .(clean),go build ./...,go vet ./...,go test -race ./...— all pass.🤖 Generated with Claude Code