feat(session): Session Manager — spawn/kill/list/get/send/restore/cleanup#7
Conversation
…store/cleanup) Implements ports.SessionManager against fakes for the outbound ports. The SM is the explicit-mutation half of the lane: it drives Runtime/Agent/Workspace, seeds the initial lifecycle, and routes outcomes to the LCM (OnSpawnCompleted / OnKillRequested). It never derives observed state and is the single producer of the derived display status (attached on read, never persisted). - Spawn: Workspace.Create -> Runtime.Create (AO_* identity env) -> Seed -> OnSpawnCompleted, with eager rollback of completed steps on failure. - Kill: OnKillRequested first -> Runtime.Destroy -> Workspace.Destroy, honoring the worktree-remove safety (refusal surfaced, never forced). - List/Get: derive status via DeriveLegacyStatus. Send: via AgentMessenger. Restore: re-seed (reopen) + relaunch via GetRestoreCommand. Cleanup: reclaim terminal sessions, skip worktrees holding uncommitted work. Store-contract additions (co-owned with Tom's persistence layer, flagged for review): LifecycleStore.Seed (explicit create-with-identity; OnSpawnCompleted requires a seeded record) and LifecycleStore.Get (single record-with-identity read; Load is lifecycle-only). Lifecycle test fake updated to satisfy both. Tests route through the real LCM Manager (wrapped to record call order). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Implements a new ports.SessionManager in backend/internal/session/ to own explicit session mutations (spawn/kill/send/restore/cleanup) while delegating canonical lifecycle transitions to the existing LCM, and deriving the display Status only on reads.
Changes:
- Added
session.Managerimplementing the SM pipelines for Spawn/Kill/List/Get/Send/Restore/Cleanup with rollback behavior on spawn failures. - Extended
ports.LifecycleStorewithSeed(create-with-identity) andGet(single record read) to support SM needs beyond lifecycle-onlyLoad. - Added an in-package fake harness and tests that exercise SM behavior against the real lifecycle manager plus faked outbound ports.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| backend/internal/session/manager.go | New Session Manager implementation wiring outbound ports, store seeding, LCM notifications, and read-model derivation. |
| backend/internal/session/manager_test.go | SM tests covering spawn/kill/list/get/send/restore/cleanup flows using a harness. |
| backend/internal/session/fakes_test.go | Fakes for store/runtime/agent/workspace/messenger + recording LCM wrapper to assert pipeline ordering. |
| backend/internal/ports/outbound.go | Adds Seed and Get to LifecycleStore contract with explanatory docs. |
| backend/internal/lifecycle/fakes_test.go | Updates lifecycle test fake store to satisfy the expanded LifecycleStore interface. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
harshitsinghbhandari
left a comment
There was a problem hiding this comment.
Review — Session Manager (reviewed via subagent + independent verification)
Verified in a throwaway worktree: gofmt clean, build/vet pass, go test -race green; internal/session at 67.5% coverage. Spec-faithful and the invariants hold — display status is derived only on read (toSession) and never persisted, kills route through LCM.OnKillRequested, and the tests exercise the real lifecycle.Manager (not a fake LCM) for the round-trips, which is exactly right. Two should-fix items before this is clean; the rest is minor/doc-level.
1. LifecycleStore.Seed + Get — clean ✅
Seed taking a full SessionRecord and rejecting an existing id (create-only), with Restore routed through PatchLifecycle instead, cleanly separates create from reopen and avoids a double-seed footgun. Get is not redundant with Load (lifecycle-only, hot path) or List (project-scoped) — the SM needs identity by id for teardown/read-model. Both are ordinary CRUD for a real store. Minor: a one-line doc noting Load = hot-path lifecycle-only vs Get = full record would help Tom.
2. Spawn rollback + orphan handling — logic right, path untested (should-fix)
Eager rollback is correct (late seed; each step unwinds the prior). The OnSpawnCompleted-failure orphan route (seed + runtime up, then route to terminal-errored via OnKillRequested(KillError) + teardown) is the right call — but there is no test for it, and it's the most subtle path flagged. Add a test (fake LCM whose OnSpawnCompleted errors) asserting terminal-errored + OnKillRequested(KillError) fired + runtime/workspace torn down. Also untested: workspace-create-failure rollback, and Seed duplicate-id rejection.
3. Kill ordering + worktree-remove refusal — correct & tested ✅
OnKillRequested → Runtime.Destroy → Workspace.Destroy, asserted by call-order indices; refusal surfaces as an error with WorkspaceFreed=false and the path is never force-deleted. Note: the spec's explicit "Agent stop" step is folded into Runtime.Destroy (agent runs inside the runtime) — defensible, but call it out in the PR body so the Agent port's stop semantics aren't assumed elsewhere. Minor: Kill reconstructs handles from metadata; a seeded-but-never-completed session yields empty handle/path to Destroy — should be a safe no-op but isn't guarded/tested.
4. Restore — no rollback → runtime leak + stranded state (should-fix)
Restore does Workspace.Restore → Runtime.Create → PatchLifecycle(reopen) → OnSpawnCompleted, but on a failure at the reopen or OnSpawnCompleted step it returns the error without destroying the runtime it just created → orphan runtime. Worse, if OnSpawnCompleted fails after the reopen patch landed, the record is left not_started (display: spawning) with a live runtime and no recovery path. Spawn handles exactly this; Restore should mirror it (roll back the runtime on reopen/OnSpawnCompleted failure). No Restore failure path is tested. The reopen itself (PR → none/cleared_on_restore, session → not_started/spawn_requested, via PatchLifecycle not Seed) is correct and tested on the happy path.
5. Tests against the real LCM — yes ✅
recordingLCM wraps the real lifecycle.New(...) and only logs call order, so OnSpawnCompleted/OnKillRequested run the real load→decide→persist pipeline and the kill test asserts the LCM produced terminated/manually_killed. Coverage gaps are the error paths above (orphan route, all Restore-failure paths) — that's what's holding 67.5%.
Minor
Sendhas no existence/terminal guard — forwards blindly toAgentMessenger. Probably fine (messenger owns busy-detect/verify), but confirm it's intentional.Cleanupdrops a runtime-destroy error silently even when the workspace is later reclaimed — fine for terminal sessions, undocumented.buildPromptis a documented stub for the 3-layer assembly — fine for this split.
Recommendation: Approve-with-follow-ups. Sound design, invariants hold, real-LCM testing is exactly right. Two before-clean items: (4) give Restore the same rollback Spawn has, and (2) test the OnSpawnCompleted-failure orphan route + Restore error paths.
…ntime on post-create failure (PR #7 review) - Restore now fails early with a clear error if MetaAgentSessionID is missing, rather than emitting an ambiguous "resume nothing" launch command (no stored prompt means a fresh-launch fallback isn't possible). - On a post-runtime-create failure (reopen patch or OnSpawnCompleted), best-effort destroy the newly created runtime (never the workspace, which holds prior work) so we don't strand a live process while parking the session terminal. - Added a test for Restore with a missing agent session id: errors early, touches no workspace/runtime, leaves the session terminal. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e rollback (review follow-ups) Adds an injectable OnSpawnCompleted failure to the recording LCM and two tests: - Spawn: when OnSpawnCompleted fails, the seeded record is parked terminal/errored (via OnKillRequested(KillError)) and runtime+workspace are torn down. - Restore: when OnSpawnCompleted fails post-create, the new runtime is destroyed while the workspace is preserved. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
harshitsinghbhandari
left a comment
There was a problem hiding this comment.
Re-review — 4c331d9
Verified in a throwaway worktree: gofmt clean, build/vet pass, go test -race green; internal/session coverage 67.5% → 72.7%. Both should-fix items resolved.
- #4 Restore rollback ✅ —
rollbackRuntime(handle)now fires on both the reopen-patch failure and theOnSpawnCompletedfailure, tearing down the freshly-created runtime while preserving the workspace (right call — it may hold uncommitted work). The early missing-agent-session-id guard is a good addition: it bails before any I/O instead of launching an ambiguous "resume nothing." - #2 tests ✅ —
TestSpawn_OnSpawnCompletedFailure_RoutesOrphanToErrored(seeded orphan → terminal/errored, since the store has no delete) andTestRestore_OnSpawnCompletedFailure_RollsBackRuntime(runtime destroyed, workspace preserved) cover exactly the previously-untested paths. - Minors — Send-delegates-to-messenger (by design), Cleanup best-effort runtime-destroy (documented), buildPrompt stub: all acknowledged, agreed as-is.
LGTM — approving. Clean work, and the real-LCM round-trip testing throughout this PR was the right call.
…viewer to worker harness Per PR #197 review feedback: - Reviewer agent posts its review to the PR itself, so remove the ports.PRReviewPoster port, the GitHub review poster, the submit HTTP route + DTO, and the service Submit method (#1, #4, #7). - Trigger spawns the reviewer agent over the worker's worktree with its own review prompt, mirroring the session launch flow (resolve agent by harness -> argv -> runtime.Create) (#8, #9). - Default reviewer harness reuses the worker's harness when supported, falling back to claude-code; reviewer config stays independent of the worker override (#5, #6). - Drop the `ao review` CLI for this PR's scope (#2, #3). Regenerated OpenAPI + TS types. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(review): configurable AO code review backend (V1)
Add per-project configurable code review of a worker's PR. A reviewer
agent runs one-shot over the worker's own worktree and posts its result
to the PR; the worker picks the feedback up through the existing SCM
observer review-nudge path.
- domain: ProjectConfig.reviewers (+ default reviewer harness), Review /
ReviewRun types and verdict/status vocab.
- storage: review + review_run tables (0011), sqlc queries, store methods.
- service/review: rewrite the in-memory stub as a persisted ReviewService
(Trigger/Submit/List) with a reviewer Runner over agent resolver +
runtime; ports.PRReviewPoster implemented on the GitHub adapter.
- http: session-scoped routes POST /sessions/{id}/reviews/trigger,
POST .../submit, GET .../reviews; regenerated OpenAPI + TS types.
- cli: ao review trigger|submit|list.
- frontend: adapt ReviewDashboard to the per-worker reviews API.
Closes #192
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(review): address review — drop submit/poster/CLI, default reviewer to worker harness
Per PR #197 review feedback:
- Reviewer agent posts its review to the PR itself, so remove the
ports.PRReviewPoster port, the GitHub review poster, the submit HTTP
route + DTO, and the service Submit method (#1, #4, #7).
- Trigger spawns the reviewer agent over the worker's worktree with its
own review prompt, mirroring the session launch flow (resolve agent by
harness -> argv -> runtime.Create) (#8, #9).
- Default reviewer harness reuses the worker's harness when supported,
falling back to claude-code; reviewer config stays independent of the
worker override (#5, #6).
- Drop the `ao review` CLI for this PR's scope (#2, #3).
Regenerated OpenAPI + TS types.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(review): restore ao review submit (records verdict+body in AO)
Per maintainer request, bring back `ao review submit`. AO records the
reviewer's verdict and body on the review_run and marks the pass complete;
it does not post to GitHub — the reviewer agent posts its review to the PR
itself.
- storage: add review_run.body (0011), persist via Insert/UpdateReviewRunResult.
- service: restore Submit (no SCM poster) storing verdict + body.
- http: restore POST /sessions/{id}/reviews/submit + SubmitReviewInput.
- cli: ao review submit [worker] --verdict --body (worker from arg/--session/$AO_REVIEW_WORKER).
- runner: reviewer prompt instructs posting to GitHub and recording via ao review submit.
Regenerated OpenAPI + TS types.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(review): move reviewer runner to its own package; sharpen prompt
Per PR #197 review:
- Move the concrete reviewer runner out of the service layer into a new
internal/review_runner package (package reviewrunner), beside other
orchestration packages like session_manager. The service keeps only the
Runner interface + RunSpec it depends on; the agent-resolver + runtime
launch flow lives in review_runner.
- Sharpen the reviewer prompt: tell the agent to diff against the PR base,
focus on high-confidence findings, post via `gh pr review`, and record
the result with `ao review submit`; review-only (no commits/edits).
- Add unit tests for the runner.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(review): simplify review_run schema; provider-agnostic reviewer prompt
Per PR #197 review:
- review_run: status default 'running' (drop 'pending'), drop CHECK
constraints on status/verdict, drop the updated_at column and the
session/iteration index. Propagated through queries, domain, store,
service, and tests.
- Reviewer prompt no longer hardcodes GitHub/gh commands — it instructs the
agent to use whatever review tooling the provider offers, keeping the
flow extensible across SCM providers.
Regenerated sqlc + OpenAPI/TS.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(review): launch reviewer before persisting the run
Trigger now spawns the reviewer agent first and then writes the review_run
with a status derived from the launch outcome (running on success, failed
if it never started), instead of inserting a running row and correcting it
to failed afterwards.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(review): pluggable reviewer registry distinct from worker harnesses
Reviewers are now their own pluggable adapter set, separate from the worker
agent registry — adding a reviewer (claude-code today, greptile tomorrow) is
a one-line registration that does not widen the worker harness vocabulary,
and a worker harness does not automatically become a valid reviewer.
- domain.ReviewerHarness: a distinct vocabulary (AllReviewerHarnesses) with
its own IsKnown; ReviewerConfig/Review/ReviewRun use it. ResolveReviewerHarness
reuses the worker harness only when it is itself a supported reviewer, else
falls back to claude-code.
- ports.Reviewer: a reviewer-specific contract (ReviewCommand → argv + env)
that models one-shot / non-prompt CLIs natively instead of forcing every
reviewer through the worker's interactive GetLaunchCommand(Prompt:...).
- internal/adapters/reviewer: a separate registry + resolver (mirrors the
worker agent registry) with the claude-code reviewer adapter, which owns the
review prompt and reuses the worker claude-code launch construction.
- review_runner resolves via the reviewer registry (not the worker
AgentResolver) and merges AO_REVIEW_WORKER into the adapter's env.
- daemon wires the reviewer resolver. Registry/domain parity is test-enforced.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(review): cover run-scoped reviewer submit
* fix(api): update generated review submit schema
* refactor(review): split core engine (internal/review) from API service
Move the review orchestration (Trigger/Submit/List, run-id generation,
deps, RunSpec/Runner, sentinels) into a transport-independent core package
internal/review (Engine). internal/service/review is now a thin API-flow
boundary: the controller-facing Manager interface + a Service that delegates
to the engine + error re-exports.
This keeps the service layer to API concerns and lets the same engine back a
future in-process CLI trigger without going through HTTP. review_runner now
depends on the core package; daemon builds the engine and wraps it in the
service. No API/schema changes.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(review): commit-aware trigger, reviewer handle for UI, no env vars
Reworks the review trigger lifecycle and drops env-based coupling:
- review_run gains target_sha (the reviewed commit) and drops iteration.
A repeat trigger for the same PR head short-circuits to the existing run.
- review gains reviewer_handle_id: the live reviewer pane's runtime handle,
reused across passes and exposed in the reviews API so the UI can attach
its terminal over /mux.
- Trigger flow: if a live reviewer pane exists and a new commit arrived,
message it to re-review; otherwise spawn a fresh reviewer. The run is
recorded only after the reviewer is launched.
- No environment variables: the reviewer adapter embeds the explicit
`ao review submit --session <w> --run <id>` command in the spawn prompt
and the re-review message. CLI submit requires --run/--session (no env
fallbacks).
- Merge review_runner into internal/review as a Launcher (spawn/notify/alive).
- Trigger returns 201 for a new pass, 200 when reusing an existing run.
Regenerated sqlc + OpenAPI/TS.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(review): author the reviewer prompt centrally, not in the adapter
Mirror the worker model (session_manager builds the prompt; adapters just
place it via LaunchConfig.Prompt). The reviewer prompt now lives in
internal/review/prompt.go and is passed through ports.ReviewInvocation.Prompt;
the claude-code reviewer adapter just feeds inv.Prompt to its launch command
and returns it as the re-review message. One-shot CLI reviewers may ignore it.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* refactor(review): split reviewer prompt into system+task, mirroring buildSpawnTexts
Mirror session_manager.buildSpawnTexts for the reviewer: a standing role goes
in the system prompt, the per-pass task (PR/commit + exact `ao review submit`
command) goes in the user prompt. internal/review/prompt.go now returns
(prompt, systemPrompt); both flow through ports.ReviewInvocation and the
claude-code adapter places them via LaunchConfig{Prompt, SystemPrompt}. The
re-review message reuses the per-pass prompt (role already established in the
running pane).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Vaibhaav <user@example.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Summary
Implements
ports.SessionManager(the explicit-mutation half of the LCM+SM lane) inbackend/internal/session/, against fakes for the outbound ports. The SM drives the Runtime/Agent/Workspace plugins, seeds the initial lifecycle record, and routes mutation outcomes to the LCM (OnSpawnCompleted/OnKillRequested). It never derives observed state and is the single producer of the derived display status (attached on read viaDeriveLegacyStatus, never persisted).Base is
feat/lcm-sm-contracts(the integration branch), notmain.Pipelines
Workspace.Create → Runtime.Create(envAO_SESSION_ID/AO_PROJECT_ID/AO_ISSUE_IDlayered over the agent env)→ Seed → LCM.OnSpawnCompleted. Eager rollback of completed steps on any failure (the record is seeded late, so an unseeded orphan would be invisible to Cleanup). IfOnSpawnCompletedfails after seeding, best-effort routes the orphan to errored viaOnKillRequested(KillError)since the store has no delete.LCM.OnKillRequestedfirst→ Runtime.Destroy → Workspace.Destroy, honoring the worktree-remove safety — a refusal (path still registered after prune ⇒ may hold uncommitted work) surfaces as an error withWorkspaceFreed:falseand is never forced.AgentMessenger.not_started/spawn_requested, PRcleared_on_restore) → relaunch viaAgent.GetRestoreCommand→OnSpawnCompleted.CleanupResult{Cleaned, Skipped}.OnSpawnCompletedrequires a pre-seeded record, butLifecycleStorehad no way to create one with identity, and single-id reads (Get/Kill) had no record-with-identity accessor (Loadis lifecycle-only). Added two methods, discussed and approved with the lane owner but flagged here since they touch the persistence contract:Seed(ctx, SessionRecord) error— explicit create-with-identity; errors if the id already exists (re-seed on Restore goes throughPatchLifecycle).Get(ctx, SessionID) (SessionRecord, bool, error)— single full record read (symmetric withList).The lifecycle package's test fake was updated to satisfy the wider interface; LCM tests remain green. No other LCM / decide-core / contract changes.
Out of scope / flags
preflightappears in the spec but no outbound port exposes it → treated as out of scope.Runtime.Destroy(the agent runs inside the runtime).Test plan
Tests route through the real LCM
Manager(wrapped to record call order):OnSpawnCompletedflips runtime→alive, AO_* env wired, handles in metadata, statusspawningOnSpawnCompletedOnKillRequested<Runtime.Destroy<Workspace.Destroy) + terminalkilledstatusWorkspaceFreed:false, not force-deleted)Getnot-foundGates:
gofmt -l .empty ·go build ./...·go vet ./...·go test -race ./...— all green.🤖 Generated with Claude Code