feat(session): Session Manager — spawn/kill/list/get/send/restore/cleanup by harshitsinghbhandari · Pull Request #7 · aoagents/ReverbCode

harshitsinghbhandari · 2026-05-27T08:14:08Z

Summary

Implements ports.SessionManager (the explicit-mutation half of the LCM+SM lane) in backend/internal/session/, against fakes for the outbound ports. The SM drives the Runtime/Agent/Workspace plugins, seeds the initial lifecycle record, and routes mutation outcomes to the LCM (OnSpawnCompleted / OnKillRequested). It never derives observed state and is the single producer of the derived display status (attached on read via DeriveLegacyStatus, never persisted).

Base is feat/lcm-sm-contracts (the integration branch), not main.

Pipelines

Spawn: Workspace.Create → Runtime.Create (env AO_SESSION_ID/AO_PROJECT_ID/AO_ISSUE_ID layered over the agent env) → Seed → LCM.OnSpawnCompleted. Eager rollback of completed steps on any failure (the record is seeded late, so an unseeded orphan would be invisible to Cleanup). If OnSpawnCompleted fails after seeding, best-effort routes the orphan to errored via OnKillRequested(KillError) since the store has no delete.
Kill: LCM.OnKillRequested first → Runtime.Destroy → Workspace.Destroy, honoring the worktree-remove safety — a refusal (path still registered after prune ⇒ may hold uncommitted work) surfaces as an error with WorkspaceFreed:false and is never forced.
List/Get: read records, attach derived status; single producer, never persisted.
Send: via AgentMessenger.
Restore: fallible I/O first (workspace restore + runtime create, never destroys the worktree on failure) → re-seed/reopen lifecycle (not_started/spawn_requested, PR cleared_on_restore) → relaunch via Agent.GetRestoreCommand → OnSpawnCompleted.
Cleanup: reclaims terminal sessions' workspaces, skips paths holding uncommitted work → CleanupResult{Cleaned, Skipped}.

⚠️ Store-contract additions (need @tom's review)

OnSpawnCompleted requires a pre-seeded record, but LifecycleStore had no way to create one with identity, and single-id reads (Get/Kill) had no record-with-identity accessor (Load is lifecycle-only). Added two methods, discussed and approved with the lane owner but flagged here since they touch the persistence contract:

Seed(ctx, SessionRecord) error — explicit create-with-identity; errors if the id already exists (re-seed on Restore goes through PatchLifecycle).
Get(ctx, SessionID) (SessionRecord, bool, error) — single full record read (symmetric with List).

The lifecycle package's test fake was updated to satisfy the wider interface; LCM tests remain green. No other LCM / decide-core / contract changes.

Out of scope / flags

preflight appears in the spec but no outbound port exposes it → treated as out of scope.
"Agent stop" in the kill pipeline has no port method → subsumed by Runtime.Destroy (the agent runs inside the runtime).
Real tmux/claude-code/worktree adapters, full 3-layer prompt assembly, orchestrator-as-session, real CDC — all out of scope per the lane plan.

Test plan

Tests route through the real LCM Manager (wrapped to record call order):

Spawn happy path — record seeded with identity, OnSpawnCompleted flips runtime→alive, AO_* env wired, handles in metadata, status spawning
Spawn runtime-create failure — eager workspace rollback, no record seeded, no OnSpawnCompleted
Kill ordering (OnKillRequested < Runtime.Destroy < Workspace.Destroy) + terminal killed status
Kill worktree-remove refusal surfaced (WorkspaceFreed:false, not force-deleted)
List/Get derive status (table) + Get not-found
Send routes to messenger
Restore relaunch via resume command + reopened lifecycle
Cleanup skips uncommitted-work worktrees, ignores live sessions

Gates: gofmt -l . empty · go build ./... · go vet ./... · go test -race ./... — all green.

🤖 Generated with Claude Code

…store/cleanup) Implements ports.SessionManager against fakes for the outbound ports. The SM is the explicit-mutation half of the lane: it drives Runtime/Agent/Workspace, seeds the initial lifecycle, and routes outcomes to the LCM (OnSpawnCompleted / OnKillRequested). It never derives observed state and is the single producer of the derived display status (attached on read, never persisted). - Spawn: Workspace.Create -> Runtime.Create (AO_* identity env) -> Seed -> OnSpawnCompleted, with eager rollback of completed steps on failure. - Kill: OnKillRequested first -> Runtime.Destroy -> Workspace.Destroy, honoring the worktree-remove safety (refusal surfaced, never forced). - List/Get: derive status via DeriveLegacyStatus. Send: via AgentMessenger. Restore: re-seed (reopen) + relaunch via GetRestoreCommand. Cleanup: reclaim terminal sessions, skip worktrees holding uncommitted work. Store-contract additions (co-owned with Tom's persistence layer, flagged for review): LifecycleStore.Seed (explicit create-with-identity; OnSpawnCompleted requires a seeded record) and LifecycleStore.Get (single record-with-identity read; Load is lifecycle-only). Lifecycle test fake updated to satisfy both. Tests route through the real LCM Manager (wrapped to record call order). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Implements a new ports.SessionManager in backend/internal/session/ to own explicit session mutations (spawn/kill/send/restore/cleanup) while delegating canonical lifecycle transitions to the existing LCM, and deriving the display Status only on reads.

Changes:

Added session.Manager implementing the SM pipelines for Spawn/Kill/List/Get/Send/Restore/Cleanup with rollback behavior on spawn failures.
Extended ports.LifecycleStore with Seed (create-with-identity) and Get (single record read) to support SM needs beyond lifecycle-only Load.
Added an in-package fake harness and tests that exercise SM behavior against the real lifecycle manager plus faked outbound ports.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
backend/internal/session/manager.go	New Session Manager implementation wiring outbound ports, store seeding, LCM notifications, and read-model derivation.
backend/internal/session/manager_test.go	SM tests covering spawn/kill/list/get/send/restore/cleanup flows using a harness.
backend/internal/session/fakes_test.go	Fakes for store/runtime/agent/workspace/messenger + recording LCM wrapper to assert pipeline ordering.
backend/internal/ports/outbound.go	Adds `Seed` and `Get` to `LifecycleStore` contract with explanatory docs.
backend/internal/lifecycle/fakes_test.go	Updates lifecycle test fake store to satisfy the expanded `LifecycleStore` interface.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

harshitsinghbhandari

Review — Session Manager (reviewed via subagent + independent verification)

Verified in a throwaway worktree: gofmt clean, build/vet pass, go test -race green; internal/session at 67.5% coverage. Spec-faithful and the invariants hold — display status is derived only on read (toSession) and never persisted, kills route through LCM.OnKillRequested, and the tests exercise the real lifecycle.Manager (not a fake LCM) for the round-trips, which is exactly right. Two should-fix items before this is clean; the rest is minor/doc-level.

1. `LifecycleStore.Seed` + `Get` — clean ✅

Seed taking a full SessionRecord and rejecting an existing id (create-only), with Restore routed through PatchLifecycle instead, cleanly separates create from reopen and avoids a double-seed footgun. Get is not redundant with Load (lifecycle-only, hot path) or List (project-scoped) — the SM needs identity by id for teardown/read-model. Both are ordinary CRUD for a real store. Minor: a one-line doc noting Load = hot-path lifecycle-only vs Get = full record would help Tom.

2. Spawn rollback + orphan handling — logic right, path untested (should-fix)

Eager rollback is correct (late seed; each step unwinds the prior). The OnSpawnCompleted-failure orphan route (seed + runtime up, then route to terminal-errored via OnKillRequested(KillError) + teardown) is the right call — but there is no test for it, and it's the most subtle path flagged. Add a test (fake LCM whose OnSpawnCompleted errors) asserting terminal-errored + OnKillRequested(KillError) fired + runtime/workspace torn down. Also untested: workspace-create-failure rollback, and Seed duplicate-id rejection.

3. Kill ordering + worktree-remove refusal — correct & tested ✅

OnKillRequested → Runtime.Destroy → Workspace.Destroy, asserted by call-order indices; refusal surfaces as an error with WorkspaceFreed=false and the path is never force-deleted. Note: the spec's explicit "Agent stop" step is folded into Runtime.Destroy (agent runs inside the runtime) — defensible, but call it out in the PR body so the Agent port's stop semantics aren't assumed elsewhere. Minor: Kill reconstructs handles from metadata; a seeded-but-never-completed session yields empty handle/path to Destroy — should be a safe no-op but isn't guarded/tested.

4. Restore — no rollback → runtime leak + stranded state (should-fix)

Restore does Workspace.Restore → Runtime.Create → PatchLifecycle(reopen) → OnSpawnCompleted, but on a failure at the reopen or OnSpawnCompleted step it returns the error without destroying the runtime it just created → orphan runtime. Worse, if OnSpawnCompleted fails after the reopen patch landed, the record is left not_started (display: spawning) with a live runtime and no recovery path. Spawn handles exactly this; Restore should mirror it (roll back the runtime on reopen/OnSpawnCompleted failure). No Restore failure path is tested. The reopen itself (PR → none/cleared_on_restore, session → not_started/spawn_requested, via PatchLifecycle not Seed) is correct and tested on the happy path.

5. Tests against the real LCM — yes ✅

recordingLCM wraps the real lifecycle.New(...) and only logs call order, so OnSpawnCompleted/OnKillRequested run the real load→decide→persist pipeline and the kill test asserts the LCM produced terminated/manually_killed. Coverage gaps are the error paths above (orphan route, all Restore-failure paths) — that's what's holding 67.5%.

Minor

Send has no existence/terminal guard — forwards blindly to AgentMessenger. Probably fine (messenger owns busy-detect/verify), but confirm it's intentional.
Cleanup drops a runtime-destroy error silently even when the workspace is later reclaimed — fine for terminal sessions, undocumented.
buildPrompt is a documented stub for the 3-layer assembly — fine for this split.

Recommendation: Approve-with-follow-ups. Sound design, invariants hold, real-LCM testing is exactly right. Two before-clean items: (4) give Restore the same rollback Spawn has, and (2) test the OnSpawnCompleted-failure orphan route + Restore error paths.

…ntime on post-create failure (PR #7 review) - Restore now fails early with a clear error if MetaAgentSessionID is missing, rather than emitting an ambiguous "resume nothing" launch command (no stored prompt means a fresh-launch fallback isn't possible). - On a post-runtime-create failure (reopen patch or OnSpawnCompleted), best-effort destroy the newly created runtime (never the workspace, which holds prior work) so we don't strand a live process while parking the session terminal. - Added a test for Restore with a missing agent session id: errors early, touches no workspace/runtime, leaves the session terminal. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…e rollback (review follow-ups) Adds an injectable OnSpawnCompleted failure to the recording LCM and two tests: - Spawn: when OnSpawnCompleted fails, the seeded record is parked terminal/errored (via OnKillRequested(KillError)) and runtime+workspace are torn down. - Restore: when OnSpawnCompleted fails post-create, the new runtime is destroyed while the workspace is preserved. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

harshitsinghbhandari

Re-review — `4c331d9`

Verified in a throwaway worktree: gofmt clean, build/vet pass, go test -race green; internal/session coverage 67.5% → 72.7%. Both should-fix items resolved.

#4 Restore rollback ✅ — rollbackRuntime(handle) now fires on both the reopen-patch failure and the OnSpawnCompleted failure, tearing down the freshly-created runtime while preserving the workspace (right call — it may hold uncommitted work). The early missing-agent-session-id guard is a good addition: it bails before any I/O instead of launching an ambiguous "resume nothing."
#2 tests ✅ — TestSpawn_OnSpawnCompletedFailure_RoutesOrphanToErrored (seeded orphan → terminal/errored, since the store has no delete) and TestRestore_OnSpawnCompletedFailure_RollsBackRuntime (runtime destroyed, workspace preserved) cover exactly the previously-untested paths.
Minors — Send-delegates-to-messenger (by design), Cleanup best-effort runtime-destroy (documented), buildPrompt stub: all acknowledged, agreed as-is.

LGTM — approving. Clean work, and the real-LCM round-trip testing throughout this PR was the right call.

…viewer to worker harness Per PR #197 review feedback: - Reviewer agent posts its review to the PR itself, so remove the ports.PRReviewPoster port, the GitHub review poster, the submit HTTP route + DTO, and the service Submit method (#1, #4, #7). - Trigger spawns the reviewer agent over the worker's worktree with its own review prompt, mirroring the session launch flow (resolve agent by harness -> argv -> runtime.Create) (#8, #9). - Default reviewer harness reuses the worker's harness when supported, falling back to claude-code; reviewer config stays independent of the worker override (#5, #6). - Drop the `ao review` CLI for this PR's scope (#2, #3). Regenerated OpenAPI + TS types. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(review): configurable AO code review backend (V1) Add per-project configurable code review of a worker's PR. A reviewer agent runs one-shot over the worker's own worktree and posts its result to the PR; the worker picks the feedback up through the existing SCM observer review-nudge path. - domain: ProjectConfig.reviewers (+ default reviewer harness), Review / ReviewRun types and verdict/status vocab. - storage: review + review_run tables (0011), sqlc queries, store methods. - service/review: rewrite the in-memory stub as a persisted ReviewService (Trigger/Submit/List) with a reviewer Runner over agent resolver + runtime; ports.PRReviewPoster implemented on the GitHub adapter. - http: session-scoped routes POST /sessions/{id}/reviews/trigger, POST .../submit, GET .../reviews; regenerated OpenAPI + TS types. - cli: ao review trigger|submit|list. - frontend: adapt ReviewDashboard to the per-worker reviews API. Closes #192 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(review): address review — drop submit/poster/CLI, default reviewer to worker harness Per PR #197 review feedback: - Reviewer agent posts its review to the PR itself, so remove the ports.PRReviewPoster port, the GitHub review poster, the submit HTTP route + DTO, and the service Submit method (#1, #4, #7). - Trigger spawns the reviewer agent over the worker's worktree with its own review prompt, mirroring the session launch flow (resolve agent by harness -> argv -> runtime.Create) (#8, #9). - Default reviewer harness reuses the worker's harness when supported, falling back to claude-code; reviewer config stays independent of the worker override (#5, #6). - Drop the `ao review` CLI for this PR's scope (#2, #3). Regenerated OpenAPI + TS types. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * feat(review): restore ao review submit (records verdict+body in AO) Per maintainer request, bring back `ao review submit`. AO records the reviewer's verdict and body on the review_run and marks the pass complete; it does not post to GitHub — the reviewer agent posts its review to the PR itself. - storage: add review_run.body (0011), persist via Insert/UpdateReviewRunResult. - service: restore Submit (no SCM poster) storing verdict + body. - http: restore POST /sessions/{id}/reviews/submit + SubmitReviewInput. - cli: ao review submit [worker] --verdict --body (worker from arg/--session/$AO_REVIEW_WORKER). - runner: reviewer prompt instructs posting to GitHub and recording via ao review submit. Regenerated OpenAPI + TS types. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(review): move reviewer runner to its own package; sharpen prompt Per PR #197 review: - Move the concrete reviewer runner out of the service layer into a new internal/review_runner package (package reviewrunner), beside other orchestration packages like session_manager. The service keeps only the Runner interface + RunSpec it depends on; the agent-resolver + runtime launch flow lives in review_runner. - Sharpen the reviewer prompt: tell the agent to diff against the PR base, focus on high-confidence findings, post via `gh pr review`, and record the result with `ao review submit`; review-only (no commits/edits). - Add unit tests for the runner. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(review): simplify review_run schema; provider-agnostic reviewer prompt Per PR #197 review: - review_run: status default 'running' (drop 'pending'), drop CHECK constraints on status/verdict, drop the updated_at column and the session/iteration index. Propagated through queries, domain, store, service, and tests. - Reviewer prompt no longer hardcodes GitHub/gh commands — it instructs the agent to use whatever review tooling the provider offers, keeping the flow extensible across SCM providers. Regenerated sqlc + OpenAPI/TS. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(review): launch reviewer before persisting the run Trigger now spawns the reviewer agent first and then writes the review_run with a status derived from the launch outcome (running on success, failed if it never started), instead of inserting a running row and correcting it to failed afterwards. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(review): pluggable reviewer registry distinct from worker harnesses Reviewers are now their own pluggable adapter set, separate from the worker agent registry — adding a reviewer (claude-code today, greptile tomorrow) is a one-line registration that does not widen the worker harness vocabulary, and a worker harness does not automatically become a valid reviewer. - domain.ReviewerHarness: a distinct vocabulary (AllReviewerHarnesses) with its own IsKnown; ReviewerConfig/Review/ReviewRun use it. ResolveReviewerHarness reuses the worker harness only when it is itself a supported reviewer, else falls back to claude-code. - ports.Reviewer: a reviewer-specific contract (ReviewCommand → argv + env) that models one-shot / non-prompt CLIs natively instead of forcing every reviewer through the worker's interactive GetLaunchCommand(Prompt:...). - internal/adapters/reviewer: a separate registry + resolver (mirrors the worker agent registry) with the claude-code reviewer adapter, which owns the review prompt and reuses the worker claude-code launch construction. - review_runner resolves via the reviewer registry (not the worker AgentResolver) and merges AO_REVIEW_WORKER into the adapter's env. - daemon wires the reviewer resolver. Registry/domain parity is test-enforced. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * test(review): cover run-scoped reviewer submit * fix(api): update generated review submit schema * refactor(review): split core engine (internal/review) from API service Move the review orchestration (Trigger/Submit/List, run-id generation, deps, RunSpec/Runner, sentinels) into a transport-independent core package internal/review (Engine). internal/service/review is now a thin API-flow boundary: the controller-facing Manager interface + a Service that delegates to the engine + error re-exports. This keeps the service layer to API concerns and lets the same engine back a future in-process CLI trigger without going through HTTP. review_runner now depends on the core package; daemon builds the engine and wraps it in the service. No API/schema changes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * feat(review): commit-aware trigger, reviewer handle for UI, no env vars Reworks the review trigger lifecycle and drops env-based coupling: - review_run gains target_sha (the reviewed commit) and drops iteration. A repeat trigger for the same PR head short-circuits to the existing run. - review gains reviewer_handle_id: the live reviewer pane's runtime handle, reused across passes and exposed in the reviews API so the UI can attach its terminal over /mux. - Trigger flow: if a live reviewer pane exists and a new commit arrived, message it to re-review; otherwise spawn a fresh reviewer. The run is recorded only after the reviewer is launched. - No environment variables: the reviewer adapter embeds the explicit `ao review submit --session <w> --run <id>` command in the spawn prompt and the re-review message. CLI submit requires --run/--session (no env fallbacks). - Merge review_runner into internal/review as a Launcher (spawn/notify/alive). - Trigger returns 201 for a new pass, 200 when reusing an existing run. Regenerated sqlc + OpenAPI/TS. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(review): author the reviewer prompt centrally, not in the adapter Mirror the worker model (session_manager builds the prompt; adapters just place it via LaunchConfig.Prompt). The reviewer prompt now lives in internal/review/prompt.go and is passed through ports.ReviewInvocation.Prompt; the claude-code reviewer adapter just feeds inv.Prompt to its launch command and returns it as the re-review message. One-shot CLI reviewers may ignore it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(review): split reviewer prompt into system+task, mirroring buildSpawnTexts Mirror session_manager.buildSpawnTexts for the reviewer: a standing role goes in the system prompt, the per-pass task (PR/commit + exact `ao review submit` command) goes in the user prompt. internal/review/prompt.go now returns (prompt, systemPrompt); both flow through ports.ReviewInvocation and the claude-code adapter places them via LaunchConfig{Prompt, SystemPrompt}. The re-review message reuses the per-pass prompt (role already established in the running pane). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: Vaibhaav <user@example.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>

harshitsinghbhandari requested a review from Copilot May 27, 2026 08:14

Copilot started reviewing on behalf of harshitsinghbhandari May 27, 2026 08:15 View session

Copilot AI reviewed May 27, 2026

View reviewed changes

Comment thread backend/internal/session/manager.go

Comment thread backend/internal/session/manager.go

Comment thread backend/internal/session/manager_test.go

harshitsinghbhandari commented May 27, 2026

View reviewed changes

harshitsinghbhandari and others added 2 commits May 27, 2026 13:50

harshitsinghbhandari commented May 27, 2026

View reviewed changes

harshitsinghbhandari merged commit 6cf7c04 into feat/lcm-sm-contracts May 27, 2026
2 checks passed

This was referenced May 27, 2026

Tracking: Lifecycle Manager + Session Manager lane #3

Closed

feat(backend): Lifecycle Manager + Session Manager lane #2

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(session): Session Manager — spawn/kill/list/get/send/restore/cleanup#7

feat(session): Session Manager — spawn/kill/list/get/send/restore/cleanup#7
harshitsinghbhandari merged 3 commits into
feat/lcm-sm-contractsfrom
session/aa-5

harshitsinghbhandari commented May 27, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

harshitsinghbhandari left a comment

Uh oh!

harshitsinghbhandari left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

harshitsinghbhandari commented May 27, 2026

Summary

Pipelines

⚠️ Store-contract additions (need @tom's review)

Out of scope / flags

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

harshitsinghbhandari left a comment

Choose a reason for hiding this comment

Review — Session Manager (reviewed via subagent + independent verification)

1. LifecycleStore.Seed + Get — clean ✅

2. Spawn rollback + orphan handling — logic right, path untested (should-fix)

3. Kill ordering + worktree-remove refusal — correct & tested ✅

4. Restore — no rollback → runtime leak + stranded state (should-fix)

5. Tests against the real LCM — yes ✅

Minor

Uh oh!

harshitsinghbhandari left a comment

Choose a reason for hiding this comment

Re-review — 4c331d9

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

1. `LifecycleStore.Seed` + `Get` — clean ✅

Re-review — `4c331d9`