diff --git a/.gitignore b/.gitignore
index 4bbdb4e..f8d8e97 100644
--- a/.gitignore
+++ b/.gitignore
@@ -17,3 +17,4 @@ server/bin/
 !server/internal/web/dist/
 server/internal/web/dist/*
 !server/internal/web/dist/.gitkeep
+.worktrees
diff --git a/docs/ideation/2026-06-08-single-deuce-agent-ideation.md b/docs/ideation/2026-06-08-single-deuce-agent-ideation.md
new file mode 100644
index 0000000..a065964
--- /dev/null
+++ b/docs/ideation/2026-06-08-single-deuce-agent-ideation.md
@@ -0,0 +1,122 @@
+---
+date: 2026-06-08
+topic: single-deuce-agent
+focus: Collapse the 5 agent roles into one "Deuce" agent that supports skills + subagents; can the Pi harness support it?
+mode: repo-grounded
+---
+
+# Ideation: One "Deuce" Agent With Skills + Subagents
+
+## Verdict
+
+**Yes — Pi can support a single "Deuce" modulated by skills + subagents, and the design is closer than it looks.** Under the Pi harness, an agent's `role` and `system_prompt` are fetched from the DB but **never forwarded to Pi** — Pi already runs as one generic coding agent regardless of which of the 5 rows triggered it. The 5 "agents" are largely a UI fiction at the execution layer, so the collapse removes a distinction the harness already ignores.
+
+> **Validated 2026-06-08** (spike against real `pi` 0.74.2; see `docs/solutions/architecture-patterns/pi-loads-agent-skills-standard-in-rpc-mode.md`). The compound-engineering plugin was exported to Pi's agent layout (`server/tmp/agent/`) and tested:
+>
+> - **Skills load in `--mode rpc`.** All 36 exported skills appeared via the `get_commands` RPC, each invocable as `/skill:<name>`. `--mode rpc` is just an output format, orthogonal to skill loading.
+> - **Pi implements the [Agent Skills standard](https://agentskills.io/specification)** (same as Claude Code) and auto-discovers from `~/.pi/agent/skills/` — the sibling of where Deuce already provisions `ask-user.ts`. So provisioning is the existing `InstallPiExtension` mechanism generalized to a tree.
+> - **Deterministic invocation exists:** sending a `prompt` of `/skill:ce-commit …` expands the skill before processing — no reliance on the model choosing to `read` it. This *replaces* Idea 1's prompt-prepend hack.
+> - **Correction to the constraint below:** `--system-prompt` / `--append-system-prompt` *are* real launch flags, so a per-session "Deuce" identity prompt is available at launch. There is still no *mid-session per-task* override via an RPC command — but per-task specialization rides `/skill:` expansion, which is cleaner.
+> - **Still open:** live model triggering (needs an API key) and subagents (the `agents/*.md` personas run via the `pi-subagents` extension, which forks child Pi processes that bypass Deuce's `pirun.Key` supervisor + WS attribution — the Survivor 3 design question).
+
+The one real harness constraint (as originally framed): Pi `--mode rpc` has no per-session *system-prompt* override — but the task `Prompt` does reach Pi, which is the seam skills ride (Idea 1). *(See the validation note above — launch-time `--append-system-prompt` and `/skill:` expansion both turned out to be available, softening this constraint.)*
+
+## Grounding Context (Codebase)
+
+- **Agent model:** `agents` table (`id, name, role(free-text, never enum), color, color_muted, provider, model, description, system_prompt, deleted_at`), 5 hardcoded seed rows (Coder/Reviewer/Planner/Tester/Designer). `session_agents` join table `(session_id, agent_id, status, claude_session_id, pi_session_id)` allows multiple agents per session. Frontend `AgentRole = string` (deliberately un-enumerated).
+- **Role is already decoupled from execution.** Under Pi, `role`/`system_prompt` are fetched (`messages.go`) but never forwarded. `EnqueueParams` carries only `{SessionID, AgentID, RequestedBy, AnchorMessageID, Prompt, WorkspaceID}`. The `system_prompt` is silently dropped on the Pi path — a live design-debt bug (only the legacy `claude` executor consumed `--append-system-prompt`).
+- **@mention flow:** frontend `ChatView.tsx` regex-parses `@Name`, matches `session.agents` by name → UUID → `sendMessage({content, mentions:[uuid]})`. Backend `messages.go` loops mentions and enqueues to the Pi runtime. A near-miss name (`@coder`) silently no-ops.
+- **Pi harness topology:** Go drives a persistent `pi --mode rpc` process inside each session's DevPod container via `devpod ssh`. Process granularity is `pirun.Key{SessionID, AgentID}` — one process per (session, agent) pair, serial queue per key. JSONL protocol; custom `ask-user.ts` extension provisioned to `~/.pi/agent/extensions/` via `pi.registerTool()` is the only extension today and the natural skills seam.
+- **No "skills" concept exists** in the codebase. **Pi has no native subagent dispatch** in RPC mode; community `@tintinweb/pi-subagents` spawns child Pi instances (foreground works in RPC, background/scheduled degraded). Subagents today = the server spawning multiple Pi processes under separate agent rows.
+- **Coupling points for a collapse:** DB schema (agents/session_agents/seed), `pirun.Key`, `EnqueueParams.AgentID`, the `messages.go` mention loop, frontend mention parse + per-agent message border colors, `SummaryPanel`/`AgentTaskCard`/`AgentThreadDrawer`/`AgentSettingsDialog`, WS payloads carrying `agentId`.
+
+### External context
+
+- Causal-independence is the criterion for safe role→single-agent collapse (arXiv:2601.04748): safe when roles don't feed back mid-session; 15–40% accuracy drop when downstream depends on upstream. In Deuce the human drives next steps → roles largely decoupled.
+- Skill shadowing (arXiv:2605.24050): flat libraries degrade up to 21% at scale from selection failure; fix is hierarchical category→skill routing.
+- Claude Code: Skills = inline file-based bounded capability bundles; Subagents (Task tool) = spawned child sessions w/ own context + restricted tools for parallel/isolated work. OpenAI/CrewAI/AutoGen lesson: start single-agent + tools, split only at real limits; over-splitting → routing failures, duplicated instructions, drift.
+
+### Past learnings (docs/solutions/)
+
+- No direct prior art on agent-role consolidation, the Pi RPC harness, or skills/subagents — treat as greenfield, strong `/ce-compound` candidate once the design lands.
+- Mirror the "one identity model, one user table, no second source of truth" precedent (`embedded-ssh-proxy-for-vscode-remote.md`) when attributing Deuce + subagents.
+- Re-run the route-authorization audit (`broadening-resource-visibility-requires-per-route-authorization-audit.md`) for any new "invoke skill" / "spawn subagent" / "set active agent" endpoint; `UpdateSessionAgents`/`StopAgent` are already write/live-gated.
+
+## Topic Axes
+
+- Agent identity & data model
+- Skills (capability units)
+- Subagents (parallel dispatch)
+- Invocation & UX
+- Migration & harness fit
+
+## Ranked Ideas
+
+### 1. Skill specialization rides the prompt channel — the keystone that ships on Pi today
+**Description:** Since Pi RPC won't honor a system-prompt override but `EnqueueParams.Prompt` *does* reach Pi, a "skill" becomes a preparation block prepended to the task prompt (and/or a `pi.registerTool()` extension, the same seam `ask-user.ts` uses). "Behave like a reviewer" stops being a dropped DB column and becomes injected instruction Pi actually executes.
+**Axis:** Skills
+**Basis:** `direct:` — `system_prompt` is fetched (`messages.go`) but silently dropped on the Pi path; `EnqueueParams.Prompt` is the one channel that reaches Pi; `ask-user.ts` proves the `registerTool` seam works.
+**Rationale:** Simultaneously pays down the live "prompt does nothing" debt and is the load-bearing primitive every skill depends on. Without it, "skills" have nowhere to attach.
+**Downsides:** Prompt-prepend is weaker than a true system prompt (model can drift); registerTool path needs per-container extension provisioning.
+**Confidence:** 90% **Complexity:** Low–Medium **Status:** Unexplored
+
+### 2. Strangler collapse: 5 rows → skill-presets → one runtime, then delete
+**Description:** Don't big-bang the schema. Keep the 5 `agents` rows as thin UI presets that all resolve to one `Key{SessionID}` runtime "Deuce"; convert each role's prompt into a skill; let `@Coder` desugar to "Deuce + coder skill." Collapse the rows later once telemetry shows nobody depends on the distinction.
+**Axis:** Agent identity & data model (+ migration)
+**Basis:** `reasoned:` + `direct:` — grounding lists heavy coupling (DB schema, `pirun.Key`, `EnqueueParams.AgentID`, mention loop, WS `agentId` payloads, sidebar/colors); a strangler keeps every `agentId` contract intact while redefining what it resolves to, so historical sessions still render and rollback stays cheap.
+**Rationale:** Turns a destructive migration into an additive, reversible one; the 5 proven role prompts become the seed skill library instead of being discarded.
+**Downsides:** Temporary two-model period (rows that lie about being agents); telemetry needed to know when to cut.
+**Confidence:** 85% **Complexity:** Medium **Status:** Unexplored
+
+### 3. One Pi process per session; subagents as the parallelism escape hatch
+**Description:** Drop `AgentID` from `pirun.Key` → one persistent `pi --mode rpc` per session. The concurrency you get today from 5 keyed processes relocates downward to subagents spawned on demand (foreground `@tintinweb/pi-subagents`, or the server spawning child Pi processes — which is literally what the 5-agent model already does). "Deuce" owns ordering/continuity of the channel; subagents are bounded parallel workers that report back as Deuce.
+**Axis:** Subagents (parallel dispatch)
+**Basis:** `direct:` + `external:` — today's concurrency comes from per-(session,agent) processes with serial per-key queues; Pi has no native subagent dispatch but the multi-process machinery already exists; Unix fork/exec (identity in parent, specialization in child) and Anthropic's "subagents are a performance tier, not a role taxonomy."
+**Rationale:** The hardest part of subagents (lifecycle, isolation, message routing) is already built by the 5-agent model — collapsing roles frees that machinery to serve as the real parallelism tier.
+**Downsides:** A single per-session process serializes one user's turns unless subagents are wired in; background/scheduled Pi subagents are documented as degraded in RPC.
+**Confidence:** 75% **Complexity:** Medium–High **Status:** Unexplored
+
+### 4. Skills as git-versioned files compiled from docs/solutions/ — the compounding flywheel
+**Description:** A skill is a directory (`SKILL.md` + optional `tool.ts`) discovered from the DevPod filesystem where Pi already loads `~/.pi/agent/extensions/`. A "skill-compiler" reads existing `docs/solutions/` learnings (already carrying `module`/`tags`/`problem_type` frontmatter) and emits skills — so every documented fix permanently upgrades the one Deuce agent. Match the Claude Code skill manifest shape for ecosystem interop.
+**Axis:** Skills (capability units)
+**Basis:** `external:` + `direct:` — Claude Code models skills as file-based bounded bundles; Pi loads extensions from the filesystem; the repo already ships structured `docs/solutions/` plus recently-committed compound-engineering plans.
+**Rationale:** Distribution/versioning/review come free from git; an open-source project gets a community skill library with zero new infra; solve-once → document → auto-compile → never re-learn.
+**Downsides:** Skill-compiler is real work; manifest-compat is a moving target.
+**Confidence:** 70% **Complexity:** Medium–High **Status:** Unexplored
+
+### 5. Hierarchical skill routing + auto-selection from day one (avoid the 21% shadowing cliff)
+**Description:** Make skill selection automatic and invisible (Deuce routes by message content — the user never re-picks a "role"), and structure skills as category→skill from the start (mirroring the `docs/solutions/` category folders). Optionally enforce conjunctive preconditions per skill so only 1–2 fire per turn.
+**Axis:** Skills / Invocation & UX
+**Basis:** `external:` — flat skill libraries degrade up to 21% from selection failure at scale; the documented fix is hierarchical routing (arXiv:2605.24050).
+**Rationale:** The single-agent move invites skill proliferation; designing the two-tier router before the library grows avoids a painful re-shard and an accuracy regression that would make Deuce feel dumber than the 5-role version.
+**Downsides:** Auto-selection hurts discoverability ("what can Deuce do?"); a routing hop adds latency/cost.
+**Confidence:** 70% **Complexity:** Medium **Status:** Unexplored
+
+### 6. Keep the menu, drop the entities — roles become slash-presets; provenance badges replace role colors
+**Description:** Users may have liked five buttons, not five beings. Reframe roles as invocation presets (`/review`, `/plan`) over one Deuce. Replace the per-agent message border colors (meaningless when one agent speaks) with a provenance chip — "via review skill" / "subagent: tester" — so the visual signal tracks behavior invoked, not identity picked. Also fixes the silent near-miss `@mention` no-op.
+**Axis:** Invocation & UX
+**Basis:** `external:` + `direct:` — UX research (arXiv:2601.18275) found users liked named personas for engagement but suffered attribution confusion; role colors and `@Name→UUID` mention-matching are concrete coupling points today.
+**Rationale:** Ship the collapse while preserving the discoverable-specialization users actually valued.
+**Downsides:** Loses the multi-participant "team in a channel" feel that STRATEGY.md bets on — needs a deliberate product call.
+**Confidence:** 70% **Complexity:** Medium **Status:** Unexplored
+
+### 7. Causal-independence as the explicit "is this overkill?" boundary
+**Description:** The principled answer to the "probably overkill" intuition: collapsing roles is safe exactly when they don't feed each other's output mid-session — and risky (15–40% accuracy drop) when downstream depends on upstream (Planner→Coder→Reviewer chains). Encode it as a per-skill property ("produces feedback consumed mid-session: yes/no") that doubles as the runtime rule for when Deuce runs inline vs. spawns an isolated subagent.
+**Axis:** Migration & harness fit / Subagents
+**Basis:** `external:` — arXiv:2601.04748 causal-independence criterion. In Deuce the human drives the next step, so roles are largely decoupled → collapse is mostly safe.
+**Rationale:** Replaces taste ("feels like overkill") with a measurable boundary; the same formalism that licenses the migration becomes the dispatch policy for skills-vs-subagents.
+**Downsides:** Classifying feedback-dependence per skill is fuzzy; over-applying it re-introduces complexity you're trying to shed.
+**Confidence:** 65% **Complexity:** Medium **Status:** Unexplored
+
+## Rejection Summary
+
+| # | Idea | Reason Rejected |
+|---|------|-----------------|
+| 1 | "Deuce always listening" / no @mentions / ambient presence-trigger | Interesting but a separate product bet (presence model); better as a brainstorm variant than bundled with the collapse |
+| 2 | Delete `agents` table outright / Zero-Agent / implicit singleton | Right endpoint, but Idea 2 (strangler) reaches it more safely; merged |
+| 3 | 1000 Deuces / identity-per-spawn ephemeral children | Captured by Idea 3's subagent model; standalone it's speculative UI churn |
+| 4 | Stem-cell potency, pharmacokinetic half-life, improv "yes-and", orchestra conductor, pin-tumbler, Swiss-army, ED triage | Framing devices — substance folded into Ideas 1/3/5; analogies alone aren't actionable moves |
+| 5 | Auto-derive behavior from repo context (kill system_prompt) | Overlaps Ideas 1/4; "environment is the prompt" is downside-mitigation, not its own move |
+| 6 | skill.json = Claude Code manifest subset | Folded into Idea 4 as the interop sub-point |
+
+All five axes have survivors; no deliberate coverage gaps.
diff --git a/docs/plans/2026-06-08-002-feat-hide-agent-chat-messages-plan.md b/docs/plans/2026-06-08-002-feat-hide-agent-chat-messages-plan.md
new file mode 100644
index 0000000..e86759e
--- /dev/null
+++ b/docs/plans/2026-06-08-002-feat-hide-agent-chat-messages-plan.md
@@ -0,0 +1,131 @@
+---
+title: "feat: Hide agent replies from main chat, surface them only in super thread cards"
+type: feat
+status: completed
+date: 2026-06-08
+depth: lightweight
+---
+
+# feat: Hide agent replies from main chat, surface them only in super thread cards
+
+## Summary
+
+Agent task replies currently render **twice**: as a `MessageBubble` in the main chat list *and* as the reply text in the inline super-thread card (`AgentTaskCard`). Both come from the same `reply` string emitted at task finalization. This plan hides the chat-bubble copy so agent replies live only on the thread surfaces (the inline card and the agent thread drawer), while leaving everything else in chat — human messages and system/operational notices — untouched.
+
+This is a **display-only** change. Agent messages remain persisted (they feed the agent's own conversation-history context) and keep flowing over WebSocket; they are simply filtered out of the rendered chat list.
+
+---
+
+## Problem Frame
+
+When a user @mentions an agent, the agent's final reply is surfaced in two places at once:
+
+1. **Inline super-thread card** — `task.reply` shown in [`AgentTaskCard`](src/components/super-threads/AgentTaskCard.tsx#L100-L125) under the triggering message, and in the drawer `Turn` ([AgentThreadDrawer.tsx:108-138](src/components/super-threads/AgentThreadDrawer.tsx#L108-L138)).
+2. **A full chat `MessageBubble`** — posted as a normal `authorType: "agent"` chat message and rendered in [`ChatView`](src/components/chat/ChatView.tsx#L422-L467).
+
+Both originate from the *same* `reply` string in [`runtime.go:406-419`](server/internal/agent/runtime.go#L406-L419): the `task_completed` event carries it into `task.reply`, and `replyPoster` posts it as the chat message. The duplication clutters the chat and undercuts the super-thread surface as the place to follow agent work.
+
+**Desired behavior:** agent replies do not appear in the chat outside of threads. The last reply is shown in the super-thread card (which it already is). Human messages and system notices stay in chat.
+
+---
+
+## Scope Boundaries
+
+**In scope**
+- Filter real agent replies out of the rendered main chat list in `ChatView`.
+- Keep human messages and system/operational notices (e.g. "Workspace is still starting", "Agent queue is full") visible.
+- Preserve inline `AgentTaskCard` rendering, message-header grouping, empty-state, and auto-scroll behavior.
+
+**Out of scope / non-goals**
+- No change to message persistence, the WebSocket broadcast, or the backend reply-posting path. Agent messages must remain in the DB and the store because `buildChatHistory` ([messages.go](server/internal/handler/messages.go)) uses them for agent context continuity.
+- No change to `AgentTaskCard` / `AgentThreadDrawer` reply rendering — they already show `task.reply`.
+
+### Deferred to Follow-Up Work
+- **Unread-count behavior for hidden agent replies.** `addMessage` ([session-store.ts:202-225](src/stores/session-store.ts#L202-L225)) still bumps the session unread count when a (now hidden) agent reply arrives. This is arguably correct — the card *is* new content — but if the team wants unread to track only visible chat, that's a separate, intentional change.
+
+---
+
+## Key Technical Decisions
+
+**KTD1 — Frontend display filter, not a backend/store change.**
+The agent reply and the card's `task.reply` are the *same string* posted together at finalization, so hiding the chat copy loses no information. Backend or store-level removal would break `buildChatHistory` context continuity and the empty-reply fallback ("(The agent finished without a text response.)"). A render-time filter in `ChatView` is the minimal, reversible change. *(see [runtime.go:406-419](server/internal/agent/runtime.go#L406-L419))*
+
+**KTD2 — Distinguish agent replies from system notices by author identity, not just `authorType`.**
+`postSystemMessage` ([messages.go:297-313](server/internal/handler/messages.go#L297-L313)) posts operational notices with `authorType: "agent"` but a **nil author ID** that matches no session agent. Per the product decision, those notices stay visible. The filter therefore hides a message only when it is `authorType === "agent"` **and** its `authorId` matches a real agent in `session.agents`. Nil-author system notices fall through and remain in chat. This reuses the same author-resolution that `ChatView` already performs.
+
+**KTD3 — Extract the predicate as a pure, tested function.**
+Per the repo convention ([repo.test.ts](src/lib/repo.test.ts) — Vitest-style specs that type-check today and run once a runner is wired), the visibility decision lives in a small pure module with a colocated spec, rather than as an inline `.filter` arrow that can't be tested in isolation.
+
+---
+
+## Implementation Units
+
+### U1. Pure chat-message visibility predicate
+
+**Goal:** A pure, testable function that decides whether a message renders in the main chat list — hiding real agent replies while keeping human messages and system notices.
+
+**Requirements:** Core behavior of the feature; backs KTD2 and KTD3.
+
+**Dependencies:** none.
+
+**Files:**
+- `src/components/chat/message-visibility.ts` (new)
+- `src/components/chat/message-visibility.test.ts` (new)
+
+**Approach:**
+- Export a predicate, e.g. `isVisibleInChat(message, agentIds: Set<string>): boolean`, plus a thin `visibleChatMessages(messages, agentIds)` helper that filters while preserving order.
+- A message is hidden iff `message.authorType === "agent" && agentIds.has(message.authorId)`. Everything else (human messages, and `agent`-typed messages whose author is not a session agent — i.e. nil-author system notices) is visible.
+- `agentIds` is derived by the caller from `session.agents` (a `Set` of agent IDs). Keep the module free of store/session imports so it stays pure.
+
+**Patterns to follow:** Pure selector style of [`src/stores/agent-runs.ts`](src/stores/agent-runs.ts#L198-L227); spec style and the "no runner yet" note from [`src/lib/repo.test.ts`](src/lib/repo.test.ts).
+
+**Test scenarios:**
+- Human message (`authorType: "human"`) → visible, regardless of `agentIds` contents.
+- Agent reply whose `authorId` is in `agentIds` → hidden.
+- System notice (`authorType: "agent"`, `authorId` = nil UUID, not in `agentIds`) → visible.
+- Agent-typed message whose `authorId` is not in `agentIds` (defensive: agent removed from session) → visible.
+- Empty `agentIds` set → no agent messages are hidden (all visible).
+- `visibleChatMessages` on a mixed array → returns only visible messages in original order, drops hidden ones, does not mutate the input.
+
+**Verification:** Spec cases above pass under `vitest` when a runner is present; `npx tsc --noEmit` clean today.
+
+---
+
+### U2. Apply the filter in ChatView
+
+**Goal:** Render the chat list from the filtered message set so agent replies no longer appear as bubbles, while inline task cards, header grouping, empty-state, and scrolling stay correct.
+
+**Requirements:** Delivers the user-visible outcome.
+
+**Dependencies:** U1.
+
+**Files:**
+- `src/components/chat/ChatView.tsx`
+
+**Approach:**
+- Build `agentIds` once from `session.agents` and derive `visibleMessages = visibleChatMessages(sessionMessages, agentIds)`.
+- Drive the message `.map` and the `prevMsg`/`showHeader` grouping ([ChatView.tsx:422-467](src/components/chat/ChatView.tsx#L422-L467)) off `visibleMessages`, so author-change/time-gap headers are computed against the *visible* sequence (no phantom gaps from removed agent bubbles).
+- **Keep `cardsByAnchor` and inline `AgentTaskCard` rendering unchanged** — cards are keyed on the *human* anchor message (`anchorMessageId`), which is never filtered, so cards remain anchored under the @mention that spawned them.
+- Empty-state check ([ChatView.tsx:400](src/components/chat/ChatView.tsx#L400)) uses `visibleMessages.length`.
+- **Auto-scroll:** keep the scroll effect ([ChatView.tsx:335-339](src/components/chat/ChatView.tsx#L335-L339)) triggered on the *raw* `sessionMessages.length` (not `visibleMessages.length`) so an arriving agent reply still scrolls the freshly-updated card into view even though the bubble is hidden.
+
+**Patterns to follow:** existing `sessionMessages` derivation and the author-lookup already in [ChatView.tsx:431-436](src/components/chat/ChatView.tsx#L431-L436).
+
+**Test expectation:** none at the unit level — `ChatView` has no component-test harness. Covered by U1's pure-function tests plus the manual verification below.
+
+**Verification (manual / visual):**
+- @mention an agent → the agent's reply no longer appears as a chat bubble; the inline super-thread card under the @mention shows the reply text in its done state.
+- A system notice path (e.g. send an agent request while the workspace is starting) → the "Workspace is still starting" notice **still appears** in chat.
+- Consecutive human messages still collapse headers correctly (no stray avatars/timestamps from removed agent bubbles).
+- A session whose only agent output is hidden still renders its human messages and cards; empty-state shows only when there are genuinely no visible messages.
+- A drawer-initiated (anchorless) task still shows its reply in the agent thread drawer `Turn` — this is the intended "inside the thread" surface for tasks with no inline card.
+- `npm run lint` and `npx tsc --noEmit` clean.
+
+---
+
+## Verification Strategy
+
+1. `npx tsc --noEmit` — type safety across the new module and `ChatView` edits.
+2. `npm run lint`.
+3. U1 spec cases (run under vitest if/when the runner is wired; type-checked now).
+4. Manual walkthrough of the U2 verification scenarios against a running dev server (`npm run dev` + backend `make dev`).
diff --git a/docs/solutions/architecture-patterns/pi-loads-agent-skills-standard-in-rpc-mode.md b/docs/solutions/architecture-patterns/pi-loads-agent-skills-standard-in-rpc-mode.md
new file mode 100644
index 0000000..896a1ce
--- /dev/null
+++ b/docs/solutions/architecture-patterns/pi-loads-agent-skills-standard-in-rpc-mode.md
@@ -0,0 +1,79 @@
+---
+title: Pi loads Agent-Skills-standard skills in --mode rpc; provision them to ~/.pi/agent/skills and invoke via /skill: prompt expansion
+date: 2026-06-08
+category: architecture-patterns
+module: agent/pirun
+problem_type: architecture_pattern
+component: agent-harness
+severity: medium
+applies_when:
+  - "You want one generic Pi agent whose behavior is modulated by skills/subagents instead of multiple role-specific agent rows"
+  - "You are deciding whether the compound-engineering plugin (or any Claude Code / Agent Skills bundle) can run on the Pi harness Deuce drives in --mode rpc"
+  - "You need to know how to provision and invoke skills inside a per-session DevPod container over the existing devpod ssh channel"
+related_components:
+  - agent-harness
+  - devpod
+  - workspace-provisioning
+tags:
+  - pi
+  - pi-rpc
+  - agent-skills
+  - skills
+  - subagents
+  - compound-engineering
+  - claude-code-compat
+  - append-system-prompt
+---
+
+# Pi loads Agent-Skills-standard skills in `--mode rpc`
+
+## Context
+
+Deuce models AI helpers as five role rows in an `agents` table (Coder/Reviewer/Planner/Tester/Designer), but under the Pi harness `role` and `system_prompt` are fetched and never forwarded — Pi runs as one generic coding agent regardless. The open question was whether collapsing to a single "Deuce" agent **modulated by skills and subagents** is viable on the Pi harness Deuce already drives (`pi --mode rpc` launched in each session's DevPod container — see [devpod_launcher.go](../../../server/internal/agent/pirun/devpod_launcher.go)).
+
+The compound-engineering plugin was exported to Pi's agent-config layout (in `server/tmp/agent/`: `skills/<name>/SKILL.md`, `agents/<name>.md`, `AGENTS.md`, `install-manifest.json`). This documents what a spike against the real `pi` 0.74.2 binary established about whether that bundle works in RPC mode.
+
+## Findings (verified)
+
+1. **Pi implements the [Agent Skills standard](https://agentskills.io/specification)** — the same standard Claude Code uses. A skill is a directory with a `SKILL.md` (YAML frontmatter `name` + `description`, then markdown instructions). Bundles authored for Claude Code drop in unmodified.
+
+2. **Skills auto-discover from `~/.pi/agent/skills/`** (Pi's documented global location), which is the exact sibling of `~/.pi/agent/extensions/` where Deuce already provisions `ask-user.ts` via `InstallPiExtension` ([manager.go](../../../server/internal/workspace/manager.go)). Pi can also be pointed at `~/.claude/skills` via the `skills` settings array, or given explicit `--skill <path>` (additive even with `--no-skills`). No launcher-flag change is required to pick up the global dir.
+
+3. **`--mode rpc` is an output format, orthogonal to skill loading.** Skills load identically in text and rpc modes. Empirically: driving `pi --mode rpc --skill <bundle>/skills` and sending `{"type":"get_commands"}` returned all 36 exported skills as `skill:<name>` commands (`source: "skill"`), each invocable as `/skill:<name>`. This needs no API key — loading happens at startup before any model call.
+
+4. **Two invocation paths, one deterministic:**
+   - *Model-driven:* at startup Pi injects each skill's name+description into the system prompt (progressive disclosure per the spec); on a matching task the model uses `read` to load the full `SKILL.md`. Models "don't always do this."
+   - *Deterministic (preferred for Deuce):* send `{"type":"prompt","message":"/skill:ce-commit ..."}` — **skill commands are expanded before the prompt is processed** (Pi rpc.md). Arguments after the command are appended as `User: <args>`. This does not depend on the model choosing to bite.
+   - `get_commands` (not `get_state`) enumerates available skills/prompts/extension-commands — use it to populate UI or validate `@mention`→skill mapping. `get_state` does **not** list skills.
+
+5. **`--system-prompt` / `--append-system-prompt` are real launch flags.** A single "Deuce" identity prompt can be set at process launch. (There is still no *mid-session per-task* system-prompt override via an RPC command — per-task specialization rides `/skill:` expansion instead, which is cleaner than prepending prompt text.)
+
+## Not yet verified
+
+- **Live model triggering** of the model-driven path needs an API key (absent in the spike env). The deterministic `/skill:` path sidesteps this.
+- **Subagents.** The export's `agents/*.md` personas are consumed by the **`pi-subagents`** npm extension (provides the `subagent` tool), *not* Pi core — `AGENTS.md` lists it as required (`pi install npm:pi-subagents`), with `pi-ask-user` recommended (the published equivalent of Deuce's hand-rolled `ask-user.ts`). A `pi-subagents`-spawned subagent forks a child Pi instance **inside the container**, bypassing Deuce's `pirun.Key{SessionID, AgentID}` supervisor, serial queue, and WS `agentId` attribution. Whether subagents run under Pi (cheap, invisible to Deuce) or are orchestrated by Deuce (attributable, more work) is an open design decision.
+
+## Integration path for Deuce
+
+1. **Provision skills:** generalize `InstallPiExtension` ([manager.go](../../../server/internal/workspace/manager.go)) to push the `skills/` tree (and later `agents/`) into `~/.pi/agent/skills/` — same `mkdir -p` + base64-over-`devpod ssh` mechanism, a tar instead of a single file. Wire it into `provisionAgentTools` ([workspace.go](../../../server/internal/handler/workspace.go)) beside the ask-user install. No launcher change (auto-discovery).
+2. **Invoke:** map `@Coder` → a `prompt` command carrying `/skill:coder`; enumerate with `get_commands` to validate.
+3. **Identity:** add `--append-system-prompt` to the launch command ([devpod_launcher.go](../../../server/internal/agent/pirun/devpod_launcher.go)) for the single Deuce persona.
+
+## Reproduce
+
+`pi` 0.74.2; bundle at `server/tmp/agent`; harness `server/tmp/spike-pi-skills.sh`:
+
+```bash
+printf '%s\n' '{"type":"get_commands"}' \
+  | pi --mode rpc --skill server/tmp/agent/skills --offline --no-session \
+  | grep -o '"source":"skill"' | wc -l      # => 36
+```
+
+Pi's own docs are vendored with the npm package at
+`@earendil-works/pi-coding-agent/docs/{skills,rpc,extensions}.md` — authoritative reference for discovery locations and the RPC command surface.
+
+## Related
+
+- [[broadening-resource-visibility-requires-per-route-authorization-audit]] — any new "invoke skill" / "spawn subagent" route must carry an explicit auth gate.
+- [[embedded-ssh-proxy-for-vscode-remote]] — same host→container provisioning topology; "one identity model, one user table" precedent for attributing Deuce + subagents.
+- Ideation: `docs/ideation/2026-06-08-single-deuce-agent-ideation.md`.