diff --git a/docs/brainstorms/2026-06-08-interactive-agent-questions-requirements.md b/docs/brainstorms/2026-06-08-interactive-agent-questions-requirements.md
new file mode 100644
index 0000000..e3d86de
--- /dev/null
+++ b/docs/brainstorms/2026-06-08-interactive-agent-questions-requirements.md
@@ -0,0 +1,201 @@
+---
+date: 2026-06-08
+topic: interactive-agent-questions
+---
+
+# Interactive Agent Questions
+
+## Summary
+
+When an agent needs input mid-task, render its question as an interactive
+inline prompt in the agent thread — a text field for open questions,
+selectable buttons when the agent offers choices, with a free-text fallback —
+and guarantee that a question is never shown to the user as raw JSON,
+regardless of which path produced it.
+
+## Problem Frame
+
+Today an agent question can reach the user as a literal tool-call string in
+the thread, e.g.:
+
+```
+Ask_user({"question":"What file would you like me to create? Please provide:\n1. The filename...\n2. The content..."})
+```
+
+This is the structured question path *failing*. The intended pipeline is
+`extension_ui_request` → `KindAwaitingInput` → `task.pendingQuestion`, which
+the Super Thread drawer/card already consumes. The JSON appears when that
+structured event never fires — the model emits the call as assistant prose
+instead. Known triggers: the Pi `ask_user` extension install fails silently
+(the installer is non-fatal and returns `nil` on error), the legacy `claude`
+harness has no `ask_user` tool at all, or the model narrates the call in text
+even when the tool exists.
+
+Two distinct quality gaps stack here:
+
+1. **Even on the happy path, the question renders as plain text.** The drawer
+ and task card print `task.pendingQuestion` verbatim — there is no input
+ affordance attached to it, and no way for the agent to offer concrete
+ choices.
+2. **On the failure path, the user sees raw JSON** — which reads as a broken
+ product, not a question.
+
+Other harnesses (including Claude Code's own question UI) present questions as
+typed, answerable controls. Deuce should meet that bar and additionally
+guarantee the JSON failure mode can't surface.
+
+## Key Decisions
+
+- **Typed prompts, not just clean text.** The agent can attach a question
+ *kind* (free-text / pick-one / confirm) and, for choice kinds, a set of
+ options. The thread renders the matching control: a text field, selectable
+ buttons, or an affirm/decline control. This is a deliberate step beyond "strip the
+ JSON and show prose" — chosen because concrete choices lower answer friction
+ and let agents ask better questions.
+
+- **A two-layer no-JSON guarantee.** Layer one hardens the structured path so a
+ question reliably arrives as a structured event (install failures become
+ loud/recoverable rather than silently dropping the tool). Layer two is a
+ text-shaped-question backstop: when an agent message *looks* like a tool call
+ (`ask_user(...)` / `Ask_user({...})`), it is intercepted, the question is
+ extracted, and it renders through the same prompt widget instead of as JSON.
+
+- **The guaranteed floor is "never raw JSON," not "always a rich prompt."**
+ When only text leaks and the structured event never fired, the backstop can
+ reconstruct a *clean question prompt* but not choices the agent never emitted
+ structurally. Degrading to a free-text prompt on those paths is acceptable;
+ showing JSON is not.
+
+- **Rich choices are Pi-path only.** The legacy `claude` harness has no
+ extension mechanism, so it cannot emit structured options. The backstop keeps
+ it from leaking JSON, but its questions stay free-text.
+
+- **Questions stay in the agent thread, not the main chat timeline.** This
+ matches the existing `awaiting_input` routing; the prompt is modal to the
+ task, answered in place.
+
+## Actors
+
+- A1. **Agent** — running inside a session's DevPod, blocks on a question when
+ it needs a decision only the human can make.
+- A2. **Human collaborator** — sees the prompt in the agent thread and answers
+ it in place; their answer resumes the agent.
+
+## Key Flows
+
+- F1. **Structured typed question (happy path).**
+ - **Trigger:** the agent calls `ask_user` with a question and (optionally) a
+ kind and choices.
+ - The task enters `awaiting_input`; the thread renders the matching control
+ inline.
+ - The human answers in place (types, picks a button, confirms, or uses the
+ free-text fallback on a choice question).
+ - The answer is delivered back to the agent and the task resumes.
+
+- F2. **Leaked question (backstop path).**
+ - **Trigger:** a question reaches the client as assistant text shaped like a
+ tool call (legacy harness, failed extension, or model narration).
+ - The text is detected, the question string extracted, and a free-text prompt
+ rendered in place of the raw JSON.
+ - The human answers; the answer is routed back through the normal reply path.
+ - **Floor:** if extraction fails, the surfaced content is still a readable
+ prose (per R12), never a JSON blob.
+
+## Requirements
+
+**Prompt rendering & interaction**
+
+- R1. An agent question in `awaiting_input` renders as an interactive prompt in
+ the agent thread, with an affordance to answer it in place (no copy-pasting,
+ no hunting for a separate composer).
+- R2. For a free-text question, the prompt presents a text input and a send
+ action.
+- R3. For a pick-one question, the prompt presents the agent's options as
+ selectable buttons, plus a free-text "Other" fallback so the human is never
+ trapped by the offered set.
+- R4. For a confirm question, the prompt presents an affirm/decline control.
+- R5. Selecting or submitting an answer delivers it back to the waiting agent
+ and transitions the task out of `awaiting_input`.
+
+**Question data model**
+
+- R6. The `ask_user` capability supports an optional question *kind* (free-text
+ / pick-one / confirm) and, for choice kinds, an optional list of options.
+- R7. A question with no kind/options behaves as free-text — the change is
+ additive and backward-compatible with the current question-only shape.
+- R8. The kind and options survive end to end (agent → structured event →
+ thread render) without being flattened back into a text string.
+
+**Leak prevention (the no-JSON guarantee)**
+
+- R9. A question is never displayed to the user as raw JSON or as a literal
+ tool-call string, on any path.
+- R10. The structured-path failure modes that currently cause leaks are
+ hardened: a failed `ask_user` extension install is surfaced (loud /
+ recoverable) rather than silently dropped, which leaves the agent to narrate
+ the call as text.
+- R11. A backstop detects agent text shaped like an `ask_user` tool call,
+ extracts the question, and renders it as a free-text prompt (R2) instead of
+ the raw string.
+- R12. When the backstop cannot parse the leaked text into a question, the
+ surfaced content is still readable prose, never a JSON blob.
+
+## Acceptance Examples
+
+- AE1. **Covers R3.** Agent asks "Which framework?" with options
+ `[React, Vue, Svelte]`. The thread shows three buttons plus an "Other" field.
+ Picking "Vue" resumes the agent with "Vue"; typing "Solid" in Other resumes
+ it with "Solid".
+- AE2. **Covers R2, R7.** Agent asks a question with no kind or options. The
+ thread shows a single text field — identical to today's question semantics,
+ now interactive.
+- AE3. **Covers R9, R11.** A legacy-harness run emits
+ `Ask_user({"question":"What file should I create?"})` as text. The user sees
+ a free-text prompt asking "What file should I create?", not the JSON.
+- AE4. **Covers R12.** A malformed leak (truncated/garbled tool-call text) that
+ can't be parsed surfaces as readable text, not a JSON fragment.
+- AE5. **Covers R4.** Agent asks a confirm-kind question ("Proceed with the
+ force-push?"). The thread shows affirm/decline; declining resumes the agent
+ with a negative answer.
+
+## Scope Boundaries
+
+**Deferred for later**
+
+- Multi-question batches (asking several questions in one prompt) — the model
+ stays one question per `awaiting_input`.
+- Surfacing prompts in the main chat timeline as a distinct message type — they
+ remain scoped to the agent thread.
+
+**Outside this product's identity**
+
+- Full typed-prompt support inside the legacy `claude` harness — it has no
+ extension channel to carry structured choices. Its only guarantee is the
+ no-JSON backstop with free-text prompts. (If/when the legacy harness is
+ retired, this boundary dissolves.)
+
+## Dependencies / Assumptions
+
+- The Pi `ask_user` extension is the carrier for structured kind/options; this
+ assumes Pi's extension UI channel can convey option sets back through the
+ `extension_ui_request` event (the decoder already carries a `RequestKind`).
+- Assumes `ctx.hasUI` is true in Deuce's Pi RPC mode (the extension's happy
+ path calls `ctx.ui.input` directly); the headless `hasUI === false` branch
+ remains the correct behavior for genuinely non-interactive contexts.
+- Assumes the agent-thread drawer/card is the right and only surface for the
+ prompt (consistent with current `awaiting_input` routing).
+
+## Outstanding Questions
+
+**Deferred to planning**
+
+- Exact shape of the options payload through Pi's extension UI channel
+ (whether `ctx.ui` exposes a select/confirm primitive or whether options ride
+ inside the input request) — confirm against Pi's extension API during
+ planning.
+- How "loud / recoverable" extension-install failure should manifest
+ operationally (retry, surfaced session warning, or agent-disable) — a
+ reliability design choice for planning.
+- The precise detection boundary for the text backstop (which patterns count as
+ a leaked `ask_user` call without false-positiving on legitimate prose that
+ mentions the tool).
diff --git a/docs/plans/2026-06-08-001-feat-interactive-agent-questions-plan.md b/docs/plans/2026-06-08-001-feat-interactive-agent-questions-plan.md
new file mode 100644
index 0000000..17047f2
--- /dev/null
+++ b/docs/plans/2026-06-08-001-feat-interactive-agent-questions-plan.md
@@ -0,0 +1,495 @@
+---
+title: "feat: Interactive typed agent questions with no-JSON guarantee"
+type: feat
+status: completed
+date: 2026-06-08
+origin: docs/brainstorms/2026-06-08-interactive-agent-questions-requirements.md
+---
+
+# feat: Interactive typed agent questions with no-JSON guarantee
+
+## Summary
+
+Render agent questions as interactive typed prompts in the agent thread — a
+text field for open questions, selectable buttons when the agent offers
+choices, a confirm control for yes/no, each with a free-text "Other" fallback —
+and guarantee a question is never shown as raw JSON. Scoped entirely to the Pi
+harness; the legacy `claude` executor is being removed and gets no work here.
+
+---
+
+## Problem Frame
+
+Today an agent question can reach the user as a literal tool-call string in the
+thread, e.g. `Ask_user({"question":"What file would you like me to create?"})`.
+That is the structured question path *failing*: the intended pipeline is
+`extension_ui_request` → `KindAwaitingInput` → `task.pendingQuestion`, which the
+Super Thread drawer/card already consume. The JSON appears when that structured
+event never fires — the Pi `ask_user` extension install silently failed
+(`InstallPiExtension` returns `nil` on error), or the model narrated the call as
+assistant prose (which streams into the reply buffer and posts as a chat
+message).
+
+Two quality gaps stack: (1) even on the happy path the question renders as plain
+text with no input affordance and no way for the agent to offer choices
+(`AgentTaskCard.tsx` / `AgentThreadDrawer.tsx` print `task.pendingQuestion`
+verbatim), and (2) on the failure path the user sees raw JSON, which reads as a
+broken product. See origin: `docs/brainstorms/2026-06-08-interactive-agent-questions-requirements.md`.
+
+---
+
+## Requirements Traceability
+
+Carried from the origin requirements doc (R1–R12, A1–A2, F1–F2, AE1–AE5):
+
+- **Typed prompts (R1–R5, R6–R8):** interactive prompt per kind (free-text /
+ pick-one / confirm) with "Other" fallback; `ask_user` extended additively to
+ carry `kind` + `options`; kind/options survive end to end without flattening.
+- **No-JSON guarantee (R9–R12):** never display raw JSON; harden the silent
+ install failure (R10); backstop narrated tool-call text into a clean question
+ (R11) with a readable-prose floor when unparseable (R12).
+- **Actors:** A1 agent (asks), A2 human (answers in place).
+- **Flows:** F1 structured typed question; F2 leaked question → backstop.
+
+**Origin note — AE3 reframed.** The origin's AE3 described a *legacy-harness*
+leak. Since the `claude` harness is being removed, AE3 is covered here as a
+**Pi-path narration leak** (the model writes the call into its assistant text
+instead of invoking the tool). The "Outside this product's identity" origin
+boundary about legacy-harness typed prompts is moot — that harness is going
+away, not being supported at a lower tier.
+
+---
+
+## Key Technical Decisions
+
+- **Plumb `kind` + `options` through the existing `pendingQuestion` channel, not
+ a new event.** `task_awaiting_input` is part of the append-only, seq-ordered
+ AgentRunEvent family (KTD6) and the decoder already extracts `RequestKind`
+ (`decoder.go` `Event.RequestKind`). The gap is that `ws.TaskEventPayload` drops
+ it. Add `pendingQuestionKind` + `pendingQuestionOptions` fields mirroring
+ `pendingQuestion` exactly, additive and backward-compatible (a question with no
+ kind behaves as free-text, R7).
+
+- **Answers keep flowing through the existing steer → `ExtensionUIResponse`
+ path.** Clicking a choice or confirm sends the chosen value as the steer
+ message; `RouteOrEnqueue` already routes it to Pi as
+ `ExtensionUIResponse{id, response}` keyed by the tracked request id (KTD15),
+ under the per-`(session,agent)` lock (KTD9), clearing the awaiting-input
+ ceiling (KTD8). No new answer transport. `ExtensionUIResponse.Response` is
+ already `any`, so a string value needs no protocol change.
+
+- **The backstop lives in the Pi reply-finalize path, not a shared post path.**
+ With the legacy harness removed, the only remaining leak source is Pi-path
+ assistant-text narration, which accumulates via `appendReply` and is taken at
+ `takeReply` before `replyPoster` fires in `finalizeLocked`. Intercepting there
+ sanitizes a tool-call-shaped reply into a clean question before it ever posts.
+ Floor is clean readable text (the user answers via the normal composer), not a
+ synthesized interactive prompt — a narrated question means the agent is not
+ actually blocked on a structured request.
+
+- **Install-failure hardening is loud, not self-healing.** `InstallPiExtension`
+ escalates a failure to error-level logging and surfaces a session-visible
+ notice that the agent cannot ask questions, rather than retrying or disabling
+ the agent. Preserve the deliberate base64-over-the-wire encoding.
+
+- **Rich choices depend on verifying the live Pi `ctx.ui` API.** The
+ `@earendil-works/pi-coding-agent` types are not vendored locally, so whether
+ `ctx.ui.select` / `ctx.ui.confirm` exist is unverified. The extension verifies
+ at implementation time and falls back to `ctx.ui.input` with options rendered
+ into the prompt text when the richer primitives are absent — still no JSON,
+ still answerable. Keep the `ctx.hasUI === false` headless branch intact.
+
+---
+
+## High-Level Technical Design
+
+Two layers, one shared prompt surface. The happy path (F1) carries structured
+kind/options end to end; the backstop (F2) sanitizes a narrated leak before it
+reaches chat.
+
+```mermaid
+flowchart TD
+ subgraph Pi["Pi container"]
+ EXT["ask_user extension kind + options (U1)"]
+ end
+ EXT -->|extension_ui_request| DEC["decoder: RequestKind + options (U2)"]
+ DEC --> RT["runtime: SetAwaitingInput + setPending"]
+ RT -->|task_awaiting_input +kind +options| WS["ws.TaskEventPayload (U2)"]
+ WS --> RED["agent-runs reducer +kind +options (U3)"]
+ RED --> UI["drawer / card typed controls (U4)"]
+ UI -->|steer: chosen value| ROE["RouteOrEnqueue"]
+    ROE -->|"extension_ui_response{id,response}"| EXT
+
+ subgraph Leak["F2 — narration leak (Pi only)"]
+ TXT["assistant text narrates Ask_user(...)"] --> AR["appendReply → takeReply"]
+ AR --> BS{"backstop: tool-call-shaped? (U6)"}
+ BS -->|yes, parseable| CLEAN["post clean question text"]
+ BS -->|yes, unparseable| PROSE["post readable prose (floor)"]
+ BS -->|no| NORMAL["post reply unchanged"]
+ end
+
+ INST["InstallPiExtension fails → loud notice (U5)"] -.prevents.-> TXT
+```
+
+The diagram is authoritative alongside the prose below.
+
+---
+
+## Implementation Units
+
+### U1. Extend the `ask_user` Pi extension to carry kind + options
+
+**Goal:** Let the agent attach a question kind (free-text / pick-one / confirm)
+and, for choice kinds, a list of options — rendering the matching Pi UI
+primitive, with a graceful fallback when the richer primitive is unavailable.
+
+**Requirements:** R6, R7, R8. Advances F1.
+
+**Dependencies:** none.
+
+**Files:**
+- `server/internal/agent/pirun/extension/ask-user.ts` (modify)
+- (embed is automatic via `server/internal/agent/pirun/extension/embed.go` — no change unless a new file is added)
+
+**Approach:**
+- Add optional `kind` (`"input" | "select" | "confirm"`, default `input`) and
+ `options: string[]` parameters to the tool's typebox schema. Keep `question`
+ required. Additive — omitting kind/options preserves today's free-text
+ behavior (R7).
+- Dispatch on kind: `select` → `ctx.ui.select` (or equivalent) with options;
+ `confirm` → `ctx.ui.confirm`; `input`/default → existing `ctx.ui.input`.
+- **Verify the real `ctx.ui` surface against the live Pi API first** (see
+ Verification). If `select`/`confirm` are not exposed, fall back to
+ `ctx.ui.input` with the options enumerated in the prompt string and return the
+ raw typed answer — never emit JSON.
+- Return the chosen option / confirm result as `content: [{type:"text", text}]`,
+ matching the current contract.
+- Preserve the `!ctx.hasUI` headless branch unchanged.
+
+**Patterns to follow:** the current `registerTool` + typebox shape in the same
+file; the existing `ctx.ui.input` call and text-content return.
+
+**Test scenarios:**
+- Covers AE2. `ask_user` with no kind/options → behaves as free-text `input`,
+ identical request shape to today.
+- Covers AE1. `ask_user` with `kind:"select"`, options `[React, Vue, Svelte]` →
+ emits a select-style request carrying all three options.
+- Covers AE5. `ask_user` with `kind:"confirm"` → emits a confirm-style request.
+- Fallback: when the select/confirm primitive is unavailable, the tool still
+ returns a typed answer and never returns a JSON blob.
+- Headless: `ctx.hasUI === false` returns the "proceed on best judgment" text
+ for every kind.
+
+**Execution note:** Verify the Pi `ctx.ui` API shape before committing the
+dispatch — the primitive names are unverified locally.
+
+**Verification:** Run an agent in a real session; confirm a `select` question
+renders options and a `confirm` question renders yes/no, and that the answer
+returns to the agent as text. If primitives are missing, confirm the
+input-fallback path produces a clean (non-JSON) prompt.
+
+---
+
+### U2. Propagate kind + options through the Go event pipeline
+
+**Goal:** Carry the question kind and options from the decoded
+`extension_ui_request` through the runtime to the `task_awaiting_input` WS
+payload, mirroring the existing `pendingQuestion` plumbing.
+
+**Requirements:** R8. Advances F1, R3, R4.
+
+**Dependencies:** U1 (the extension must emit kind/options to decode).
+
+**Files:**
+- `server/internal/agent/pirun/decoder.go` (modify — extract options alongside the existing `RequestKind`)
+- `server/internal/ws/events.go` (modify — add `PendingQuestionKind`, `PendingQuestionOptions` to `TaskEventPayload`)
+- `server/internal/agent/runtime.go` (modify — pass kind/options into the `TypeTaskAwaitingInput` broadcast)
+- `server/internal/agent/pirun/decoder_test.go` (modify/add)
+
+**Approach:**
+- Extend `decodeUIRequest` to pull an options list from the request params
+ (best-effort per KTD2 — tolerate absence; normalize the unverified field names
+ during U1 verification). `Event` already carries `RequestKind`; add an
+ `Options []string` field.
+- Add `PendingQuestionKind string` and `PendingQuestionOptions []string`
+ (both `omitempty`) to `ws.TaskEventPayload`.
+- In `runtime.go` `translate` (the `KindAwaitingInput` case), include
+ `ev.RequestKind` and the options in the existing broadcast. No change to
+ `SetAwaitingInput` persistence semantics or the pending-request id tracking.
+
+**Patterns to follow:** the existing `PendingQuestion: ev.Prompt` flow in the
+`TypeTaskAwaitingInput` broadcast; `omitempty` JSON tags on `TaskEventPayload`.
+
+**Test scenarios:**
+- Decoder: an `extension_ui_request` with kind `select` + options decodes to
+ `Event{RequestKind:"select", Options:[...]}`.
+- Decoder (KTD2 tolerance): a request missing kind/options decodes to free-text
+ with empty options, no error.
+- Decoder: malformed/extra fields are tolerated, stream continues.
+- Payload: marshaled `TaskEventPayload` includes camelCase
+ `pendingQuestionKind` / `pendingQuestionOptions` only when present (omitempty).
+
+**Verification:** Unit tests pass; a live `select` question produces a
+`task_awaiting_input` WS frame carrying kind + options.
+
+---
+
+### U3. Carry kind + options in the frontend store and types
+
+**Goal:** Thread the new fields into the client `TaskEventPayload`/`AgentTask`
+types and the `agent-runs` reducer so they survive both the live-event and
+snapshot-replace paths.
+
+**Requirements:** R8. Advances R3, R4.
+
+**Dependencies:** U2 (wire fields exist).
+
+**Files:**
+- `src/types/index.ts` (modify — `TaskEventPayload` + `AgentTask`)
+- `src/stores/agent-runs.ts` (modify — `task_awaiting_input` reduction; keep `AGENT_RUN_EVENT_TYPES` in sync if touched)
+- `src/stores/agent-runs.test.ts` (new)
+- `package.json` (modify) / `vitest.config.ts` (new); together these establish a vitest runner
+
+**Approach:**
+- Add `pendingQuestionKind?: "input" | "select" | "confirm"` and
+ `pendingQuestionOptions?: string[]` to the client `TaskEventPayload` and
+ `AgentTask`.
+- In the `task_awaiting_input` reducer case, set the new fields next to
+ `pendingQuestion`. Ensure the snapshot-apply path (`applySnapshot`) carries
+ them too, so a snapshot refetch doesn't clobber them (the flicker window
+ noted in the residual findings).
+- Establish a vitest runner — the reducer is currently untested and no frontend
+ test runner exists. This is a prerequisite for testing the feature-bearing
+ reducer change, not adjacent cleanup.
+
+**Patterns to follow:** the existing `pendingQuestion` field on both types and
+its reducer assignment; the existing snapshot vs event reconcile logic.
+
+**Test scenarios:**
+- Covers AE1. `task_awaiting_input` with kind `select` + options reduces to an
+ `AgentTask` carrying both.
+- Covers AE2. `task_awaiting_input` with no kind reduces to free-text (kind
+ undefined), backward-compatible.
+- Snapshot path: applying a snapshot that contains an awaiting-input task
+ preserves kind/options (no clobber after a seq-gap refetch).
+- Event ordering: a later `task_started` for the same task clears the pending
+ fields as today.
+
+**Execution note:** Establish the vitest runner first, then add the reducer
+test (test-first for the new field reduction).
+
+**Verification:** `npx vitest run` passes; `npx tsc --noEmit` clean.
+
+---
+
+### U4. Render typed prompt controls and wire answers back
+
+**Goal:** Replace the plain-text `pendingQuestion` rendering with interactive
+controls — text field, choice buttons, or confirm — each with an "Other"
+free-text fallback, answered in place and routed through the existing steer
+path.
+
+**Requirements:** R1, R2, R3, R4, R5. Advances F1; A1, A2.
+
+**Dependencies:** U3 (store carries kind/options).
+
+**Files:**
+- `src/components/super-threads/AgentThreadDrawer.tsx` (modify — render controls; reuse the existing composer as the "Other" / free-text input)
+- `src/components/super-threads/AgentTaskCard.tsx` (modify — reflect kind in the inline summary)
+- `src/components/super-threads/ThreadDrawerPanel.tsx` (modify if the answer wiring needs the chosen value)
+- relevant CSS (e.g. the `q-pending-q` / `q-drawer-composer` styles)
+- `src/components/super-threads/AgentThreadDrawer.test.tsx` (new)
+
+**Approach:**
+- Branch the awaiting-input render on `pendingQuestionKind`:
+ - `input`/undefined → existing text field + send (R2).
+ - `select` → a button per option (R3); selecting one sends that option's text
+ via `steer` (the existing `onSend` → `steer` → `sendSteer` path). Keep the
+ composer visible as the "Other" fallback (R3).
+ - `confirm` → affirm / decline controls (R4); each sends its value via the
+ same path.
+- On submit/selection, reuse `steer(sessionId, agentId, value)` — no new
+ transport. The backend `RouteOrEnqueue` already turns it into
+ `ExtensionUIResponse` when the task is `awaiting_input` (R5), and the answer is
+ also posted as the human's chat message as today.
+- Confirm the exact value Pi expects for select/confirm answers during U1
+ verification (option text vs index; "yes"/"no" vs boolean) and send that.
+
+**Patterns to follow:** the existing composer (`val`/`send()`/`onSend`) in
+`AgentThreadDrawer.tsx`; the `awaiting_input` render branch in both components;
+`AgentAvatar` + agent color usage.
+
+**Test scenarios:**
+- Covers AE1. A `select` task renders three option buttons plus a free-text
+ field; clicking "Vue" calls `steer` with "Vue"; typing "Solid" in Other calls
+ `steer` with "Solid".
+- Covers AE2. A free-text task renders a single text field; submitting calls
+ `steer` with the typed value.
+- Covers AE5. A `confirm` task renders affirm/decline; declining calls `steer`
+ with the decline value.
+- Empty/whitespace input is not sendable (send disabled), matching today's behavior.
+- The card summary reflects the question across kinds without rendering JSON.
+
+**Verification:** Manual run of a real `select`/`confirm`/free-text question in
+a session — each renders the right control, answering resumes the agent, and the
+awaiting-input state clears.
+
+---
+
+### U5. Harden the silent extension-install failure
+
+**Goal:** Make a failed `ask_user` extension install loud and visible instead of
+silently leaving the agent unable to ask, which is what lets a narrated leak
+happen.
+
+**Requirements:** R10. Advances R9.
+
+**Dependencies:** none.
+
+**Files:**
+- `server/internal/workspace/manager.go` (`InstallPiExtension`, ~lines 592–616)
+- `server/internal/handler/workspace.go` and/or `server/internal/handler/sessions.go` (the `provisionAgentTools` call sites — surface the failure to the session)
+- relevant manager/handler test file
+
+**Approach:**
+- Escalate the install failure from `slog.Warn` to `slog.Error` and propagate
+ a signal to the caller so it can surface the failure (rather than swallowing
+ it to `nil`). Keep provisioning non-fatal for the *workspace* (the session
+ still comes up) but no longer silent.
+- Surface a session-visible notice that the agent cannot ask questions
+ (mechanism: a system/agent chat message or an existing notice channel — choose
+ the lightest existing surface during implementation).
+- Preserve the base64-over-the-wire install command (shell-quoting safety).
+
+**Patterns to follow:** the existing `logFn` warning in `InstallPiExtension`; how
+other provisioning failures are surfaced, if any; the residual-findings note
+about a shared `InstallPi`/`InstallTools` installer (coordinate but do not
+expand scope into that refactor).
+
+**Test scenarios:**
+- Install exec failure logs at error level and returns a non-nil signal (not
+ swallowed).
+- Workspace/session still reaches ready state on install failure (non-fatal
+ preserved).
+- The session receives a visible notice when the install fails.
+- Success path is unchanged and emits no spurious notice.
+
+**Verification:** Simulate an install failure (e.g. force the exec to fail) and
+confirm the error log + session notice appear and the agent does not silently
+narrate questions.
+
+---
+
+### U6. Backstop: sanitize narrated tool-call text before it posts
+
+**Goal:** Detect a reply that is shaped like an `ask_user` tool call and turn it
+into a clean question (or, if unparseable, readable prose) before it reaches
+chat — so a narrated question is never shown as raw JSON.
+
+**Requirements:** R9, R11, R12. Advances F2.
+
+**Dependencies:** none (independent of U1–U4).
+
+**Files:**
+- `server/internal/agent/runtime.go` (intercept in `finalizeLocked` between `takeReply` and `replyPoster`)
+- a small detector/sanitizer helper (new file under `server/internal/agent/`, e.g. `agent/question_backstop.go`)
+- corresponding `_test.go`
+
+**Approach:**
+- Add a detector that recognizes a reply whose content is dominated by an
+ `ask_user(...)` / `Ask_user({...})` tool-call shape (case-insensitive,
+ tolerant of whitespace/escaping).
+- When matched and the `question` value is extractable, replace the posted reply
+ with the clean question text (R11). The user answers via the normal composer;
+ this is the clean-text floor, not a synthesized interactive prompt.
+- When matched but unparseable, strip to readable prose — never post a JSON
+ fragment (R12).
+- When not matched, post the reply unchanged.
+- Scope the pattern narrowly to avoid false positives on prose that merely
+ mentions `ask_user` (e.g. require the call-shape to be the substantive content
+ of the reply, not an inline mention).
+
+**Patterns to follow:** the `takeReply` → `replyPoster` call sequence in
+`finalizeLocked`; existing helper/test layout in `server/internal/agent/`.
+
+**Test scenarios:**
+- Covers AE3 (reframed). A reply of
+ `Ask_user({"question":"What file should I create?"})` posts as the clean
+ question "What file should I create?", not the JSON.
+- Covers AE4. A truncated/garbled tool-call reply posts as readable prose, never
+ a JSON fragment.
+- Negative: a normal reply that mentions the words "ask_user" in a sentence is
+ posted unchanged (no false positive).
+- Negative: an empty reply / fallback "(agent finished without a text
+ response.)" path is unaffected.
+- A reply that is partly prose and partly a trailing tool-call shape extracts
+ the question without dropping meaningful prose (or posts clean text per the
+ chosen rule — pin the rule in the test).
+
+**Verification:** Unit tests pass; in a live session, force a narrated
+`ask_user` (e.g. before/without the extension) and confirm the chat shows a
+clean question, never JSON.
+
+---
+
+## Scope Boundaries
+
+**Deferred for later**
+- Multi-question batches (several questions in one prompt) — one question per
+ `awaiting_input`.
+- Surfacing prompts in the main chat timeline as a distinct message type — they
+ stay scoped to the agent thread (the U6 backstop's clean-text floor is the one
+ exception, and it is a sanitized chat message, not a thread prompt).
+
+**Outside this product's identity**
+- Any typed-prompt or backstop work inside the legacy `claude` executor — that
+ harness is being removed (see [[legacy-claude-harness-removal]] / origin). All
+ units here target the Pi path only.
+
+**Deferred to Follow-Up Work**
+- Extracting the shared `InstallPi` / `InstallTools` / `InstallPiExtension`
+ installer (residual-findings P2) — touch-adjacent to U5 but out of scope for
+ this plan unless it falls out naturally.
+
+---
+
+## Dependencies / Assumptions
+
+- **Pi `ctx.ui` API shape is unverified locally.** U1 must verify whether
+ `select`/`confirm` primitives exist; the fallback path keeps the feature
+ shippable either way.
+- **No frontend test runner exists yet.** U3 establishes vitest; U3/U4 test
+ scenarios depend on it.
+- **The answer transport is unchanged.** Choice/confirm answers ride the
+ existing `steer` WS path and resolve as `ExtensionUIResponse` keyed by the
+ tracked request id (KTD15), under the per-key lock (KTD9), clearing the
+ awaiting-input ceiling (KTD8). Verify the exact `response` value Pi expects for
+ select/confirm during U1.
+- **KTD constraints honored:** KTD2 (decoder tolerates drift), KTD6 (kind/options
+ stay in the AgentRunEvent seq family, not `session_update`), KTD14 (the answer
+ path stays membership-gated as today).
+
+---
+
+## Outstanding Questions
+
+**Deferred to Planning → resolved**
+- Backstop placement → Pi reply-finalize path (legacy harness removed).
+- Install-failure contract → loud (error log + session notice), non-fatal.
+
+**Deferred to Implementation**
+- Exact Pi `ctx.ui.select`/`confirm` request + response shapes (verify against
+ the live API in U1).
+- The precise lightest session-visible surface for the U5 install-failure notice.
+- The exact extraction rule for a mixed prose+tool-call reply in U6 (pin via
+ test).
+
+---
+
+## Sources & Research
+
+- Origin requirements: `docs/brainstorms/2026-06-08-interactive-agent-questions-requirements.md`
+- Pi harness KTDs: `docs/plans/2026-06-03-002-feat-pi-harness-integration-plan.md` (KTD15 ask-user mechanism, KTD8 ceiling, KTD9 lock, KTD6 event family)
+- Residual findings (test gaps, install dedup): `docs/residual-review-findings/feat-pi-harness-integration.md`
+- Answer-path trace: `server/internal/handler/websocket.go` (`handleSteer`), `server/internal/agent/runtime.go` (`RouteOrEnqueue`, `setPending`/`pendingRequest`/`clearPending`, `finalizeLocked`), `server/internal/agent/pirun/protocol.go` (`ExtensionUIResponse`)
+- Render sites: `src/components/super-threads/AgentThreadDrawer.tsx`, `AgentTaskCard.tsx`; reducer `src/stores/agent-runs.ts`; payload `server/internal/ws/events.go`
diff --git a/server/internal/agent/dbstore.go b/server/internal/agent/dbstore.go
index 98d4079..51164d4 100644
--- a/server/internal/agent/dbstore.go
+++ b/server/internal/agent/dbstore.go
@@ -135,13 +135,22 @@ func (s *DBStore) CompleteAction(ctx context.Context, sessionID, taskID, callID,
})
}
-func (s *DBStore) SetAwaitingInput(ctx context.Context, sessionID, taskID, question string) (int64, error) {
+func (s *DBStore) SetAwaitingInput(ctx context.Context, sessionID, taskID, question, kind string, options []string) (int64, error) {
tid, err := uuid.Parse(taskID)
if err != nil {
return 0, err
}
+ if options == nil {
+ options = []string{}
+ }
return s.withSeq(ctx, sessionID, func(q *db.Queries, seq int64) error {
- return q.SetTaskAwaitingInput(ctx, db.SetTaskAwaitingInputParams{ID: tid, PendingQuestion: question, Seq: seq})
+ return q.SetTaskAwaitingInput(ctx, db.SetTaskAwaitingInputParams{
+ ID: tid,
+ PendingQuestion: question,
+ PendingQuestionKind: kind,
+ PendingQuestionOptions: options,
+ Seq: seq,
+ })
})
}
diff --git a/server/internal/agent/pirun/decoder.go b/server/internal/agent/pirun/decoder.go
index a925157..a42ea54 100644
--- a/server/internal/agent/pirun/decoder.go
+++ b/server/internal/agent/pirun/decoder.go
@@ -60,6 +60,7 @@ type Event struct {
RequestID string
RequestKind string // select / confirm / input / editor
Prompt string
+ Options []string // choice labels for a select request (empty otherwise)
// Command reply (KindCommandReply).
Command string
@@ -191,13 +192,15 @@ func decodeUIRequest(line []byte) (Event, error) {
// extension (U12) lands; decode best-effort by id + common prompt/kind keys
// so the awaiting-input transition fires regardless of minor field naming.
var p struct {
- ID string `json:"id"`
- Method string `json:"method"`
- Kind string `json:"kind"`
- Prompt string `json:"prompt"`
- Params struct {
- Prompt string `json:"prompt"`
- Message string `json:"message"`
+ ID string `json:"id"`
+ Method string `json:"method"`
+ Kind string `json:"kind"`
+ Prompt string `json:"prompt"`
+ Options []string `json:"options"`
+ Params struct {
+ Prompt string `json:"prompt"`
+ Message string `json:"message"`
+ Options []string `json:"options"`
} `json:"params"`
}
if err := json.Unmarshal(line, &p); err != nil {
@@ -205,12 +208,17 @@ func decodeUIRequest(line []byte) (Event, error) {
}
prompt := firstNonEmpty(p.Prompt, p.Params.Prompt, p.Params.Message)
kind := firstNonEmpty(p.Kind, p.Method)
+ options := p.Options
+ if len(options) == 0 {
+ options = p.Params.Options
+ }
return Event{
Kind: KindAwaitingInput,
RawType: "extension_ui_request",
RequestID: p.ID,
RequestKind: kind,
Prompt: prompt,
+ Options: options,
}, nil
}
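Review note on the options fallback above: a minimal, standalone sketch of the "top-level first, then `params`" rule, assuming only the JSON shapes exercised in the decoder tests. `uiRequest` and `decodeOptions` are hypothetical names for illustration and are not part of the change.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// uiRequest is a trimmed, hypothetical stand-in for the anonymous struct in
// decodeUIRequest; it keeps only the two places options may appear.
type uiRequest struct {
	Options []string `json:"options"`
	Params  struct {
		Options []string `json:"options"`
	} `json:"params"`
}

// decodeOptions prefers top-level options and falls back to params.options
// when the top-level list is missing or empty.
func decodeOptions(line []byte) ([]string, error) {
	var p uiRequest
	if err := json.Unmarshal(line, &p); err != nil {
		return nil, err
	}
	if len(p.Options) > 0 {
		return p.Options, nil
	}
	return p.Params.Options, nil
}

func main() {
	lines := []string{
		`{"options":["React","Vue"]}`,               // top-level options
		`{"params":{"options":["a","b"]}}`,          // options under params
		`{"options":[],"params":{"options":["x"]}}`, // empty top-level falls through
		`{"prompt":"free text"}`,                    // no options at all
	}
	for _, line := range lines {
		opts, err := decodeOptions([]byte(line))
		fmt.Printf("%d option(s) %v err=%v\n", len(opts), opts, err)
	}
}
```

One consequence worth keeping in mind: a request with no options yields a nil slice here, so downstream code that needs a non-nil value has to normalize it, as `DBStore.SetAwaitingInput` does.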
diff --git a/server/internal/agent/pirun/decoder_test.go b/server/internal/agent/pirun/decoder_test.go
index fc77a94..f641dda 100644
--- a/server/internal/agent/pirun/decoder_test.go
+++ b/server/internal/agent/pirun/decoder_test.go
@@ -139,6 +139,34 @@ func TestDecodeExtensionUIRequest(t *testing.T) {
if ev.RequestID != "ui-7" || ev.RequestKind != "input" || ev.Prompt != "Which environment?" {
t.Errorf("ui request decoded as %+v", ev)
}
+ if len(ev.Options) != 0 {
+ t.Errorf("free-text request should carry no options, got %v", ev.Options)
+ }
+}
+
+func TestDecodeExtensionUIRequestSelectOptions(t *testing.T) {
+ // A select-kind request carries choice options; decode them best-effort
+ // whether they ride top-level or under params.
+ ev, err := Decode([]byte(`{"type":"extension_ui_request","id":"ui-9","kind":"select","prompt":"Which framework?","options":["React","Vue","Svelte"]}`))
+ if err != nil {
+ t.Fatalf("decode select request: %v", err)
+ }
+ if ev.Kind != KindAwaitingInput || ev.RequestKind != "select" {
+ t.Fatalf("kind=%q requestKind=%q, want awaiting_input/select", ev.Kind, ev.RequestKind)
+ }
+ if got := ev.Options; len(got) != 3 || got[0] != "React" || got[2] != "Svelte" {
+ t.Errorf("options = %v, want [React Vue Svelte]", got)
+ }
+}
+
+func TestDecodeExtensionUIRequestParamsOptions(t *testing.T) {
+ ev, err := Decode([]byte(`{"type":"extension_ui_request","id":"ui-10","method":"select","params":{"prompt":"Pick","options":["a","b"]}}`))
+ if err != nil {
+ t.Fatalf("decode: %v", err)
+ }
+ if len(ev.Options) != 2 || ev.Options[1] != "b" || ev.Prompt != "Pick" {
+ t.Errorf("params-options request decoded as %+v", ev)
+ }
}
func TestNormalizeTool(t *testing.T) {
diff --git a/server/internal/agent/pirun/extension/ask-user.ts b/server/internal/agent/pirun/extension/ask-user.ts
index 0914d4f..68bf53f 100644
--- a/server/internal/agent/pirun/extension/ask-user.ts
+++ b/server/internal/agent/pirun/extension/ask-user.ts
@@ -2,12 +2,21 @@
//
// Pi has no native "agent is waiting on the human" event (verified in the U1
// spike). This extension gives the agent a blocking `ask_user` tool: when the
-// agent calls it, ctx.ui.input emits an `extension_ui_request` on the RPC
+// agent calls it, a ctx.ui primitive emits an `extension_ui_request` on the RPC
// stdout stream and blocks until the client sends a matching
// `extension_ui_response`. The Deuce runtime maps that request to the task's
// `awaiting_input` state and routes the human's drawer reply back as the
// response (KTD15 / R7 / R16 / AE3).
//
+// The tool optionally carries a `kind` (free-text / pick-one / confirm) and,
+// for choice kinds, an `options` list, so the client can render a typed prompt
+// (text field / buttons / yes-no) instead of a bare text box. `kind`/`options`
+// are additive: omitting them preserves the original free-text behavior. The
+// richer ctx.ui primitives (select/confirm) are feature-detected at runtime —
+// when the running Pi build does not expose them, the tool falls back to
+// ctx.ui.input with the options enumerated in the prompt. Either way it returns
+// the answer as plain text and never emits raw JSON to the user.
+//
// Auto-discovered when placed at ~/.pi/agent/extensions/ in the container.
import type { ExtensionAPI } from "@earendil-works/pi-coding-agent";
@@ -21,11 +30,31 @@ export default function (pi: ExtensionAPI) {
"Ask the human a clarifying question and block until they answer. " +
"Use this whenever you are blocked on a decision only the user can make " +
"(ambiguous requirements, a risky action needing approval, missing " +
- "context) instead of guessing. Returns the user's answer as text.",
+ "context) instead of guessing. " +
+ "Set kind to 'select' and provide options when the answer is one of a " +
+ "small set of choices, or kind 'confirm' for a yes/no decision; omit " +
+ "kind for an open-ended text answer. Returns the user's answer as text.",
parameters: Type.Object({
question: Type.String({
description: "The question to ask the user, phrased clearly.",
}),
+ kind: Type.Optional(
+ Type.Union([
+ Type.Literal("input"),
+ Type.Literal("select"),
+ Type.Literal("confirm"),
+ ], {
+ description:
+ "How the user answers: 'input' (free text, default), 'select' " +
+ "(pick one of options), or 'confirm' (yes/no).",
+ }),
+ ),
+ options: Type.Optional(
+ Type.Array(Type.String(), {
+ description:
+ "Choices to offer when kind is 'select'. Ignored for other kinds.",
+ }),
+ ),
}),
async execute(toolCallId, params, signal, onUpdate, ctx) {
// In headless contexts with no UI channel, don't block forever — tell the
@@ -42,9 +71,51 @@ export default function (pi: ExtensionAPI) {
};
}
- const answer = await ctx.ui.input("A question for you", params.question);
+ const ui = ctx.ui as Record<string, unknown>;
+ const options = Array.isArray(params.options) ? params.options : [];
+ // Infer select when options were supplied without an explicit kind.
+ const kind =
+ params.kind ?? (options.length > 0 ? "select" : "input");
+
+ const text = (answer: unknown): string =>
+ answer == null ? "" : String(answer);
+
+ let answer: unknown;
+ if (kind === "select" && options.length > 0) {
+ if (typeof ui.select === "function") {
+ answer = await (ui.select as (
+ title: string,
+ prompt: string,
+ options: string[],
+ ) => Promise<unknown>)("A question for you", params.question, options);
+ } else {
+ // Fallback: enumerate the options in a text prompt. The answer is
+ // still plain text — never JSON.
+ const list = options.map((o, i) => `${i + 1}. ${o}`).join("\n");
+ answer = await ctx.ui.input(
+ "A question for you",
+ `${params.question}\n\nOptions:\n${list}`,
+ );
+ }
+ } else if (kind === "confirm") {
+ if (typeof ui.confirm === "function") {
+ const ok = await (ui.confirm as (
+ title: string,
+ prompt: string,
+ ) => Promise<boolean>)("A question for you", params.question);
+ answer = ok ? "yes" : "no";
+ } else {
+ answer = await ctx.ui.input(
+ "A question for you",
+ `${params.question} (yes/no)`,
+ );
+ }
+ } else {
+ answer = await ctx.ui.input("A question for you", params.question);
+ }
+
return {
- content: [{ type: "text", text: answer ?? "" }],
+ content: [{ type: "text", text: text(answer) }],
details: {},
};
},
diff --git a/server/internal/agent/question_backstop.go b/server/internal/agent/question_backstop.go
new file mode 100644
index 0000000..ebda23a
--- /dev/null
+++ b/server/internal/agent/question_backstop.go
@@ -0,0 +1,98 @@
+package agent
+
+import (
+ "encoding/json"
+ "regexp"
+ "strings"
+)
+
+// The no-JSON backstop (R9/R11/R12). When the structured extension_ui_request
+// never fires — the ask-user extension failed to install, or the model narrated
+// the call instead of invoking the tool — the question arrives as assistant text
+// shaped like `ask_user({"question":"..."})`. Left alone it posts to chat as raw
+// JSON, which reads as a broken product. sanitizeNarratedQuestion rewrites that
+// text into the plain question before it is persisted, broadcast, or posted.
+var (
+ // Leak shape: an ask_user call opening with a JSON object. Requires `({`
+ // so a bare prose mention of "ask_user" never matches. Catches truncated
+ // calls too (no closing required here — that's the AE4 floor).
+ askUserLeakRe = regexp.MustCompile(`(?is)ask_user\s*\(\s*\{`)
+ // A complete `ask_user({...})` call, captured so surrounding prose survives.
+ askUserCallRe = regexp.MustCompile(`(?is)ask_user\s*\(\s*\{.*\}\s*\)`)
+ // The JSON object within a matched call.
+ askUserObjRe = regexp.MustCompile(`(?s)\{.*\}`)
+ // Best-effort "question" value extraction when the object won't parse as
+ // JSON (handles a closed string even if the call braces are truncated).
+ askUserQuestionRe = regexp.MustCompile(`(?is)"question"\s*:\s*"((?:[^"\\]|\\.)*)"`)
+)
+
+const malformedQuestionFloor = "(The agent tried to ask you a question, but the request was malformed.)"
+
+// looksLikeNarratedQuestion reports whether reply contains an ask_user tool call
+// rendered as text rather than a normal prose reply.
+func looksLikeNarratedQuestion(reply string) bool {
+ return askUserLeakRe.MatchString(reply)
+}
+
+// sanitizeNarratedQuestion turns a narrated ask_user call into the plain
+// question text. A complete call is replaced in place so any surrounding prose
+// is preserved; a truncated/garbled call degrades to the extracted question or,
+// failing that, a readable placeholder — never a JSON fragment. A reply with no
+// leak shape is returned unchanged.
+func sanitizeNarratedQuestion(reply string) string {
+ if !looksLikeNarratedQuestion(reply) {
+ return reply
+ }
+ if askUserCallRe.MatchString(reply) {
+ out := strings.TrimSpace(askUserCallRe.ReplaceAllStringFunc(reply, replaceNarratedCall))
+ if out != "" {
+ return out
+ }
+ }
+ // Truncated / garbled call (no complete ({...}) to replace): salvage the
+ // question if a closed "question" string survives, else the floor.
+ if q := extractQuestionText(reply); q != "" {
+ return q
+ }
+ return malformedQuestionFloor
+}
+
+// replaceNarratedCall maps one complete ask_user(...) call to its question text,
+// or the floor when the object can't yield a question.
+func replaceNarratedCall(match string) string {
+ obj := askUserObjRe.FindString(match)
+ if obj != "" {
+ var parsed struct {
+ Question string `json:"question"`
+ }
+ if err := json.Unmarshal([]byte(obj), &parsed); err == nil {
+ if q := strings.TrimSpace(parsed.Question); q != "" {
+ return q
+ }
+ }
+ }
+ if q := extractQuestionText(match); q != "" {
+ return q
+ }
+ return malformedQuestionFloor
+}
+
+// extractQuestionText pulls a "question":"..." value out of arbitrary text by
+// regex (a fallback for malformed JSON) and unescapes it.
+func extractQuestionText(s string) string {
+ m := askUserQuestionRe.FindStringSubmatch(s)
+ if m == nil {
+ return ""
+ }
+ return strings.TrimSpace(unescapeJSONString(m[1]))
+}
+
+// unescapeJSONString decodes JSON string escapes (\n, \", \\, …) in a raw
+// captured value, falling back to the input when it isn't decodable.
+func unescapeJSONString(s string) string {
+ var out string
+ if err := json.Unmarshal([]byte(`"`+s+`"`), &out); err == nil {
+ return out
+ }
+ return s
+}
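Review note on `askUserCallRe`: the `.*` inside the call pattern is greedy, so a reply that contains two narrated calls is matched as one span running from the first `ask_user(` to the last `})`. The sketch below reproduces the pattern from the diff next to a lazy variant for comparison. `lazyCallRe` and `countCalls` are hypothetical and not part of the change.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Same pattern as askUserCallRe in the diff (greedy `.*`).
	greedyCallRe = regexp.MustCompile(`(?is)ask_user\s*\(\s*\{.*\}\s*\)`)
	// Hypothetical lazy variant: stops at the first `})` after each call opens.
	lazyCallRe = regexp.MustCompile(`(?is)ask_user\s*\(\s*\{.*?\}\s*\)`)
)

// countCalls reports how many separate call spans the pattern finds in reply.
func countCalls(re *regexp.Regexp, reply string) int {
	return len(re.FindAllString(reply, -1))
}

func main() {
	reply := `ask_user({"question":"A?"}) and ask_user({"question":"B?"})`
	fmt.Println("greedy spans:", countCalls(greedyCallRe, reply))
	fmt.Println("lazy spans:", countCalls(lazyCallRe, reply))
}
```

With the greedy pattern the sample reply is one span, while the lazy variant finds two. The lazy form has its own trade-off: a question whose text contains `})` would be cut short, so whichever rule is chosen is probably best pinned by a test case for a reply with two calls or trailing prose.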
diff --git a/server/internal/agent/question_backstop_test.go b/server/internal/agent/question_backstop_test.go
new file mode 100644
index 0000000..bbb637e
--- /dev/null
+++ b/server/internal/agent/question_backstop_test.go
@@ -0,0 +1,79 @@
+package agent
+
+import "testing"
+
+func TestSanitizeNarratedQuestion(t *testing.T) {
+ cases := []struct {
+ name string
+ reply string
+ want string
+ }{
+ {
+ // AE3: the canonical leak — a bare narrated call posts as the question.
+ name: "bare call",
+ reply: `Ask_user({"question":"What file should I create?"})`,
+ want: "What file should I create?",
+ },
+ {
+ // Lowercase tool name, the literal extension name.
+ name: "lowercase call",
+ reply: `ask_user({"question": "Which framework?"})`,
+ want: "Which framework?",
+ },
+ {
+ // Escaped newlines in the question are decoded, not shown raw.
+ name: "escaped multiline question",
+ reply: `Ask_user({"question":"What file would you like me to create? Please provide:\n1. The filename\n2. The content"})`,
+ want: "What file would you like me to create? Please provide:\n1. The filename\n2. The content",
+ },
+ {
+ // Leading prose is preserved; only the call shape is rewritten.
+ name: "prose then call",
+ reply: `Sure — let me check. ask_user({"question":"Which env?"})`,
+ want: "Sure — let me check. Which env?",
+ },
+ {
+ // AE4: truncated call still yields the question (closed string), no JSON.
+ name: "truncated but quoted",
+ reply: `ask_user({"question":"What file?"`,
+ want: "What file?",
+ },
+ }
+ for _, tc := range cases {
+ t.Run(tc.name, func(t *testing.T) {
+ if got := sanitizeNarratedQuestion(tc.reply); got != tc.want {
+ t.Errorf("sanitize(%q) = %q, want %q", tc.reply, got, tc.want)
+ }
+ })
+ }
+}
+
+func TestSanitizeNarratedQuestionFloor(t *testing.T) {
+ // AE4: a garbled call with no recoverable question degrades to readable
+ // prose, never a JSON fragment.
+ got := sanitizeNarratedQuestion(`ask_user({"questi`)
+ if got != malformedQuestionFloor {
+ t.Errorf("garbled call = %q, want the readable floor", got)
+ }
+ if containsJSONFragment(got) {
+ t.Errorf("floor should not contain a JSON fragment: %q", got)
+ }
+}
+
+func TestSanitizeNarratedQuestionPassthrough(t *testing.T) {
+ cases := []string{
+ "",
+ "(The agent finished without a text response.)",
+ "I considered using ask_user to confirm, but proceeded with the default.",
+ "Here is your summary: all tests pass and the build is green.",
+ }
+ for _, reply := range cases {
+ if got := sanitizeNarratedQuestion(reply); got != reply {
+ t.Errorf("passthrough reply changed: sanitize(%q) = %q", reply, got)
+ }
+ }
+}
+
+func containsJSONFragment(s string) bool {
+ return len(s) > 0 && (s[0] == '{' || s[len(s)-1] == '}')
+}
diff --git a/server/internal/agent/runtime.go b/server/internal/agent/runtime.go
index c28d3e3..3a0018c 100644
--- a/server/internal/agent/runtime.go
+++ b/server/internal/agent/runtime.go
@@ -328,7 +328,7 @@ func (r *Runtime) translate(key pirun.Key, ev pirun.Event) {
case pirun.KindAssistantText:
r.appendReply(taskID, ev.Text)
case pirun.KindAwaitingInput:
- seq, err := r.store.SetAwaitingInput(ctx, key.SessionID, taskID, ev.Prompt)
+ seq, err := r.store.SetAwaitingInput(ctx, key.SessionID, taskID, ev.Prompt, ev.RequestKind, ev.Options)
if err != nil {
slog.Error("runtime: set awaiting input", "task", taskID, "error", err)
return
@@ -336,7 +336,8 @@ func (r *Runtime) translate(key pirun.Key, ev pirun.Event) {
r.setPending(taskID, ev.RequestID)
r.enterAwaiting(key, taskID) // suspend active timeout, start ceiling (KTD8)
r.broadcastTask(ws.TypeTaskAwaitingInput, ws.TaskEventPayload{
- Seq: seq, TaskID: taskID, AgentID: key.AgentID, State: StateAwaitingInput, PendingQuestion: ev.Prompt,
+ Seq: seq, TaskID: taskID, AgentID: key.AgentID, State: StateAwaitingInput,
+ PendingQuestion: ev.Prompt, PendingQuestionKind: ev.RequestKind, PendingQuestionOptions: ev.Options,
}, key.SessionID)
case pirun.KindRunCompleted:
unlock := r.keys.lock(key)
@@ -389,6 +390,10 @@ func (r *Runtime) finalizeLocked(ctx context.Context, key pirun.Key, taskID, sta
if ok && isTerminal(cur) {
return // already terminal — second signal is a no-op (idempotent, KTD12)
}
+ // No-JSON backstop: if the agent narrated an ask_user tool call as text
+ // (the ask-user extension didn't fire), rewrite it to the plain question
+ // before it is persisted, broadcast, and posted to chat (R9/R11/R12).
+ reply = sanitizeNarratedQuestion(reply)
seq, err := r.store.FinishTask(ctx, key.SessionID, taskID, state, reply)
if err != nil {
slog.Error("runtime: finish task", "task", taskID, "state", state, "error", err)
diff --git a/server/internal/agent/runtime_test.go b/server/internal/agent/runtime_test.go
index 98ab8ec..dc54976 100644
--- a/server/internal/agent/runtime_test.go
+++ b/server/internal/agent/runtime_test.go
@@ -66,7 +66,7 @@ func (s *fakeStore) setState(sessionID, taskID, state string) int64 {
func (s *fakeStore) MarkRunning(_ context.Context, sessionID, taskID string) (int64, error) {
return s.setState(sessionID, taskID, StateRunning), nil
}
-func (s *fakeStore) SetAwaitingInput(_ context.Context, sessionID, taskID, _ string) (int64, error) {
+func (s *fakeStore) SetAwaitingInput(_ context.Context, sessionID, taskID, _, _ string, _ []string) (int64, error) {
return s.setState(sessionID, taskID, StateAwaitingInput), nil
}
func (s *fakeStore) ResolveAwaitingInput(_ context.Context, sessionID, taskID string) (int64, error) {
diff --git a/server/internal/agent/store.go b/server/internal/agent/store.go
index bafad31..8381040 100644
--- a/server/internal/agent/store.go
+++ b/server/internal/agent/store.go
@@ -27,8 +27,10 @@ type Store interface {
CompleteAction(ctx context.Context, sessionID, taskID, callID, text string, isError bool) (seq int64, err error)
// SetAwaitingInput transitions running→awaiting_input with the pending
- // question and returns the event seq.
- SetAwaitingInput(ctx context.Context, sessionID, taskID, question string) (seq int64, err error)
+ // question and returns the event seq. kind is the question type (input /
+ // select / confirm; empty means free-text input) and options are the choice
+ // labels for a select question (nil otherwise).
+ SetAwaitingInput(ctx context.Context, sessionID, taskID, question, kind string, options []string) (seq int64, err error)
// ResolveAwaitingInput transitions awaiting_input→running and returns the seq.
ResolveAwaitingInput(ctx context.Context, sessionID, taskID string) (seq int64, err error)
diff --git a/server/internal/db/migrations/012_task_question_kind.sql b/server/internal/db/migrations/012_task_question_kind.sql
new file mode 100644
index 0000000..0051a07
--- /dev/null
+++ b/server/internal/db/migrations/012_task_question_kind.sql
@@ -0,0 +1,15 @@
+-- +goose Up
+
+-- Typed agent questions: a question carries a kind (free-text / pick-one /
+-- confirm) and, for pick-one, the offered options, so the client can render a
+-- typed prompt instead of a bare text box. Persisted alongside pending_question
+-- so a snapshot refetch (seq-gap reconcile, reconnect) reconstructs the typed
+-- prompt rather than degrading it to free text. Empty kind ('') means free-text
+-- input — the backward-compatible default for questions that predate this
+-- column or omit the kind.
+ALTER TABLE tasks ADD COLUMN pending_question_kind TEXT NOT NULL DEFAULT '';
+ALTER TABLE tasks ADD COLUMN pending_question_options TEXT[] NOT NULL DEFAULT '{}';
+
+-- +goose Down
+ALTER TABLE tasks DROP COLUMN pending_question_options;
+ALTER TABLE tasks DROP COLUMN pending_question_kind;
\ No newline at end of file
diff --git a/server/internal/db/models.go b/server/internal/db/models.go
index 646657b..ed895a2 100644
--- a/server/internal/db/models.go
+++ b/server/internal/db/models.go
@@ -88,19 +88,21 @@ type SessionMember struct {
}
type Task struct {
- ID uuid.UUID `json:"id"`
- SessionID uuid.UUID `json:"session_id"`
- AgentID uuid.UUID `json:"agent_id"`
- RequestedBy pgtype.UUID `json:"requested_by"`
- AnchorMessageID pgtype.UUID `json:"anchor_message_id"`
- Prompt string `json:"prompt"`
- State string `json:"state"`
- Seq int64 `json:"seq"`
- PendingQuestion string `json:"pending_question"`
- Reply string `json:"reply"`
- Work []byte `json:"work"`
- CreatedAt time.Time `json:"created_at"`
- UpdatedAt time.Time `json:"updated_at"`
+ ID uuid.UUID `json:"id"`
+ SessionID uuid.UUID `json:"session_id"`
+ AgentID uuid.UUID `json:"agent_id"`
+ RequestedBy pgtype.UUID `json:"requested_by"`
+ AnchorMessageID pgtype.UUID `json:"anchor_message_id"`
+ Prompt string `json:"prompt"`
+ State string `json:"state"`
+ Seq int64 `json:"seq"`
+ PendingQuestion string `json:"pending_question"`
+ Reply string `json:"reply"`
+ Work []byte `json:"work"`
+ CreatedAt time.Time `json:"created_at"`
+ UpdatedAt time.Time `json:"updated_at"`
+ PendingQuestionKind string `json:"pending_question_kind"`
+ PendingQuestionOptions []string `json:"pending_question_options"`
}
type TaskAction struct {
diff --git a/server/internal/db/queries/tasks.sql b/server/internal/db/queries/tasks.sql
index 240b519..d21cf09 100644
--- a/server/internal/db/queries/tasks.sql
+++ b/server/internal/db/queries/tasks.sql
@@ -25,13 +25,13 @@ SELECT * FROM tasks WHERE id = $1;
UPDATE tasks SET state = $2, seq = $3, updated_at = now() WHERE id = $1;
-- name: SetTaskAwaitingInput :exec
-UPDATE tasks SET state = 'awaiting_input', pending_question = $2, seq = $3, updated_at = now() WHERE id = $1;
+UPDATE tasks SET state = 'awaiting_input', pending_question = $2, pending_question_kind = $3, pending_question_options = $4, seq = $5, updated_at = now() WHERE id = $1;
-- name: ResolveTaskInput :exec
-UPDATE tasks SET state = 'running', pending_question = '', seq = $2, updated_at = now() WHERE id = $1;
+UPDATE tasks SET state = 'running', pending_question = '', pending_question_kind = '', pending_question_options = '{}', seq = $2, updated_at = now() WHERE id = $1;
-- name: FinishTask :exec
-UPDATE tasks SET state = $2, reply = $3, work = $4, pending_question = '', seq = $5, updated_at = now() WHERE id = $1;
+UPDATE tasks SET state = $2, reply = $3, work = $4, pending_question = '', pending_question_kind = '', pending_question_options = '{}', seq = $5, updated_at = now() WHERE id = $1;
-- name: AppendAction :exec
-- Idempotent on (task_id, call_id): a replayed tool-start after re-attach is a
diff --git a/server/internal/db/tasks.sql.go b/server/internal/db/tasks.sql.go
index cab52bb..2a5201a 100644
--- a/server/internal/db/tasks.sql.go
+++ b/server/internal/db/tasks.sql.go
@@ -110,7 +110,7 @@ func (q *Queries) CompleteAction(ctx context.Context, arg CompleteActionParams)
const createTask = `-- name: CreateTask :one
INSERT INTO tasks (session_id, agent_id, requested_by, anchor_message_id, prompt, state, seq)
VALUES ($1, $2, $3, $4, $5, $6, $7)
-RETURNING id, session_id, agent_id, requested_by, anchor_message_id, prompt, state, seq, pending_question, reply, work, created_at, updated_at
+RETURNING id, session_id, agent_id, requested_by, anchor_message_id, prompt, state, seq, pending_question, reply, work, created_at, updated_at, pending_question_kind, pending_question_options
`
type CreateTaskParams struct {
@@ -148,6 +148,8 @@ func (q *Queries) CreateTask(ctx context.Context, arg CreateTaskParams) (Task, e
&i.Work,
&i.CreatedAt,
&i.UpdatedAt,
+ &i.PendingQuestionKind,
+ &i.PendingQuestionOptions,
)
return i, err
}
@@ -164,7 +166,7 @@ func (q *Queries) FailStuckTasks(ctx context.Context) error {
}
const finishTask = `-- name: FinishTask :exec
-UPDATE tasks SET state = $2, reply = $3, work = $4, pending_question = '', seq = $5, updated_at = now() WHERE id = $1
+UPDATE tasks SET state = $2, reply = $3, work = $4, pending_question = '', pending_question_kind = '', pending_question_options = '{}', seq = $5, updated_at = now() WHERE id = $1
`
type FinishTaskParams struct {
@@ -214,7 +216,7 @@ func (q *Queries) GetPiSessionID(ctx context.Context, arg GetPiSessionIDParams)
}
const getTask = `-- name: GetTask :one
-SELECT id, session_id, agent_id, requested_by, anchor_message_id, prompt, state, seq, pending_question, reply, work, created_at, updated_at FROM tasks WHERE id = $1
+SELECT id, session_id, agent_id, requested_by, anchor_message_id, prompt, state, seq, pending_question, reply, work, created_at, updated_at, pending_question_kind, pending_question_options FROM tasks WHERE id = $1
`
func (q *Queries) GetTask(ctx context.Context, id uuid.UUID) (Task, error) {
@@ -234,6 +236,8 @@ func (q *Queries) GetTask(ctx context.Context, id uuid.UUID) (Task, error) {
&i.Work,
&i.CreatedAt,
&i.UpdatedAt,
+ &i.PendingQuestionKind,
+ &i.PendingQuestionOptions,
)
return i, err
}
@@ -258,7 +262,7 @@ func (q *Queries) IsSessionMember(ctx context.Context, arg IsSessionMemberParams
}
const listAgentTasks = `-- name: ListAgentTasks :many
-SELECT id, session_id, agent_id, requested_by, anchor_message_id, prompt, state, seq, pending_question, reply, work, created_at, updated_at FROM tasks WHERE session_id = $1 AND agent_id = $2 ORDER BY created_at ASC
+SELECT id, session_id, agent_id, requested_by, anchor_message_id, prompt, state, seq, pending_question, reply, work, created_at, updated_at, pending_question_kind, pending_question_options FROM tasks WHERE session_id = $1 AND agent_id = $2 ORDER BY created_at ASC
`
type ListAgentTasksParams struct {
@@ -291,6 +295,8 @@ func (q *Queries) ListAgentTasks(ctx context.Context, arg ListAgentTasksParams)
&i.Work,
&i.CreatedAt,
&i.UpdatedAt,
+ &i.PendingQuestionKind,
+ &i.PendingQuestionOptions,
); err != nil {
return nil, err
}
@@ -346,7 +352,7 @@ func (q *Queries) ListSessionTaskActions(ctx context.Context, sessionID uuid.UUI
}
const listSessionTasks = `-- name: ListSessionTasks :many
-SELECT id, session_id, agent_id, requested_by, anchor_message_id, prompt, state, seq, pending_question, reply, work, created_at, updated_at FROM tasks WHERE session_id = $1 ORDER BY created_at ASC
+SELECT id, session_id, agent_id, requested_by, anchor_message_id, prompt, state, seq, pending_question, reply, work, created_at, updated_at, pending_question_kind, pending_question_options FROM tasks WHERE session_id = $1 ORDER BY created_at ASC
`
func (q *Queries) ListSessionTasks(ctx context.Context, sessionID uuid.UUID) ([]Task, error) {
@@ -372,6 +378,8 @@ func (q *Queries) ListSessionTasks(ctx context.Context, sessionID uuid.UUID) ([]
&i.Work,
&i.CreatedAt,
&i.UpdatedAt,
+ &i.PendingQuestionKind,
+ &i.PendingQuestionOptions,
); err != nil {
return nil, err
}
@@ -435,7 +443,7 @@ func (q *Queries) PeekEventSeq(ctx context.Context, sessionID uuid.UUID) (int64,
}
const resolveTaskInput = `-- name: ResolveTaskInput :exec
-UPDATE tasks SET state = 'running', pending_question = '', seq = $2, updated_at = now() WHERE id = $1
+UPDATE tasks SET state = 'running', pending_question = '', pending_question_kind = '', pending_question_options = '{}', seq = $2, updated_at = now() WHERE id = $1
`
type ResolveTaskInputParams struct {
@@ -449,17 +457,25 @@ func (q *Queries) ResolveTaskInput(ctx context.Context, arg ResolveTaskInputPara
}
const setTaskAwaitingInput = `-- name: SetTaskAwaitingInput :exec
-UPDATE tasks SET state = 'awaiting_input', pending_question = $2, seq = $3, updated_at = now() WHERE id = $1
+UPDATE tasks SET state = 'awaiting_input', pending_question = $2, pending_question_kind = $3, pending_question_options = $4, seq = $5, updated_at = now() WHERE id = $1
`
type SetTaskAwaitingInputParams struct {
- ID uuid.UUID `json:"id"`
- PendingQuestion string `json:"pending_question"`
- Seq int64 `json:"seq"`
+ ID uuid.UUID `json:"id"`
+ PendingQuestion string `json:"pending_question"`
+ PendingQuestionKind string `json:"pending_question_kind"`
+ PendingQuestionOptions []string `json:"pending_question_options"`
+ Seq int64 `json:"seq"`
}
func (q *Queries) SetTaskAwaitingInput(ctx context.Context, arg SetTaskAwaitingInputParams) error {
- _, err := q.db.Exec(ctx, setTaskAwaitingInput, arg.ID, arg.PendingQuestion, arg.Seq)
+ _, err := q.db.Exec(ctx, setTaskAwaitingInput,
+ arg.ID,
+ arg.PendingQuestion,
+ arg.PendingQuestionKind,
+ arg.PendingQuestionOptions,
+ arg.Seq,
+ )
return err
}
diff --git a/server/internal/handler/agent_run.go b/server/internal/handler/agent_run.go
index 754d2db..d4ad737 100644
--- a/server/internal/handler/agent_run.go
+++ b/server/internal/handler/agent_run.go
@@ -26,17 +26,19 @@ type agentActionResp struct {
}
type agentTaskResp struct {
- ID string `json:"id"`
- SessionID string `json:"sessionId"`
- AgentID string `json:"agentId"`
- RequestedBy string `json:"requestedBy,omitempty"`
- AnchorMessageID string `json:"anchorMessageId,omitempty"`
- Prompt string `json:"prompt"`
- State string `json:"state"`
- Seq int64 `json:"seq"`
- PendingQuestion string `json:"pendingQuestion,omitempty"`
- Reply string `json:"reply,omitempty"`
- Actions []agentActionResp `json:"actions"`
+ ID string `json:"id"`
+ SessionID string `json:"sessionId"`
+ AgentID string `json:"agentId"`
+ RequestedBy string `json:"requestedBy,omitempty"`
+ AnchorMessageID string `json:"anchorMessageId,omitempty"`
+ Prompt string `json:"prompt"`
+ State string `json:"state"`
+ Seq int64 `json:"seq"`
+ PendingQuestion string `json:"pendingQuestion,omitempty"`
+ PendingQuestionKind string `json:"pendingQuestionKind,omitempty"`
+ PendingQuestionOptions []string `json:"pendingQuestionOptions,omitempty"`
+ Reply string `json:"reply,omitempty"`
+ Actions []agentActionResp `json:"actions"`
}
type agentRunSnapshotResp struct {
@@ -123,7 +125,10 @@ func buildSnapshot(tasks []db.Task, actions []db.TaskAction) agentRunSnapshotRes
RequestedBy: uuidStr(t.RequestedBy.Bytes, t.RequestedBy.Valid),
AnchorMessageID: uuidStr(t.AnchorMessageID.Bytes, t.AnchorMessageID.Valid),
Prompt: t.Prompt, State: t.State, Seq: t.Seq,
- PendingQuestion: t.PendingQuestion, Reply: t.Reply, Actions: acts,
+ PendingQuestion: t.PendingQuestion,
+ PendingQuestionKind: t.PendingQuestionKind,
+ PendingQuestionOptions: t.PendingQuestionOptions,
+ Reply: t.Reply, Actions: acts,
})
}
resp.LatestSeq = latest
diff --git a/server/internal/handler/workspace.go b/server/internal/handler/workspace.go
index 82ffd37..f75e8b3 100644
--- a/server/internal/handler/workspace.go
+++ b/server/internal/handler/workspace.go
@@ -27,7 +27,10 @@ func (h *Handler) provisionAgentTools(ctx context.Context, workspaceID string, l
slog.Warn("pi installation failed", "workspace", workspaceID, "error", err)
}
if err := h.workspaces.InstallPiExtension(ctx, workspaceID, extension.AskUserFilename, extension.AskUser, logFn); err != nil {
- slog.Warn("pi extension installation failed", "workspace", workspaceID, "error", err)
+ // Loud, not fatal: the workspace still comes up, but the user has been
+ // told via logFn that agents can't ask questions here (R10). Error level
+ // so it stands out from routine provisioning warnings.
+ slog.Error("pi extension installation failed", "workspace", workspaceID, "error", err)
}
}
diff --git a/server/internal/workspace/manager.go b/server/internal/workspace/manager.go
index f3d5e1f..9687e8a 100644
--- a/server/internal/workspace/manager.go
+++ b/server/internal/workspace/manager.go
@@ -591,9 +591,14 @@ func (m *Manager) symlinkPi(ctx context.Context, workspaceID string) {
// InstallPiExtension writes a Pi extension file to the container's auto-discovery
// path (~/.pi/agent/extensions/) so Pi loads it on launch. Content is
-// base64-encoded over the wire to avoid any shell-quoting hazards. Non-fatal:
-// without the extension, agents simply lose the ask-user (awaiting-input)
-// capability rather than failing the workspace.
+// base64-encoded over the wire to avoid any shell-quoting hazards.
+//
+// A failure is loud, not silent: without the ask-user extension Pi has no way to
+// surface a blocking question, so the agent narrates the call as plain text
+// (raw `ask_user(...)` in the chat) or guesses instead of asking. The failure is
+// logged at error level and surfaced to the user through logFn with the concrete
+// consequence, and the error is returned so the caller can react — but
+// provisioning stays non-fatal at the call site (the workspace still comes up).
func (m *Manager) InstallPiExtension(ctx context.Context, workspaceID, filename, content string, logFn LogFunc) error {
encoded := base64.StdEncoding.EncodeToString([]byte(content))
cmd := fmt.Sprintf(
@@ -603,10 +608,10 @@ func (m *Manager) InstallPiExtension(ctx context.Context, workspaceID, filename,
out, err := m.ExecInWorkspace(ctx, workspaceID, cmd).CombinedOutput()
if err != nil {
if logFn != nil {
- logFn("WARNING: failed to install Pi ask-user extension")
+ logFn("ERROR: failed to install the Pi ask-user extension — agents in this session cannot ask you questions and may proceed on assumptions instead. Rebuild the workspace to retry.")
}
- slog.Warn("pi extension install failed", "workspace", workspaceID, "error", err, "output", strings.TrimSpace(string(out)))
- return nil // Non-fatal
+ slog.Error("pi extension install failed", "workspace", workspaceID, "error", err, "output", strings.TrimSpace(string(out)))
+ return fmt.Errorf("install pi ask-user extension %q: %w", filename, err)
}
if logFn != nil {
logFn(fmt.Sprintf("Pi extension installed: %s", filename))
diff --git a/server/internal/ws/events.go b/server/internal/ws/events.go
index 4bed373..ba9f9df 100644
--- a/server/internal/ws/events.go
+++ b/server/internal/ws/events.go
@@ -61,8 +61,12 @@ type TaskEventPayload struct {
State string `json:"state,omitempty"`
Position int `json:"position,omitempty"` // queue #N for queued tasks
PendingQuestion string `json:"pendingQuestion,omitempty"` // awaiting_input
- Reply string `json:"reply,omitempty"` // completed
- Status string `json:"status,omitempty"` // completed: done|failed|cancelled
+ // Typed-question metadata (awaiting_input): kind is input|select|confirm
+ // (empty means free-text input); options are the choice labels for a select.
+ PendingQuestionKind string `json:"pendingQuestionKind,omitempty"`
+ PendingQuestionOptions []string `json:"pendingQuestionOptions,omitempty"`
+ Reply string `json:"reply,omitempty"` // completed
+ Status string `json:"status,omitempty"` // completed: done|failed|cancelled
}
// ActionEventPayload is the JSON payload for action_started / action_completed.
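For reviewers checking wire compatibility, the effect of `omitempty` on the two new fields can be sketched as follows. This assumes the payload is serialized with `encoding/json`; `taskEventPayload` is a trimmed, hypothetical copy of `TaskEventPayload` and `encode` is a helper invented for the illustration.

```go
package main

import "encoding/json"

// taskEventPayload is a trimmed, hypothetical copy of TaskEventPayload that
// keeps only the fields relevant to an awaiting_input event.
type taskEventPayload struct {
	State                  string   `json:"state,omitempty"`
	PendingQuestion        string   `json:"pendingQuestion,omitempty"`
	PendingQuestionKind    string   `json:"pendingQuestionKind,omitempty"`
	PendingQuestionOptions []string `json:"pendingQuestionOptions,omitempty"`
}

// encode marshals the payload and returns it as a string ("" on error).
func encode(p taskEventPayload) string {
	b, err := json.Marshal(p)
	if err != nil {
		return ""
	}
	return string(b)
}
```

With an empty kind and nil (or empty) options, the encoded event carries only `state` and `pendingQuestion`, so a free-text question looks the same on the wire as it did before the new fields were added. A select question adds `pendingQuestionKind` and `pendingQuestionOptions`.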
diff --git a/src/components/super-threads/AgentThreadDrawer.tsx b/src/components/super-threads/AgentThreadDrawer.tsx
index 0ffd08b..d674105 100644
--- a/src/components/super-threads/AgentThreadDrawer.tsx
+++ b/src/components/super-threads/AgentThreadDrawer.tsx
@@ -51,16 +51,67 @@ function RequesterAvatar({ user }: { user?: Pick }) {
);
}
+// QuestionControls renders the typed-prompt affordances for an awaiting-input
+// task: choice buttons for a select question, yes/no for a confirm, and nothing
+// extra for free text (the drawer composer below is the text input, and also
+// serves as the "Other" fallback for a select). Answering routes through the
+// same steer path as a typed reply (onAnswer → onSend → steer).
+function QuestionControls({
+ task,
+ onAnswer,
+}: {
+ task: AgentTask;
+ onAnswer: (message: string) => void;
+}) {
+ const kind = task.pendingQuestionKind;
+ const options = task.pendingQuestionOptions ?? [];
+
+ if (kind === "select" && options.length > 0) {
+ return (
+
+ {options.map((opt) => (
+
+ ))}
+ or type another answer below
+