feat(runtime): bridge user-input events and API to external GUI clients #2133

Open
gaord wants to merge 5 commits into Hmbown:main from gaord:fix/user-input-runtime-bridge

@gaord gaord commented May 25, 2026

Summary

The TUI engine already emits EngineEvent::UserInputRequired and the Engine handle exposes submit_user_input / cancel_user_input, but the runtime API layer (used by external GUI clients like VSCode extensions) was missing the plumbing to propagate these events or accept responses. This meant request_user_input tool calls would hang indefinitely in GUI mode with no dialog appearing.

Changes

1. approval.rs — Timeout protection for await_user_input

  • Added a 5-minute timeout (USER_INPUT_TIMEOUT) around the rx_user_input.recv() call
  • On timeout, the engine emits a Status event and returns a ToolError
  • Prevents a disconnected GUI from stalling the agent loop forever

2. runtime_threads.rs — Event forwarding + manager methods + interrupt fix

  • Event forwarding: Handle EngineEvent::UserInputRequired in the event loop, emitting a "user_input.required" SSE event with the input ID and full request payload (questions, options)
  • Manager methods: Added submit_user_input() and cancel_user_input() on RuntimeThreadManager that delegate to the Engine handle
  • Interrupt fix: interrupt_turn() now immediately clears active_turn and emits "turn.completed" so the thread accepts new messages after /interrupt — this prevents persistent 409 "Thread already has an active turn" errors

3. runtime_api.rs — REST endpoint for user input submission

  • Added POST /v1/user-input/{thread_id}/{input_id} endpoint
  • Accepts { "answers": [{ "id": "...", "label": "...", "value": "..." }] }
  • Delivers responses to the engine via RuntimeThreadManager::submit_user_input()

Flow

Tool calls request_user_input
  → approval.rs sends Event::UserInputRequired
  → runtime_threads.rs forwards as SSE "user_input.required"
  → GUI displays dialog with option buttons
  → GUI calls POST /v1/user-input/{thread_id}/{input_id}
  → runtime_api.rs delivers to engine via channel
  → approval.rs receives answer, returns to tool
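
The round trip above can be sketched with plain standard-library channels. This is only an illustration of the event-out, answer-in shape, not the PR's code: the thread standing in for the GUI, and the names `UserInputRequired`, `Submitted`, `run_gui`, and `request_user_input`, are assumptions made for the example, and the real flow goes through SSE and HTTP rather than in-process channels.

```rust
use std::sync::mpsc;
use std::thread;

/// Stand-in for the outbound "user_input.required" event (hypothetical shape).
#[derive(Debug)]
struct UserInputRequired {
    id: String,
    options: Vec<String>,
}

/// Stand-in for the answer that comes back through the REST endpoint.
#[derive(Debug, PartialEq)]
struct Submitted {
    id: String,
    value: String,
}

/// Plays the GUI: receives one event, picks the first option, sends it back.
fn run_gui(events: mpsc::Receiver<UserInputRequired>, answers: mpsc::Sender<Submitted>) {
    if let Ok(event) = events.recv() {
        let value = event.options.first().cloned().unwrap_or_default();
        let _ = answers.send(Submitted { id: event.id, value });
    }
}

/// Plays the engine: emits the request, then waits for the answer with the same id.
fn request_user_input(id: &str, options: &[&str]) -> Option<Submitted> {
    let (tx_event, rx_event) = mpsc::channel();
    let (tx_answer, rx_answer) = mpsc::channel();
    let gui = thread::spawn(move || run_gui(rx_event, tx_answer));

    tx_event
        .send(UserInputRequired {
            id: id.to_string(),
            options: options.iter().map(|s| s.to_string()).collect(),
        })
        .ok()?;

    // Answers for other ids are skipped; iteration ends when the GUI side hangs up.
    let answer = rx_answer.iter().find(|a| a.id == id);
    gui.join().ok()?;
    answer
}
```

Calling `request_user_input("input-1", &["yes", "no"])` returns the answer whose `id` matches the request, which is the same matching rule the engine applies to `tool_id`.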

Testing

  • cargo check -p codewhale-tui passes
  • cargo clippy -p codewhale-tui --all-targets passes (no new warnings)
  • Verified end-to-end with VSCode extension (CodeWhale VSCode) that user input dialogs appear and responses are correctly delivered

Related

  • Mirrors the existing approval flow (approval.required SSE + POST /v1/approvals/{id}) that already works for tool authorization dialogs

Greptile Summary

This PR wires the existing request_user_input tool through the runtime API layer so external GUI clients can display dialogs and submit responses, mirroring the existing approval flow. It also adds a 5-minute timeout to await_user_input to prevent a disconnected client from stalling the agent loop.

  • approval.rs: Wraps rx_user_input.recv() with tokio::time::timeout; however, the timeout future is re-constructed inside the loop body, so any mismatched decision resets the 5-minute clock rather than counting against a single deadline.
  • runtime_api.rs: Adds POST /v1/user-input/{thread_id}/{input_id} for GUI response delivery, but unlike the approval endpoint it always returns 200 OK — there is no 404 when the input_id is stale or wrong, leaving callers unable to detect a failed delivery.
  • runtime_threads.rs: Forwards EngineEvent::UserInputRequired as an SSE event and adds submit_user_input/cancel_user_input on RuntimeThreadManager; the cancel path is currently dead code with no REST endpoint.

Confidence Score: 3/5

The core event-forwarding and SSE plumbing is straightforward, but two correctness issues in the new code paths could cause the timeout protection to be bypassed and leave GUI clients unable to detect failed deliveries.

The timeout in await_user_input is re-created on every loop iteration, so mismatched channel messages keep resetting the 5-minute guard — the disconnected-GUI protection the PR is meant to add can be effectively neutralized. The new REST endpoint also always responds with delivered: true regardless of whether the input_id matched a live request, making it impossible for a GUI to know it posted too late.

approval.rs (timeout logic) and runtime_api.rs (delivery confirmation) need the most attention before this is merged.

Important Files Changed

Filename Overview
crates/tui/src/core/engine/approval.rs Adds 5-minute timeout to await_user_input via tokio::time::timeout, but the timeout is re-created inside the loop, so any mismatched decision resets the clock and the guard can be bypassed indefinitely.
crates/tui/src/runtime_api.rs Adds POST /v1/user-input/{thread_id}/{input_id} endpoint; always returns 200 OK even for stale/wrong input_ids, diverging from the approval flow which returns 404 for unknown IDs.
crates/tui/src/runtime_threads.rs Adds submit_user_input/cancel_user_input manager methods and SSE forwarding for EngineEvent::UserInputRequired; cancel_user_input is dead code with no exposed endpoint.

Sequence Diagram

sequenceDiagram
    participant Tool as Tool (request_user_input)
    participant Engine as Engine (approval.rs)
    participant RTM as RuntimeThreadManager
    participant SSE as SSE Stream
    participant GUI as External GUI Client
    participant API as Runtime API

    Tool->>Engine: await_user_input(tool_id, request)
    Engine->>SSE: "Event::UserInputRequired { id, request }"
    RTM->>SSE: emit "user_input.required" SSE event
    SSE->>GUI: "{ id, request: { questions, options } }"

    GUI->>API: "POST /v1/user-input/{thread_id}/{input_id}"
    API->>RTM: submit_user_input(thread_id, input_id, response)
    RTM->>Engine: engine.submit_user_input(input_id, response)
    Engine->>Engine: rx_user_input.recv() matches tool_id
    Engine-->>Tool: Ok(UserInputResponse)
    API-->>GUI: "{ ok: true, delivered: true }"

    Note over Engine: If GUI disconnects, timeout fires after 5 min
    Engine->>SSE: "Event::Status { timed out }"
    Engine-->>Tool: Err(ToolError: timed out)

Reviews (1): Last reviewed commit: "style: fix rustfmt formatting for user-i..."

Greptile also left 4 inline comments on this PR.

Add SSE event forwarding for UserInputRequired, REST endpoint for submitting user input responses, timeout protection for await_user_input, and fix interrupt_turn to clear active_turn immediately.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a 300-second timeout for user input requests and adds a new API endpoint for submitting user input. It also updates the runtime thread manager to handle user input events and modifies turn interruption logic. Feedback suggests improving error handling consistency by using standard error mapping for missing threads and removing redundant turn finalization code that could cause duplicate events and data loss.

Comment thread crates/tui/src/runtime_api.rs Outdated
Comment thread crates/tui/src/runtime_threads.rs Outdated
Comment thread crates/tui/src/runtime_threads.rs Outdated
Comment thread crates/tui/src/runtime_threads.rs Outdated
gaord added 4 commits May 25, 2026 21:44
The monitor_turn loop already handles full turn finalization when the
engine shuts down after cancellation, including saving turn status,
usage, error, emitting turn.completed, and clearing active_turn.

Having interrupt_turn also save turn status and emit turn.completed
causes duplicate SSE events and loses usage/error data that
monitor_turn would have captured from TurnComplete.

Keep only the active_turn cleanup so the 409 error is resolved while
monitor_turn remains the single source of truth for turn completion.

- Change 'not loaded' to 'not found' in submit_user_input and
  cancel_user_input so map_thread_err correctly maps to 404
- Use map_thread_err in submit_user_input API endpoint for
  consistent error response (404 for missing thread, 409 for
  conflict, etc.) instead of always returning 500

Clearing active_turn immediately breaks is_interrupt_requested detection
in monitor_turn, causing turn status to be Completed instead of Interrupted.

Let monitor_turn handle the cleanup after it detects the interrupt flag
and performs full finalization with correct status, usage, and error.
Contributor Author

gaord commented May 27, 2026

@Hmbown anything to do with this?

@gaord gaord force-pushed the fix/user-input-runtime-bridge branch from bceef27 to b656453 Compare May 27, 2026 07:20
Comment on lines +130 to 168
    result = tokio::time::timeout(USER_INPUT_TIMEOUT, self.rx_user_input.recv()) => {
        match result {
            Ok(Some(decision)) => {
                match decision {
                    UserInputDecision::Submitted { id, response } if id == tool_id => {
                        return Ok(response);
                    }
                    UserInputDecision::Cancelled { id } if id == tool_id => {
                        return Err(ToolError::execution_failed(
                            "User input cancelled".to_string(),
                        ));
                    }
                    _ => continue,
                }
            }
            Ok(None) => {
                return Err(ToolError::execution_failed(
                    "User input channel closed".to_string(),
                ));
            }
            Err(_) => {
                let _ = self
                    .tx_event
                    .send(Event::Status {
                        message: format!(
                            "User input timed out after {}s",
                            USER_INPUT_TIMEOUT.as_secs()
                        ),
                    })
                    .await;
                return Err(ToolError::execution_failed(
                    format!(
                        "User input timed out after {}s",
                        USER_INPUT_TIMEOUT.as_secs()
                    ),
                ));
            }
        }
    }

P1 Timeout resets on every mismatched decision

tokio::time::timeout(USER_INPUT_TIMEOUT, ...) is constructed fresh on each loop iteration. When a UserInputDecision arrives for a different tool_id (the _ => continue branch), the loop restarts and the 5-minute clock resets. This means the total wait is unbounded: any stream of unrelated decisions (e.g., a concurrent user-input tool call, or a stale channel message from a prior request) keeps deferring the timeout indefinitely, defeating the disconnected-GUI protection.

The fix is to start the sleep outside the loop so it counts down from a single point in time, then select on it as a separate arm.
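
A minimal sketch of the single-deadline idea, assuming nothing about the PR's actual types. To stay runnable without an async runtime it swaps the tokio channel for `std::sync::mpsc` and `recv_timeout`; `Decision`, `WaitError`, and `await_decision` are hypothetical names. In the async code the equivalent would be a `tokio::time::sleep` future created once before the loop and polled as its own `select!` arm.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Simplified stand-in for UserInputDecision (hypothetical shape).
#[derive(Debug)]
enum Decision {
    Submitted { id: String, response: String },
    Cancelled { id: String },
}

#[derive(Debug, PartialEq)]
enum WaitError {
    Cancelled,
    ChannelClosed,
    TimedOut,
}

/// Waits for a decision matching `tool_id`, bounded by one overall deadline.
fn await_decision(
    rx: &Receiver<Decision>,
    tool_id: &str,
    total: Duration,
) -> Result<String, WaitError> {
    // The deadline is fixed once, outside the loop.
    let deadline = Instant::now() + total;
    loop {
        // Each iteration only gets whatever time is left, so mismatched
        // messages consume the budget instead of resetting it.
        let remaining = deadline.saturating_duration_since(Instant::now());
        match rx.recv_timeout(remaining) {
            Ok(Decision::Submitted { id, response }) if id == tool_id => return Ok(response),
            Ok(Decision::Cancelled { id }) if id == tool_id => return Err(WaitError::Cancelled),
            Ok(_) => continue, // decision for a different request
            Err(RecvTimeoutError::Timeout) => return Err(WaitError::TimedOut),
            Err(RecvTimeoutError::Disconnected) => return Err(WaitError::ChannelClosed),
        }
    }
}
```

The design choice is to track an absolute deadline rather than a per-receive duration, which keeps the total wait bounded no matter how many unrelated decisions arrive.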

Comment on lines +1010 to +1036
async fn submit_user_input(
    State(state): State<RuntimeApiState>,
    Path((thread_id, input_id)): Path<(String, String)>,
    Json(req): Json<SubmitUserInputBody>,
) -> Result<Json<SubmitUserInputResponse>, ApiError> {
    use crate::tools::user_input::{UserInputAnswer, UserInputResponse};
    let answers: Vec<UserInputAnswer> = req
        .answers
        .into_iter()
        .map(|a| UserInputAnswer {
            id: a.id,
            label: a.label,
            value: a.value,
        })
        .collect();
    let response = UserInputResponse { answers };
    let delivered = state
        .runtime_threads
        .submit_user_input(&thread_id, &input_id, response)
        .await
        .map_err(map_thread_err)?;
    Ok(Json(SubmitUserInputResponse {
        ok: true,
        input_id,
        delivered,
    }))
}

P1 Stale or wrong input_id silently returns 200 OK

submit_user_input sends the decision onto the channel and always returns Ok(true). If a GUI POSTs with a wrong or already-expired input_id, the decision is silently discarded by the _ => continue branch in await_user_input — the caller gets {"ok":true,"delivered":true} even though no waiting tool received it.

This diverges from the approval flow, where deliver_external_approval checks whether a pending entry exists and returns false (surfaced as HTTP 404) when none is found. Without a similar pending-input registry, a GUI client has no way to detect that it POSTed too late or used the wrong ID, which can cause silent hangs on the GUI side.
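
One way a pending-input registry could look, sketched with the standard library only. This is an assumption about shape, not the project's code: `PendingInputs`, `register`, `deliver`, and `status_for` are hypothetical names, and the answer is reduced to a `String` for brevity.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Receiver, Sender};
use std::sync::Mutex;

/// Hypothetical registry of user-input requests that are still waiting.
#[derive(Default)]
struct PendingInputs {
    waiting: Mutex<HashMap<String, Sender<String>>>,
}

impl PendingInputs {
    /// Called when a tool starts waiting; returns the end it should receive on.
    fn register(&self, input_id: &str) -> Receiver<String> {
        let (tx, rx) = mpsc::channel();
        self.waiting.lock().unwrap().insert(input_id.to_string(), tx);
        rx
    }

    /// Returns true only if a live waiter for `input_id` received the answer.
    fn deliver(&self, input_id: &str, answer: String) -> bool {
        // Removing the entry makes a second POST with the same id fail.
        match self.waiting.lock().unwrap().remove(input_id) {
            Some(tx) => tx.send(answer).is_ok(),
            None => false,
        }
    }
}

/// Maps the delivery outcome to the HTTP status a handler could return.
fn status_for(delivered: bool) -> u16 {
    if delivered { 200 } else { 404 }
}
```

With this shape, `delivered` reflects whether a waiting request was found, so a stale or mistyped `input_id` can surface as 404 instead of a silent 200. A cancel path could reuse the same lookup by removing the entry and dropping the sender.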

Comment on lines +850 to +851
#[allow(dead_code)]
pub async fn cancel_user_input(&self, thread_id: &str, input_id: &str) -> Result<bool> {

P2 cancel_user_input is dead code with no REST endpoint, meaning GUIs have no way to signal that the user dismissed the input dialog. This leaves the engine waiting the full 5 minutes on a cancelled interaction. Consider either exposing a DELETE /v1/user-input/{thread_id}/{input_id} endpoint or removing the #[allow(dead_code)] suppression until the endpoint is wired up.

Suggested change
- #[allow(dead_code)]
- pub async fn cancel_user_input(&self, thread_id: &str, input_id: &str) -> Result<bool> {
+ pub async fn cancel_user_input(&self, thread_id: &str, input_id: &str) -> Result<bool> {

Comment on lines +1031 to +1035
Ok(Json(SubmitUserInputResponse {
ok: true,
input_id,
delivered,
}))

P2 The delivered field is structurally always true here — submit_user_input either returns Ok(true) or propagates an error (in which case the handler already returned early). This misleads API consumers who may interpret it as confirmation that the answer was actually consumed by a waiting tool.

Suggested change
- Ok(Json(SubmitUserInputResponse {
- ok: true,
- input_id,
- delivered,
- }))
+ Ok(Json(SubmitUserInputResponse {
+ ok: true,
+ input_id,
+ delivered: true,
+ }))

Owner

Hmbown commented May 27, 2026

Independent review (Devin):

Tested merge against current main (54151a4) — clean, no conflicts. #2256 (runtime_threads.rs touched) merges without conflict too; no structural risk there.

What ships: POST /v1/user-input/{thread_id}/{input_id} bridging external GUI clients to Engine::submit_user_input via RuntimeThreadManager. SSE carries the user_input.required event out; the new endpoint carries the response in. Auth inherits the existing require_runtime_token middleware — no new attack surface.

Issues to fix before merge:

  1. submit_user_input returns Ok(true) unconditionally — channel send succeeds even if input_id doesn't match any waiting tool call (engine silently drops mismatched IDs on _ => continue). delivered: true in the response would be misleading. Fix: have the engine return whether the ID was consumed, or document that delivered means channel delivery only.

  2. bail!("thread not found") uses substring matching: map_thread_err maps strings containing "not found" to 404, which works, but this is the only spot in the file relying on string heuristics instead of typed error variants. Aligning with the ThreadError pattern used elsewhere would be cleaner (Gemini flagged this too).

  3. cancel_user_input is #[allow(dead_code)] with no route — either wire DELETE /v1/user-input/{thread_id}/{input_id} (symmetric, needed for timeout UX) or drop it until needed.

  4. No test coverage for the new endpoint — the existing harness already tests approval and SSE flows; a minimal test driving submit_user_input through the HTTP layer would catch the delivered semantics issue above.
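
To illustrate item 2, here is a rough sketch of typed variants next to a substring heuristic. The variant names and both mapping functions are hypothetical; the real `ThreadError` and `map_thread_err` in the codebase may look quite different.

```rust
use std::fmt;

/// Hypothetical variants; the project's actual ThreadError may differ.
#[derive(Debug, PartialEq)]
enum ThreadError {
    NotFound(String),
    ActiveTurn(String),
    Internal(String),
}

impl fmt::Display for ThreadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ThreadError::NotFound(id) => write!(f, "thread {id} not found"),
            ThreadError::ActiveTurn(id) => write!(f, "thread {id} already has an active turn"),
            ThreadError::Internal(msg) => write!(f, "internal error: {msg}"),
        }
    }
}

impl std::error::Error for ThreadError {}

/// Typed mapping: the compiler forces every variant to pick a status.
fn status_code(err: &ThreadError) -> u16 {
    match err {
        ThreadError::NotFound(_) => 404,
        ThreadError::ActiveTurn(_) => 409,
        ThreadError::Internal(_) => 500,
    }
}

/// Heuristic mapping for contrast: the status depends on message wording.
fn status_code_from_message(message: &str) -> u16 {
    if message.contains("not found") {
        404
    } else if message.contains("active turn") {
        409
    } else {
        500
    }
}
```

The heuristic version silently falls through to 500 when a message is reworded (for example "not loaded" instead of "not found"), which is the kind of drift the typed mapping avoids.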

v0.8.48 (#2256) is a workspace consolidation (zero behavioral changes) and merges cleanly alongside this PR.
