fix(llm): spawn_asi_update embed() call has no timeout — can block router turn indefinitely #4566

@bug-ops

Description

`RouterProvider::spawn_asi_update` spawns a fire-and-forget Tokio task that calls `router.embed(&response)` without a timeout. If the embedding provider stalls (slow network, overloaded Ollama instance), the background task never resolves. Because the task is fire-and-forget it does not block the main turn, but it is never reclaimed, and under load such tasks accumulate and saturate the embedding provider.

Related: the same path in `embedding_guard.rs` (line 157) also spawns `(embed_fn)(&output).await` without a timeout.

Reproduction Steps

  1. Configure ASI (`with_asi`) with an embedding provider that introduces artificial delay > 30s
  2. Send repeated turns to the router
  3. Observe that embedding tasks accumulate indefinitely in the background

Expected Behavior

The spawned embed call should be bounded by the router's `embed_timeout_ms` (default 5000 ms). On timeout, the ASI window is simply not updated — which is the current behavior on embed error anyway.

Actual Behavior

`router.embed(&response).await` in the spawned task has no timeout. A stalled provider causes unbounded task accumulation.

Affected Code

  • `crates/zeph-llm/src/router/mod.rs` line 1452: `router.embed(&response).await`
  • `crates/zeph-mcp/src/embedding_guard.rs` line 157: `(embed_fn)(&output).await`

Fix

Wrap both embed calls with `tokio::time::timeout(Duration::from_millis(router.embed_timeout_ms), ...)` (or the guard's configured timeout). On `Elapsed`, log at `debug` and return early, matching the current embed error path.

Environment

  • Commit: HEAD main
  • Crates: zeph-llm, zeph-mcp

Metadata

Labels

  • P2: High value, medium complexity
  • bug: Something isn't working
  • llm: zeph-llm crate (Ollama, Claude)
