fix: detach spawned tsmd stdio so it can't hang an agent harness #17

Merged: tashian merged 1 commit into main from fix/daemon-spawn-detach-stdio on May 31, 2026

tashian (Owner) commented on May 31, 2026

The bug

The CLI auto-spawned tsmd and set cmd.Stderr = os.Stderr. Because tsmd is long-lived, it kept a copy of the caller's stderr fd open for its entire life. An agent harness such as Claude Code's Bash tool considers a command finished only when its stdout/stderr pipes reach EOF. Under such a harness that fd never closes, so the first tsm call that has to spawn the daemon hangs the harness for as long as the daemon runs, even though the tsm process itself has already exited.

In a normal terminal the same code is harmless (a tty isn't waiting for EOF), which is why it only bites under agents. This is a likely contributor to first-call credential-access friction for agents.
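
The mechanism can be reproduced outside tsm. The following is a minimal sketch, not code from this PR: `eofDelay` is a hypothetical helper that plays the role of the harness, a short-lived `sh` stands in for the CLI, and a backgrounded `sleep 1` stands in for the daemon. It assumes a Unix-like system with `sh` and `sleep` on the PATH.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"os/exec"
	"time"
)

// eofDelay (hypothetical helper) runs a shell that backgrounds a 1-second
// sleep and exits at once, then reports how long a reader of the shell's
// stderr pipe has to wait for EOF. With inherit=true the background process
// keeps the pipe's write end; with inherit=false its output goes to /dev/null.
func eofDelay(inherit bool) (time.Duration, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return 0, err
	}
	defer r.Close()

	script := "sleep 1 &"
	if !inherit {
		script = "sleep 1 >/dev/null 2>&1 &"
	}
	cmd := exec.Command("sh", "-c", script)
	cmd.Stderr = w // the pipe the "harness" reads from

	start := time.Now()
	if err := cmd.Run(); err != nil { // the shell itself returns immediately
		w.Close()
		return 0, err
	}
	w.Close() // the harness drops its own copy of the write end

	// EOF arrives only once every process holding the write end has exited.
	if _, err := io.Copy(io.Discard, r); err != nil {
		return 0, err
	}
	return time.Since(start), nil
}

func main() {
	for _, inherit := range []bool{true, false} {
		d, err := eofDelay(inherit)
		if err != nil {
			fmt.Fprintln(os.Stderr, "demo failed:", err)
			os.Exit(1)
		}
		fmt.Printf("background process inherits stderr=%v: EOF after %s\n",
			inherit, d.Round(10*time.Millisecond))
	}
}
```

With the inherited fd, the reader should see EOF only after the background process exits (about a second here); with the redirect it should see EOF almost immediately. A real daemon never exits, so the wait never ends.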

The fix (spawn() in internal/daemon/lifecycle.go)

  • Redirect the daemon's stdout+stderr to a log file (new paths.DaemonLog() → <dataDir>/tsmd.log), falling back to /dev/null. Never the caller's stdio.
  • Start it in its own session (Setsid) so signals to the CLI's process group don't take the daemon down, and call Process.Release() to fully disown it.
  • Drop the stdout socket-path readback. --socket is passed explicitly, so readiness comes from waitForSocket() polling the socket for liveness. (Bonus: removes a 10s-timeout failure path when the daemon is slow to print.)
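
The first two bullets can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual spawn() from internal/daemon/lifecycle.go: `spawnDetached` and `demoSpawn` are hypothetical names, the file mode is a guess, and `sh` stands in for tsmd. It relies only on standard-library calls (`exec.Cmd`, `syscall.SysProcAttr.Setsid`, `os.Process.Release`) and assumes a Unix-like system.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"syscall"
	"time"
)

// spawnDetached (hypothetical) starts bin in its own session with stdout and
// stderr pointed at logPath, or at /dev/null if the log cannot be opened.
func spawnDetached(logPath, bin string, args ...string) (int, error) {
	out, err := os.OpenFile(logPath, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o600)
	if err != nil {
		out, err = os.OpenFile(os.DevNull, os.O_WRONLY, 0)
		if err != nil {
			return 0, err
		}
	}
	defer out.Close() // the child keeps its own duplicate of the fd

	cmd := exec.Command(bin, args...)
	cmd.Stdout = out // never the caller's stdout
	cmd.Stderr = out // never the caller's stderr
	cmd.SysProcAttr = &syscall.SysProcAttr{Setsid: true}

	if err := cmd.Start(); err != nil {
		return 0, err
	}
	pid := cmd.Process.Pid // read before Release, which invalidates the handle
	if err := cmd.Process.Release(); err != nil {
		return pid, err
	}
	return pid, nil
}

// demoSpawn launches a stand-in daemon that writes a marker to stderr, then
// reports the log contents and whether the child leads its own process group.
func demoSpawn() (string, bool, error) {
	dir, err := os.MkdirTemp("", "spawn-sketch")
	if err != nil {
		return "", false, err
	}
	defer os.RemoveAll(dir)
	logPath := filepath.Join(dir, "daemon.log")

	pid, err := spawnDetached(logPath, "sh", "-c", "echo marker >&2; sleep 1")
	if err != nil {
		return "", false, err
	}
	pgid, err := syscall.Getpgid(pid)
	if err != nil {
		return "", false, err
	}
	for i := 0; i < 100; i++ { // poll briefly for the marker
		if data, _ := os.ReadFile(logPath); len(data) > 0 {
			return string(data), pgid == pid, nil
		}
		time.Sleep(20 * time.Millisecond)
	}
	return "", pgid == pid, fmt.Errorf("marker never reached %s", logPath)
}

func main() {
	logged, leader, err := demoSpawn()
	if err != nil {
		fmt.Fprintln(os.Stderr, "demo failed:", err)
		os.Exit(1)
	}
	fmt.Printf("log contents: %q, own process group: %v\n", logged, leader)
}
```

Because the child receives duplicates of the log file's descriptor rather than the caller's pipes, nothing the caller's parent is waiting on stays open after the CLI exits.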

Testing

  • New deterministic TestSpawn_RedirectsDaemonStdioToLogAndDetaches: a fake Python tsmd binds the socket and writes a marker to stderr; the test asserts the marker lands in the log file (proving stderr is redirected, not inherited) and that the daemon runs in its own session (getpgid(pid) == pid). Red before the fix (stderr leaked to the test's own stderr; 10s stdout-readback timeout), green after.
  • Added paths.DaemonLog() tests (default + XDG_DATA_HOME).
  • Full Go suite, gofmt, and go vet clean.
  • Live: with no daemon running, a cold tsm status spawned the daemon and returned in ~1s (previously hung), daemon detached and alive, ~/.local/share/tsm/tsmd.log created.
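
For reference, the two paths.DaemonLog() cases named above (default and XDG_DATA_HOME) could be resolved along these lines. This is a sketch only: `daemonLogPath` is a hypothetical pure function, and the `tsm` subdirectory under XDG_DATA_HOME is an assumption inferred from the default path `~/.local/share/tsm/tsmd.log` reported in the live test.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// daemonLogPath (hypothetical) returns the daemon log location for the given
// XDG_DATA_HOME value and home directory. An empty xdgDataHome selects the
// conventional default of <home>/.local/share.
func daemonLogPath(xdgDataHome, home string) string {
	base := xdgDataHome
	if base == "" {
		base = filepath.Join(home, ".local", "share")
	}
	// Assumption: the data directory is a "tsm" folder under the base.
	return filepath.Join(base, "tsm", "tsmd.log")
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine home directory:", err)
		os.Exit(1)
	}
	fmt.Println(daemonLogPath(os.Getenv("XDG_DATA_HOME"), home))
}
```

Taking the environment value and home directory as parameters keeps the function free of global state, so both cases can be tested without touching the process environment.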

Notes

The CLI auto-spawned tsmd and set cmd.Stderr = os.Stderr. Because tsmd is
long-lived, it kept a copy of the caller's stderr fd open for its entire life.
Under an agent harness like Claude Code's Bash tool, which considers a command
finished only when its stdout/stderr pipes reach EOF, that fd never closes, so
the first tsm call that has to spawn the daemon hangs the harness forever, even
though the tsm process itself exited. In a normal terminal the same code is
harmless (a tty isn't waiting for EOF), which is why it only bites under agents.

Fix spawn():
- Redirect the daemon's stdout+stderr to a log file (new paths.DaemonLog(),
  <dataDir>/tsmd.log), falling back to /dev/null; never the caller's stdio.
- Start it in its own session (Setsid) so signals to the CLI's process group
  don't take the daemon down, and Release() to fully disown it.
- Drop the stdout socket-path readback; --socket is passed explicitly, so
  readiness comes from waitForSocket() polling the socket for liveness.

Adds a deterministic spawn test using a fake python tsmd that asserts the
daemon's stderr lands in the log (not inherited) and that it runs in its own
session.
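
The readiness check that replaces the stdout readback, polling the socket for liveness, might look roughly like the sketch below. It is not the project's waitForSocket(): `waitForSocketSketch` and `demoReady` are hypothetical names, and the timeout and interval values are illustrative. The demo starts a listener after a short delay to stand in for a daemon that is slow to come up.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
	"time"
)

// waitForSocketSketch (hypothetical) dials a Unix socket repeatedly until a
// connection succeeds or the timeout elapses.
func waitForSocketSketch(path string, timeout, interval time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		conn, err := net.DialTimeout("unix", path, interval)
		if err == nil {
			conn.Close() // a successful connect is the liveness signal
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("socket %s not ready after %s: %w", path, timeout, err)
		}
		time.Sleep(interval)
	}
}

// demoReady binds a socket 100ms after polling starts and returns the result
// of the poll.
func demoReady() error {
	dir, err := os.MkdirTemp("", "sock-sketch")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	sock := filepath.Join(dir, "d.sock")

	done := make(chan struct{})
	defer close(done) // lets the listener goroutine shut down
	go func() {
		time.Sleep(100 * time.Millisecond)
		ln, err := net.Listen("unix", sock)
		if err != nil {
			return
		}
		<-done
		ln.Close()
	}()
	return waitForSocketSketch(sock, 2*time.Second, 20*time.Millisecond)
}

func main() {
	if err := demoReady(); err != nil {
		fmt.Fprintln(os.Stderr, "not ready:", err)
		os.Exit(1)
	}
	fmt.Println("socket became ready")
}
```

Polling the socket ties readiness to the thing the CLI actually needs, a daemon that accepts connections, rather than to anything the daemon prints.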
tashian merged commit 64ab184 into main on May 31, 2026
1 check passed