fix: detach spawned tsmd stdio so it can't hang an agent harness #17
Merged
The CLI auto-spawns tsmd and used to set cmd.Stderr = os.Stderr. Because tsmd is long-lived, it kept a copy of the caller's stderr fd open for its entire life. Under an agent harness like Claude Code's Bash tool, which considers a command finished only when its stdout/stderr pipes reach EOF, that fd never closes, so the first tsm call that has to spawn the daemon hangs the harness forever, even though the tsm process itself exited. In a normal terminal the same code is harmless (a tty isn't waiting for EOF), which is why it only bites under agents.

Fix spawn():
- Redirect the daemon's stdout+stderr to a log file (new paths.DaemonLog(), <dataDir>/tsmd.log), falling back to /dev/null; never the caller's stdio.
- Start it in its own session (Setsid) so signals to the CLI's process group don't take the daemon down, and Release() to fully disown it.
- Drop the stdout socket-path readback; --socket is passed explicitly, so readiness comes from waitForSocket() polling the socket for liveness.

Adds a deterministic spawn test using a fake python tsmd that asserts the daemon's stderr lands in the log (not inherited) and that it runs in its own session.
The bug
The CLI auto-spawns `tsmd` and set `cmd.Stderr = os.Stderr`. Because `tsmd` is long-lived, it kept a copy of the caller's stderr fd open for its entire life. Under an agent harness like Claude Code's Bash tool, which considers a command finished only when its stdout/stderr pipes reach EOF, that fd never closes, so the first `tsm` call that has to spawn the daemon hangs the harness forever, even though the `tsm` process itself already exited.

In a normal terminal the same code is harmless (a tty isn't waiting for EOF), which is why it only bites under agents. This is a likely contributor to first-call credential-access friction for agents.
The fix (`spawn()` in `internal/daemon/lifecycle.go`)
- Redirect the daemon's stdout and stderr to a log file, `paths.DaemonLog()` (`<dataDir>/tsmd.log`), falling back to `/dev/null`. Never the caller's stdio.
- Start the daemon in its own session (`Setsid`) so signals to the CLI's process group don't take the daemon down, and call `Process.Release()` to fully disown it.
- Drop the stdout socket-path readback: `--socket` is passed explicitly, so readiness comes from `waitForSocket()` polling the socket for liveness. (Bonus: removes a 10s-timeout failure path when the daemon is slow to print.)

Testing
- `TestSpawn_RedirectsDaemonStdioToLogAndDetaches`: a fake Python `tsmd` binds the socket and writes a marker to stderr; the test asserts that the marker lands in the log file (proving stderr is redirected, not inherited) and that the daemon runs in its own session (`getpgid(pid) == pid`). Red before the fix (stderr leaked to the test's own stderr; 10s stdout-readback timeout), green after.
- `paths.DaemonLog()` tests (default + `XDG_DATA_HOME`).
- `gofmt` and `go vet` clean.
- `tsm status` spawned the daemon and returned in ~1s (previously hung), daemon detached and alive, `~/.local/share/tsm/tsmd.log` created.

Notes
`waitForSocket` timing out, with the reason captured in `tsmd.log`.