#250: barrier background-worker teardown against ENOTEMPTY flake #255
…250) claude-companion smoke intermittently failed with ENOTEMPTY during temp cleanup: a detached --background worker was still writing into dataDir/state/.../jobs while the test's finally rmSync'd dataDir. Tests awaited only the job record (waitForJobRecord), never the worker process exit. Apply the deterministic waitForProcessExit barrier (mirroring #234) to every background-worker test that rmSyncs dataDir: capture launched.pid, then await waitForProcessExit(launchedPid) as the first finally statement, before cleanup. Cancellation tests use the tolerant .catch(() => {}). 9 tests covered; the waitForProcessExit helper and the one already-barriered test are unchanged. Validated: lint clean, baseline 138/138, 16x sequential stress 0 ENOTEMPTY (the flake was ~50%/run before the fix). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Code Review
This pull request updates several smoke tests in claude-companion.smoke.test.mjs to track the process ID (launchedPid) of spawned background processes and await their exit (waitForProcessExit) in the finally block before performing cleanup. This prevents race conditions and resource leaks during test execution. There are no review comments, and I have no additional feedback to provide.
…nd to gemini (#250) The first cut of #250 used a finally-first barrier with .catch() on the cancel tests -- a variant that diverged from the proven #234/#242 pattern and swallowed the 5s fail-loud timeout. Rework to the exact #242 form: end-of-try placement, plain await (fail-loud), a 30s *_SMOKE_POLL_TIMEOUT_MS budget on the cancel/status-poll tests, and the redundant setTimeout(250) cushions replaced rather than duplicated. Extend the same barrier to the gemini companion smoke suite, which shares the identical detached-worker teardown race. The resulting barriers are byte-identical to the companion-smoke barriers currently carried by PR #242 (verified: git diff feat/234 shows zero changed waitForProcessExit lines), so #242 dedups them automatically on its next main merge. This makes #250 the single source for the companion-smoke barrier class and removes the duplicative barrier work from the concurrent-relays PR. Class coverage: claude (9 tests) and gemini (helper + 8 tests) fixed here; kimi already barriered on main; agy is foreground-only (rejects --background, no detached worker, no race); identity-resume has no background launches. Test-only. Validated: lint (incl. sync checks), claude 138/138, gemini 100/100, sequential stress 10x claude + 5x gemini = 0 ENOTEMPTY (flake reproduced ~50%/run unfixed). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Reworked at
Class coverage (MECE): claude + gemini fixed here; kimi already barriered on main.
Validation: lint (incl. all sync checks), claude 138/138, gemini 100/100, sequential stress 10× claude + 5× gemini = 0 ENOTEMPTY (flake reproduced ~50%/run unfixed).



What
Fixes the intermittent ENOTEMPTY teardown flake in the claude and gemini companion smoke suites (#250) by applying the proven waitForProcessExit worker-exit barrier to every background-worker test that recursively removes its data dir.

Root cause
Background-worker tests launch a detached --background worker, then await only the job record (waitForJobRecord / terminal-meta poll), never the worker process exit. The finally then recursively removes dataDir while the worker is still doing its post-terminal tail writes (lease release, sidecar removal, lifecycle-jsonl flush), so cleanup races a live writer and flakes with ENOTEMPTY on rmdir of state/.../jobs (sometimes the top-level data dir).

This is the same class #234 fixed for the concurrent-relay harnesses with a deterministic waitForProcessExit barrier; the companion smoke suites lacked the equivalent.

Fix
Apply the in-file waitForProcessExit(pid[, timeoutMs]) barrier as the last statement of each affected test's try, before cleanup, using the exact proven pattern from PR #242 / the #234 lineage (plain await, fail-loud; a 30s *_SMOKE_POLL_TIMEOUT_MS budget on the cancel / status-poll tests; the redundant setTimeout(250) cushions replaced, not duplicated):
claude: 9 background-worker tests barriered.
gemini: waitForProcessExit helper added, plus 8 background-worker tests barriered (4 cancel/status-poll tests use the 30s budget; the approval test waits on launchEvent.pid).

Test-only; no product/source code touched.
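For readers without the diff open, the barrier shape described above can be sketched roughly as follows. This is an illustration only, not the helper from the smoke files: the body of waitForProcessExit, its 5s default and poll interval, and the launchDemoWorker and runBarrieredDemo names are all assumptions made for this example. Only the call shape waitForProcessExit(pid[, timeoutMs]), the end-of-try placement, the plain await, and the 30s budget come from the description.

```javascript
import { spawn } from "node:child_process";
import { existsSync, mkdtempSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { setTimeout as sleep } from "node:timers/promises";

// Hypothetical stand-in for the in-file helper (the real body is not shown in this PR text).
// Polls with signal 0 until the pid no longer exists, and throws on timeout (fail-loud).
async function waitForProcessExit(pid, timeoutMs = 5_000, pollMs = 25) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    try {
      process.kill(pid, 0); // signal 0 delivers nothing; it only checks that the pid exists
    } catch (err) {
      if (err.code === "ESRCH") return; // no such process: the worker is gone
      if (err.code !== "EPERM") throw err; // EPERM: process exists but is not ours; keep polling
    }
    if (Date.now() >= deadline) {
      throw new Error(`process ${pid} still alive after ${timeoutMs}ms`);
    }
    await sleep(pollMs);
  }
}

// Demo-only worker (assumption): a detached child that keeps writing into dataDir
// for a short while, imitating the post-terminal tail writes described above.
function launchDemoWorker(dataDir) {
  const script = `
    const fs = require("node:fs");
    const path = require("node:path");
    const dir = ${JSON.stringify(dataDir)};
    let n = 0;
    const timer = setInterval(() => {
      fs.writeFileSync(path.join(dir, "tail-" + n + ".txt"), "x");
      if (++n >= 5) clearInterval(timer);
    }, 20);
  `;
  const child = spawn(process.execPath, ["-e", script], { detached: true, stdio: "ignore" });
  child.unref();
  return child;
}

async function runBarrieredDemo() {
  const dataDir = mkdtempSync(join(tmpdir(), "barrier-demo-"));
  const launchedPid = launchDemoWorker(dataDir).pid;
  try {
    // ...the test's own assertions would go here...
    await waitForProcessExit(launchedPid, 30_000); // last statement of the try, plain await
  } finally {
    // On the success path the worker has already exited, so no live writer races this removal.
    rmSync(dataDir, { recursive: true, force: true });
  }
  return { launchedPid, dirRemoved: !existsSync(dataDir) };
}

const result = await runBarrieredDemo();
console.log(result);
```

Because the await is not wrapped in a .catch(), a worker that outlives the budget fails the test with a timeout error instead of being silently tolerated, which is the fail-loud property the rework restores.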
Deterministic by construction: cleanup now provably runs after the worker has exited (process.kill(pid,0) → ESRCH), so an ENOTEMPTY from a still-writing worker is structurally impossible.

Single-source / #242 reconciliation
These barriers are byte-identical to the companion-smoke barriers currently carried by the open PR #242 (concurrent relays). Verified: git diff feat/234 -- <both smoke files> shows zero changed waitForProcessExit lines. #242 carried the barrier work bundled with its feature changes; this PR extracts it into the focused flake-fix where it belongs. Once this merges, #242's next git merge main collapses the identical barrier lines automatically, leaving #242 with only its concurrency-feature changes. No duplicative barrier work survives in either PR.

Verification (local)
npm run lint clean (incl. all sync checks).
node --test of both suites: claude + gemini green.

Closes #250
🤖 Generated with Claude Code