fix: reduce container timeout to 30 min and fix orphaned containers on timeout #86

Merged
eshulman2 merged 1 commit into main from fix/container-timeout on Jun 18, 2026

@eshulman2

Summary

  • Root cause: Settings.container_timeout defaulted to 7200 s (2 hours). The SandboxConfig class, which declared 1800 s, was dead code: it was never imported or used.
  • Secondary bug: when asyncio.wait_for raised a TimeoutError, the handler called process.kill(), which only killed the podman run client process and left the container itself running as an orphan.
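
As an illustration of the first point, here is a minimal sketch of a 30-minute default that can still be overridden through the FORGE_SANDBOX_TIMEOUT environment variable mentioned under Changes. It uses only the standard library; the function name `resolve_container_timeout` and the constant are hypothetical and are not taken from this repository, whose Settings class may load the value differently.

```python
import os

# Assumed constant for this sketch: 1800 s (30 minutes), the new default in this PR.
DEFAULT_CONTAINER_TIMEOUT_S = 1800


def resolve_container_timeout(env=None):
    """Hypothetical helper: return the container timeout in seconds.

    Reads FORGE_SANDBOX_TIMEOUT from the given mapping (os.environ by default)
    and falls back to the 30-minute default when it is unset or empty.
    """
    env = os.environ if env is None else env
    raw = env.get("FORGE_SANDBOX_TIMEOUT")
    if raw is None or raw.strip() == "":
        return DEFAULT_CONTAINER_TIMEOUT_S
    value = int(raw)  # raises ValueError on non-numeric input
    if value <= 0:
        raise ValueError("FORGE_SANDBOX_TIMEOUT must be a positive number of seconds")
    return value
```

Keeping the default in one place and resolving it through a single function is one way to avoid the situation described above, where a second class carried a different value that nothing read.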

Changes

  • Set the container_timeout default to 1800 s (30 minutes) in Settings and ContainerConfig; it is still overridable via FORGE_SANDBOX_TIMEOUT
  • Extract a _stop_timed_out_container() helper: podman stop → escalate to podman kill if the stop exits non-zero or its wait times out → wait for the podman run process to exit, with a force-kill fallback
  • Use the helper for both TimeoutError and CancelledError paths (previously CancelledError had the correct logic inline; now both paths share it)
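
The escalation sequence described in the list above can be sketched roughly as follows. This is not the code from the PR: the function names, the `container_name` parameter, the injectable `run_podman` callable, and the `grace` durations are all assumptions made for illustration. Only `podman stop`, `podman kill`, and the standard asyncio subprocess API are relied on.

```python
import asyncio


async def _run_podman(*args, timeout):
    """Run `podman <args>`; return its exit code, or None if it did not finish in time."""
    proc = await asyncio.create_subprocess_exec(
        "podman", *args,
        stdout=asyncio.subprocess.DEVNULL,
        stderr=asyncio.subprocess.DEVNULL,
    )
    try:
        return await asyncio.wait_for(proc.wait(), timeout=timeout)
    except asyncio.TimeoutError:
        # The podman CLI call itself hung; reap it and report "no result".
        proc.kill()
        await proc.wait()
        return None


async def stop_timed_out_container_sketch(process, container_name,
                                          run_podman=_run_podman, grace=10.0):
    """Hypothetical sketch of the stop -> kill -> wait sequence.

    `process` is the asyncio subprocess running `podman run`;
    `container_name` is assumed to be the name or ID passed to it.
    """
    # 1. Ask the container to stop gracefully.
    rc = await run_podman("stop", container_name, timeout=grace + 5)

    # 2. Escalate if the stop failed (non-zero) or timed out (None).
    if rc != 0:
        await run_podman("kill", container_name, timeout=grace)

    # 3. Wait for the `podman run` client to exit; force-kill it as a fallback.
    try:
        await asyncio.wait_for(process.wait(), timeout=grace)
    except asyncio.TimeoutError:
        process.kill()
        await process.wait()
```

Calling the same coroutine from both the TimeoutError and CancelledError handlers is what keeps the two paths consistent. Passing `run_podman` as a parameter is only a convenience of this sketch so that the sequence can be exercised without a Podman installation.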

Test plan

  • Unit tests pass (uv run pytest tests/unit/ -q)
  • Verify a long-running container is killed after ~30 minutes
  • Verify no orphaned containers remain after a timeout

🤖 Generated with Claude Code

…aned containers on timeout

The effective container timeout was 7200 s (2 hours) despite SandboxConfig
advertising 1800 s — that class was dead code. Containers that hit the
Python-level asyncio.wait_for TimeoutError also leaked: process.kill()
only terminated the podman run client, leaving the container running in
the Podman daemon.

- Set Settings.container_timeout and ContainerConfig.timeout_seconds to
  1800 s (30 minutes) — overridable via FORGE_SANDBOX_TIMEOUT env var
- Extract _stop_timed_out_container() and use it for both TimeoutError
  and CancelledError paths: podman stop → podman kill on non-zero exit
  or wait timeout → process.wait() with force-kill fallback

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
eshulman2 force-pushed the fix/container-timeout branch from 37c515a to e139e7d on June 18, 2026 09:40
eshulman2 merged commit 1bc8d69 into main on Jun 18, 2026
2 checks passed
eshulman2 deleted the fix/container-timeout branch on June 18, 2026 09:45