Why
Today Tilth's only worker containment is cwd=workspace for the bash tool (tilth/tools/bash.py) plus a narrow denylist in pre_tool (force-push, sudo, curl|sh). _resolve keeps the file tools well-behaved, but bash runs with the full credentials of the user running uv run tilth — cat /etc/passwd, cat ../../.env, cat ../../sessions/<id>/events.jsonl all succeed. See discussion in #10.
This issue is the research artifact for an opt-in isolated mode: a secondary mode that wraps the harness in real process isolation. Default mode stays best-effort. No code changes proposed here.
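The gap described above can be reproduced in a few lines. This is a minimal sketch, not Tilth's actual bash tool: the function name, the secret.env file, and the directory layout are hypothetical stand-ins. It only shows that setting cwd on a subprocess does not stop the command from reading files above that directory.

```python
import subprocess
import tempfile
from pathlib import Path


def read_outside_workspace(root: Path) -> str:
    """Run a shell command with cwd=workspace that reads a file above it."""
    workspace = root / "workspace"
    workspace.mkdir()
    # Hypothetical stand-in for a sensitive file outside the workspace.
    (root / "secret.env").write_text("API_KEY=not-a-real-key\n")

    # cwd only sets the starting directory; relative paths can still climb out.
    result = subprocess.run(
        "cat ../secret.env",
        shell=True,
        cwd=workspace,
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(read_outside_workspace(Path(tmp)), end="")
```

The command runs with the same user credentials as the parent process, so any path that user can read is reachable.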
Design preference (settled)
When isolated mode is requested for a session, pick the most-isolated option that's actually available on the host, in this order:
1. Apple Containerization (container CLI): if the host is Apple Silicon on macOS 26+, container is installed, and container system start has been run. Apache-2.0, sub-second microVM, native fit. Preferred when available.
2. Docker Sandboxes (sbx): if sbx is installed and authenticated. Cross-platform fallback covering Intel Macs, Linux (KVM required), Windows.
3. Default best-effort: what Tilth does today. Always available.
Default mode stays best-effort; isolated mode is opt-in (default behaviour unchanged for users who don't ask for it).
CLI shape deferred to a future implementation issue: probably --sandbox with auto-detect-and-announce (verbose log line saying which backend got picked), with explicit --sandbox=apple|sbx available as an override.
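The preference order could be sketched roughly as below. Everything here is an assumption for illustration: pick_backend, the probe functions, and the "auto" value are hypothetical names, and the CLI shape is explicitly deferred. The probes only check platform facts and whether a binary is on PATH; they do not verify that the container service has been started or that sbx is authenticated, which a real implementation would need.

```python
import platform
import shutil
from typing import Callable, Optional


def default_apple_probe() -> bool:
    """Rough host check for Apple container; does not verify the service is started."""
    release = platform.mac_ver()[0]  # empty string on non-macOS hosts
    major = int(release.split(".")[0]) if release else 0
    return (
        platform.system() == "Darwin"
        and platform.machine() == "arm64"
        and major >= 26
        and shutil.which("container") is not None
    )


def default_sbx_probe() -> bool:
    """Only checks that an sbx binary is on PATH; authentication is not verified."""
    return shutil.which("sbx") is not None


def pick_backend(
    requested: Optional[str],
    apple_ready: Callable[[], bool] = default_apple_probe,
    sbx_ready: Callable[[], bool] = default_sbx_probe,
) -> str:
    """Map a hypothetical --sandbox value to a backend name.

    None means the flag was not passed, so default mode is unchanged.
    "auto" walks the preference order; "apple" or "sbx" is an explicit override.
    """
    if requested is None:
        return "default"
    probes = {"apple": apple_ready, "sbx": sbx_ready}
    if requested in probes:
        if not probes[requested]():
            raise RuntimeError(f"requested backend {requested!r} is not available")
        return requested
    if requested != "auto":
        raise ValueError(f"unknown sandbox backend: {requested!r}")
    for name in ("apple", "sbx"):  # most isolated, most preferred first
        if probes[name]():
            return name
    return "default"  # best-effort mode is always available


if __name__ == "__main__":
    print(f"sandbox backend: {pick_backend('auto')}")
```

Passing the probes as parameters keeps the selection logic testable on any host, and the final print stands in for the announce log line.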
Integration shape
Two integration levels, with strong preference for the first regardless of backend:
1. Wrap-the-harness (recommended). The whole uv run tilth ~/projects/foo process runs inside the sandbox. Workspace + sessions/ are bind/passthrough mounted so --resume and --visualize from the host still see the same session dir. Likely zero-code, possibly one thin CLI affordance per backend.
2. Wrap-the-tool-calls (not recommended). Harness on host, each bash tool call routed through sandbox exec. Invasive: every tool needs sandbox awareness, file tools too. Loses the simplicity of "the worker is one process in a known cwd."
(1) is the natural fit given Tilth's process model.
Backend: Apple Containerization (container)
- License: Apache-2.0. Source at github.com/apple/container.
- Platform: Apple Silicon only. Requires macOS 26 (Tahoe) for full functionality; less capable on macOS 15.
- Tech: one lightweight Linux microVM per container (not a shared-kernel Docker model). Swift-based vminitd per VM. Sub-second cold start.
- Image format: fully OCI-compatible. Any standard image from a standard registry works. Run a tilth-sandbox image (preinstalled uv + Python) the same way you'd run any OCI image.
- Maturity: pre-1.0, active development, breaking changes possible between minor versions.
- Tilth fit: when available, this is the right choice — open source, native, fast, no Docker Inc. dependency.
Backend: Docker Sandboxes (sbx)
What it is
- microVM-per-sandbox (not a container). Linux uses KVM (sudo usermod -aG kvm $USER); macOS uses the platform hypervisor (Apple Virtualization Framework, not explicit in docs); Windows uses Hyper-V.
- Released GA, v0.30.0 (May 2026) — sub-1.0, expect churn.
- Proprietary (Docker Inc.). github.com/docker/sbx-releases ships binaries only. Free for core functionality; team admin controls (network restrictions, FS policies) require sales engagement.
- Docker Desktop not required.
- Pitched specifically at coding-agent workloads. Recognised agents: claude, codex, copilot, cursor, docker-agent, droid, gemini, kiro, opencode, shell.
Sources: product page, docs, usage, get started, architecture.
How isolation works
- Filesystem: workspace mounted via passthrough; absolute paths preserved between host and sandbox. Multiple workspaces via sbx run/create AGENT PATH [PATH...], with :ro suffix for read-only.
- Network: all egress routed through an HTTP/HTTPS proxy on the host. Default-deny with three policy presets at first sbx login; docs recommend Balanced. Allowlist additions via sbx policy allow network -g <host:port>. sbx policy also has named profiles for team governance.
- Secrets: sbx secret set -g <service> stores in the OS keychain (service names like github, anthropic, openai); the host proxy injects into outbound API requests so the secret never appears inside the sandbox. Alternative: sbx exec -e KEY=VAL passes env vars directly, identical to docker exec.
- Lifecycle: sbx create → sbx run (attach) → sbx stop (env persists) → sbx rm (destroy). sbx exec -it <name> bash to drop into an existing sandbox.
- Resource limits: --cpus, -m/--memory (defaults to 50% host, max 32 GiB).
- Custom base image: -t/--template <oci image>. The image is agent-specific by default, but a custom OCI image is supported. We could ship tilth-sandbox preinstalled with uv + Python.
- Other utilities: sbx cp (file copy host↔sandbox), sbx ports (publish sandbox ports), sbx diagnose (debug install).
- Surprise overlap: sbx run --branch creates a git worktree as part of the sandbox. Overlaps directly with Tilth's worktree machinery; if Tilth runs inside sbx, we'd ignore the flag and let Tilth manage its own worktrees as today.
Automation-friendly invocation
For a --sandbox mode driven from a script:

    sbx create shell . --name tilth-session-<id> [--memory 8g --cpus 4]
    sbx exec -e TILTH_API_KEY="$TILTH_API_KEY" \
        -e TILTH_MODEL="$TILTH_MODEL" \
        -w /workspace tilth-session-<id> \
        uv run tilth ~/projects/tilth-demo
    sbx rm tilth-session-<id>  # or keep for resume
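If the harness itself ends up driving this lifecycle, the three commands could be assembled as argument lists along the lines below. This is a sketch under stated assumptions: sbx_commands and its parameters are hypothetical names, and only the subcommands and flags shown in the snippet above are used. Whether they behave as expected is one of the things the probes still need to confirm.

```python
from __future__ import annotations

from typing import Mapping, Optional, Sequence


def sbx_commands(
    session_id: str,
    workspace: str,
    env: Mapping[str, str],
    harness_cmd: Sequence[str],
    workdir: str = "/workspace",  # value taken from the shell example; unverified
    memory: Optional[str] = None,
    cpus: Optional[int] = None,
    keep: bool = False,
) -> list[list[str]]:
    """Build create / exec / rm argument lists for one Tilth session (hypothetical helper)."""
    name = f"tilth-session-{session_id}"

    create = ["sbx", "create", "shell", workspace, "--name", name]
    if memory is not None:
        create += ["--memory", memory]
    if cpus is not None:
        create += ["--cpus", str(cpus)]

    run = ["sbx", "exec"]
    for key, value in env.items():
        run += ["-e", f"{key}={value}"]  # env vars are visible inside the sandbox
    run += ["-w", workdir, name, *harness_cmd]

    commands = [create, run]
    if not keep:  # keep=True leaves the sandbox in place for a later resume
        commands.append(["sbx", "rm", name])
    return commands
```

Because these are argument lists rather than a shell string, a path such as ~/projects/tilth-demo would not be tilde-expanded; the caller would need to pass an already-expanded path. Each list can be handed to subprocess.run as-is.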
Frictions specific to sbx
- Proprietary — strategic dependency on a closed Docker Inc. product. Bounded since this is an opt-in second backend.
- Linux requires KVM — rules out most cloud VMs without nested virt and most CI runners. Dev-machine use on macOS / Windows / Linux desktop is the realistic target.
- Sub-1.0 maturity — CLI surface may churn.
Other options considered
- sandbox-exec: built into macOS (no install), kernel-level sandbox profiles (Scheme/LISP syntax), zero VM overhead. Apple has technically deprecated it for app distribution but it remains the lowest-friction macOS option for sandboxing arbitrary CLI tools. There's an open issue against apple/container asking Apple to clarify the deprecation timeline. Profile authoring is painful. Could become a fourth-tier fallback for Intel Macs if we want to avoid the sbx dependency there; not pursuing in the first cut.
- Plain Docker / rootless podman: weaker isolation (shared kernel), but ubiquitous on Linux. Could be a Linux-only fallback ahead of sbx in the preference order if we want to keep all backends open source. Not pursuing in the first cut.
Comparison matrix
| Backend | OS support | License | Maturity | Isolation | Overhead |
| --- | --- | --- | --- | --- | --- |
| Apple container | Apple Silicon + macOS 26+ only | Apache-2.0 | pre-1.0 | microVM per container | sub-second |
| Docker sbx | Mac / Win / Linux | Proprietary | v0.30 | microVM | VM boot |
| sandbox-exec | macOS only | Built-in | Stable but discouraged | Kernel profile | negligible |
| Plain Docker / rootless podman | All / Linux | OSS | Stable | Shared kernel / namespaces | Low |
Open questions (verify by running)
- Default network policy on Balanced: does it include OpenRouter (openrouter.ai)? OpenAI, Anthropic, Google almost certainly yes.
- Does localhost:11434 from inside a sbx microVM reach the host's Ollama, or does it need a bridge (host.docker.internal-style)?
- Same network + filesystem questions for Apple container, plus its default network policy.
- Cold microVM boot time on Mac (sbx, Apple container) and Linux (sbx + KVM).
- License terms for sbx — anything restricting commercial use of the runtime itself.
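Two of these questions (reachability of a host service and cold boot time) lend themselves to small scripted probes run from inside each sandbox. The helpers below are a hypothetical sketch using only the Python standard library: can_connect and time_command are illustrative names, and a successful TCP connect only shows the port is reachable, not that a proxy policy would allow real API traffic.

```python
import socket
import subprocess
import sys
import time
from typing import Sequence


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, or name resolution failed
        return False


def time_command(argv: Sequence[str]) -> float:
    """Return wall-clock seconds taken by a command; raises if it exits non-zero."""
    start = time.perf_counter()
    subprocess.run(list(argv), check=True, capture_output=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Usage: python probe.py [host] [port]; defaults match the Ollama question above.
    host = sys.argv[1] if len(sys.argv) > 1 else "localhost"
    port = int(sys.argv[2]) if len(sys.argv) > 2 else 11434
    print(f"{host}:{port} reachable: {can_connect(host, port)}")
```

Wrapping a backend's own create-and-run command in time_command from the host gives a rough cold-start number to compare across sbx and Apple container.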
Proposed next step
Install both sbx (already done) and Apple container on a dev machine, run the demo inside each (sbx run shell → uv run tilth ~/projects/tilth-demo, equivalent for container), document what breaks. Cheap probes that answer most of the open questions and confirm whether wrap-the-harness is genuinely zero-code per backend or wants a thin --sandbox affordance.