Isolated AI agent sandboxes using Firecracker microVMs. Each agent gets a full Ubuntu XFCE desktop with Chromium, scoped network access, and its own tool suite — all inside a hardware-isolated VM.
AI agents execute arbitrary code. Containers share the host kernel. This project uses Firecracker (KVM) to give each agent its own kernel, its own filesystem, and network access restricted to only the domains it needs. If an agent gets prompt-injected or installs a compromised package, it can't phone home, can't reach other agents, and can't touch the host.
Host
├── Firecracker (one VM per agent, KVM isolation)
├── nftables + Squid (per-VM domain filtering via TLS SNI)
├── OTel collector (trace every LLM call)
├── noVNC (watch agent desktops in your browser)
│
├── VM: debugger 10.0.0.2 → sentry.io, github.com
├── VM: feature-dev 10.0.1.2 → github.com, npmjs.org
├── VM: devops 10.0.2.2 → github.com, terraform, k8s
├── VM: researcher 10.0.3.2 → hn, reddit, arxiv
└── VM: security 10.0.4.2 → nvd.nist.gov, github.com
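The diagram suggests a simple numbering scheme: VM number N sits at 10.0.N.2. A minimal sketch of that pattern follows. It assumes the pattern holds for every VM and that all guests fall inside the 10.0.0.0/16 range this README uses elsewhere. `vm_address` and the agent ordering are illustrative names, not taken from the project's scripts.

```python
import ipaddress

# Assumption: every guest lives inside this range and VM number N gets 10.0.N.2.
SANDBOX_NET = ipaddress.ip_network("10.0.0.0/16")

def vm_address(index: int) -> ipaddress.IPv4Address:
    """Return the guest address for the VM with the given index (0-255)."""
    if not 0 <= index <= 255:
        raise ValueError(f"VM index out of range: {index}")
    addr = ipaddress.ip_address(f"10.0.{index}.2")
    if addr not in SANDBOX_NET:
        raise ValueError(f"{addr} is outside {SANDBOX_NET}")
    return addr

# Ordering mirrors the diagram; it is not read from any config file.
AGENTS = ["debugger", "feature-dev", "devops", "researcher", "security"]
for i, name in enumerate(AGENTS):
    print(f"{name:12} {vm_address(i)}")
```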
Each agent is defined in a single YAML file with composable presets:
# config/agents/debugger.yaml
agent:
  type: debugger
  name: "Sentry Bug Investigator"
egress:
  presets: [github, google, stackoverflow]
  domains: [.sentry.io]
capabilities:
  presets: [debugging, python-dev]
prompt:
  role: |
    You are a senior debugging specialist...
  presets: [explore-tools, debugging-workflow, git-workflow, code-execution,
            browser-instructions, report-output]

# 1. Configure
cp config/sandbox.yaml.example config/sandbox.yaml
vim config/sandbox.yaml # set llm.api_base, llm.api_key, network.host_iface
# 2. Setup (Firecracker, kernel, Squid, nftables, OTel, host hardening)
sudo bin/sandbox-ctl setup
# 3. Build
sudo bin/sandbox-ctl build-base # Ubuntu + XFCE + Chrome + Python 3.12
sudo bin/sandbox-ctl build-all # per-agent tool customization
# 4. Launch
sudo bin/sandbox-ctl launch debugger
# 5. Observe
bin/sandbox-ctl vnc debugger # open desktop in browser
bin/sandbox-ctl status # list all VMs
bin/sandbox-ctl ssh debugger # SSH in (password: agent)

Five built-in agents. Create your own by adding a YAML file; see Creating Agents.
| Agent | Role | Allowed Domains |
|---|---|---|
| debugger | Sentry traces → root cause analysis | sentry.io, github, stackoverflow |
| feature-dev | GitHub issues → pull requests | github, npm, pypi |
| devops | Deployments, feature flags, rollbacks | github, terraform, k8s, cloud |
| researcher | HN, Reddit, arxiv trend monitoring | news sites, arxiv, reddit |
| security | CVE scanning, dependency auditing | nvd.nist.gov, github, cve.org |
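The Allowed Domains column is the result of combining an agent's egress presets with its own extra domains. A rough sketch of how that composition might work is shown here. The preset contents and the `resolve_egress` helper are hypothetical; the real lists live in config/presets/egress/*.yaml and the real expansion is done by `sandbox-ctl config compile`, which may behave differently.

```python
# Hypothetical preset contents, for illustration only.
EGRESS_PRESETS = {
    "github": [".github.com", ".githubusercontent.com"],
    "stackoverflow": [".stackoverflow.com"],
}

def resolve_egress(agent_egress: dict, presets: dict) -> list:
    """Expand preset names, then append the agent's own domains, keeping order."""
    resolved = []
    for name in agent_egress.get("presets", []):
        if name not in presets:
            raise KeyError(f"unknown egress preset: {name}")
        resolved.extend(presets[name])
    resolved.extend(agent_egress.get("domains", []))
    # dict.fromkeys removes duplicates while preserving first-seen order.
    return list(dict.fromkeys(resolved))

debugger = {"presets": ["github", "stackoverflow"], "domains": [".sentry.io"]}
print(resolve_egress(debugger, EGRESS_PRESETS))
```

Failing on an unknown preset name, rather than skipping it, keeps a typo from silently shrinking or widening an allowlist.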
All config is YAML with composable presets:
config/
sandbox.yaml # LLM endpoint, network, VM defaults
agents/*.yaml # one per agent type
presets/
egress/*.yaml # domain groups (github, npm, pypi, ...)
capabilities/*.yaml # tool groups (python-dev, debugging, ...)
prompts/*.yaml # rulebook presets (git-workflow, ...)
install-scripts/*.sh # complex tool installers
secrets/github-tokens/ # fine-grained PATs (gitignored)
bin/sandbox-ctl config list-presets # browse available presets
bin/sandbox-ctl config validate X # check an agent YAML
bin/sandbox-ctl config compile # YAML → flat build files

Tested against real supply chain attacks (litellm .pth harvester, axios npm RAT) and 21 escape techniques across 7 categories. All exfiltration attempts were blocked.
sudo bin/security-test.sh # 34 tests, 7 attack categories
sudo bin/supply-chain-test.sh # litellm + axios attack emulation
sudo bin/advanced-escape-test.sh # domain fronting, DNS tunneling, ICMP, ...
sudo bin/novel-escape-test.sh # IPv6 bypass, LLM exfil, GitHub C2
sudo bin/harden-host.sh audit # Firecracker production compliance

See Security for the full threat model and docs/operations.md for troubleshooting.
Agents use fine-grained personal access tokens scoped to specific repos and permissions. Classic tokens and SSH keys are rejected.
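One cheap way to tell the two token types apart is the prefix GitHub puts on them: fine-grained tokens start with `github_pat_`, classic ones with `ghp_`. The sketch below shows such a pre-check. It is an assumption about how a rejection could be implemented, not a description of what setup-github-tokens.sh does, and a prefix check says nothing about which repos or permissions a token carries.

```python
def token_kind(token: str) -> str:
    """Classify a GitHub token by its documented prefix."""
    token = token.strip()
    if token.startswith("github_pat_"):
        return "fine-grained"
    if token.startswith("ghp_"):
        return "classic"
    return "unknown"

def accept_token(token: str) -> bool:
    """Accept only fine-grained tokens; classic and unrecognized values are refused."""
    return token_kind(token) == "fine-grained"

# Placeholder values, not real credentials.
for candidate in ["github_pat_EXAMPLE", "ghp_EXAMPLE", "ssh-ed25519 AAAA..."]:
    print(candidate[:14], "->", "accepted" if accept_token(candidate) else "rejected")
```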
bin/setup-github-tokens.sh show # see requirements per agent
bin/setup-github-tokens.sh # interactive setup
bin/setup-github-tokens.sh validate # check all tokens

Setup: setup, build-base, build-agent, build-all
VMs: launch, stop, stop-all, status, cleanup
Access: vnc, logs, ssh
Config: config compile, config validate, config list-presets, config docs
Info: list-agents, network-status, help
Testing: integration-test.sh, security-test.sh, supply-chain-test.sh,
advanced-escape-test.sh, novel-escape-test.sh, harden-host.sh
| Doc | Contents |
|---|---|
| Creating Agents | How to define custom agents with YAML + presets |
| Architecture | System design, config pipeline, network model |
| Operations | Running, monitoring, troubleshooting, base tools |
| Security | Threat model, defense layers, accepted risks |
| Presets Reference | All egress, capability, and prompt presets |
- Linux x86_64 with KVM (/dev/kvm)
- ~60GB RAM for 5 VMs (configurable per-agent)
- Python 3 + PyYAML on host
bin/sandbox-ctl setup installs most dependencies, but the base-image build and
host networking need these present first:
| Tool | Debian/Ubuntu | Fedora/RHEL | Arch |
|---|---|---|---|
| debootstrap (build rootfs) | debootstrap | debootstrap | debootstrap + ubuntu-keyring |
| squid / dnsmasq / nftables | squid dnsmasq nftables | same | squid dnsmasq nftables |
| websockify (noVNC) | python3-websockify | python3-websockify | AUR, or a venv: python -m venv /opt/novnc-venv && /opt/novnc-venv/bin/pip install websockify |
| ssh client w/ password (testing) | sshpass | sshpass | sshpass |
Also required on the host: rsync, mkfs.ext4 (e2fsprogs), curl, openssl, jq.
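Before running setup, it can help to confirm these tools are actually on PATH. The following is a small sketch of such a pre-flight check. The binary names are assumptions about what each package installs (for example `nft` for nftables), and `missing_tools` is an illustrative helper, not part of sandbox-ctl.

```python
import shutil

# Assumed binary names for the packages listed above.
REQUIRED_TOOLS = [
    "debootstrap", "squid", "dnsmasq", "nft", "websockify", "sshpass",
    "rsync", "mkfs.ext4", "curl", "openssl", "jq",
]

def missing_tools(tools):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

missing = missing_tools(REQUIRED_TOOLS)
if missing:
    print("missing:", ", ".join(missing))
else:
    print("all prerequisite tools found")
```

Some of these binaries live in /usr/sbin, which may not be on an unprivileged user's PATH, so a "missing" result is worth re-checking as root.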
If the host runs firewalld (default on Fedora/RHEL, common on Arch), its
input chain rejects VM→gateway traffic (DNS, Squid, OTel) before vm_filter's
allow rules are evaluated. nftables evaluates every table hooked on input, so
an accept in vm_filter cannot override a reject elsewhere. Stop firewalld from
rejecting the VM subnet so the sandbox's own vm_filter table is the authority:
sudo firewall-cmd --permanent --zone=trusted --add-source=10.0.0.0/16
sudo firewall-cmd --reload

This does not expose the host: vm_filter's input chain (set up by
setup-host-network.sh) is the real VM→host control — it permits a VM to reach
only Squid (3128/3129), DNS (53) and OTel (4317/4318) on the gateway and
drops everything else, so an agent inside a VM cannot reach host SSH or any
other local service. To verify, run this from inside a VM:
echo > /dev/tcp/<gateway>/22 (should hang/fail) vs …/3129 (should connect).
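The same check can be scripted for every port named above. This is a minimal sketch meant to run inside a VM. The expectation table is an assumption built from the ports listed in this section (port 53 is left out because DNS is normally UDP), and `tcp_reachable` and `check_gateway` are illustrative helpers.

```python
import socket

def tcp_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False

# port -> should it be reachable from a VM? (assumed from the text above)
EXPECTED = {22: False, 3128: True, 3129: True, 4317: True, 4318: True}

def check_gateway(gateway: str, probe=tcp_reachable) -> dict:
    """Map each port to True when observed reachability matches expectation."""
    return {port: probe(gateway, port) == want for port, want in EXPECTED.items()}
```

Calling `check_gateway("<gateway>")` with the VM's gateway address returns a per-port pass/fail map. A failure on an allowed port can also mean the service is not listening, so it does not by itself prove a firewall problem.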
Launches use the Firecracker jailer by default (chroot + dropped privileges + cgroup v2).
launch.sh stages the kernel, rootfs, and config into the per-VM chroot (/srv/jailer/firecracker/<id>/root/) with chroot-relative paths, so this works out of the box. For a quick dev launch without the jailer:
sudo env NO_JAILER=1 bin/sandbox-ctl launch <agent> --no-agent

bin/sandbox-ctl setup applies host networking at runtime; it does not
survive a reboot on its own. Enable the bundled service so nftables/Squid/dnsmasq
(and any running VMs) are restored on boot:
sudo cp bin/agent-sandbox.service /etc/systemd/system/
sudo systemctl daemon-reload && sudo systemctl enable agent-sandbox.service

harden-host.sh also flags SMT/hyperthreading as a side-channel risk for
multi-tenant isolation. Disabling it persistently is a kernel-cmdline change
(reboot required). With systemd-boot + kernel-install (BLS entries), add the
options to /etc/kernel/cmdline so they survive kernel updates, then to the
active boot entry. Tip: leave the -fallback entry unmodified so it remains
a clean recovery path (SMT on, verbose):
# persist for future kernel-install regens
# (edit in place: reading and truncating the same file in one pipeline is racy)
sudo sed -i '1 s/$/ nosmt quiet loglevel=1/' /etc/kernel/cmdline
# apply to the current main entry (not the fallback)
sudo sed -i '/^options/ s/$/ nosmt quiet loglevel=1/' \
/efi/loader/entries/<machine-id>-<version>.conf
nosmt halves available vCPUs. It is defense-in-depth, not required for the sandbox or Docker to function.
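After the reboot, it is worth confirming that the option actually reached the running kernel. Here is a small sketch that parses a kernel command line for a flag, accepting both the bare form and the `option=value` form. `has_kernel_option` is an illustrative helper, not something harden-host.sh provides.

```python
def has_kernel_option(cmdline: str, option: str) -> bool:
    """True if option appears as a bare flag or as option=value."""
    return any(
        token == option or token.startswith(option + "=")
        for token in cmdline.split()
    )

def running_kernel_has(option: str, path: str = "/proc/cmdline") -> bool:
    """Check the command line of the currently booted kernel."""
    with open(path, encoding="ascii", errors="replace") as handle:
        return has_kernel_option(handle.read(), option)

print(has_kernel_option("root=/dev/vda1 nosmt quiet loglevel=1", "nosmt"))
```

On the host, `running_kernel_has("nosmt")` reads /proc/cmdline. On kernels that expose it, /sys/devices/system/cpu/smt/control reports the resulting SMT state.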
The docker capability installs Docker Engine + Compose v2. Docker + Compose run
inside the VM (overlay2, cgroup v2). Networking depends on the guest kernel:
- Use kernel/build-kernel.sh for full Docker networking. Docker ≥28's default bridge driver needs the iptables raw table (CONFIG_IP_NF_RAW), which the stock CI kernel (fetch-kernel.sh) omits. build-kernel.sh rebuilds the Firecracker guest kernel with IP_NF_RAW + NF_TABLES (+ the iptables-nft NFT_COMPAT) on top of the CI config, giving working docker0 bridge, port publishing, and container egress NAT (verified end-to-end). It builds with clang/LLVM (Arch's bleeding-edge gcc miscompiles 6.1) and bakes acpi=off + VIRTIO_MMIO_CMDLINE_DEVICES into the kernel: a from-source vanilla kernel can't parse Firecracker's ACPI tables (the stock CI kernel carries FC patches), so it discovers devices from the virtio_mmio.device= boot args instead. This is transparent to config-template.json, which stays compatible with both kernels.
- On the stock CI kernel, the bridge fails ("can't initialize iptables table raw"); set "iptables": false in /etc/docker/daemon.json to run containers without bridge NAT/port-publishing (use --network host/none).
- Per-VM HTTPS filtering is enforced in ssl_bump, not http_access. Squid peeks the ClientHello at step1; the splice/terminate decision happens at step2 where the SNI is reliably available. Putting the ssl::server_name allow rule in http_access (as earlier versions did) is racy: http_access runs before the peek completes on some connections, matches the destination IP, denies, and client-first bumps the connection, so allowlisted HTTPS hosts fail with TLS unknown CA. gen-acl.sh therefore emits ssl_bump splice vmN_src vmN_domains rules (included between peek step1 and terminate all).
- Allowlists are de-duplicated. Squid rejects overlapping ssl::server_name entries (both .docker.com and production.cloudflare.docker.com): 6.x fails fatally, 7.x mis-matches the parent. gen-acl.sh drops covered entries.
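The last two points can be sketched together: drop allowlist entries already covered by a leading-dot parent, then emit the per-VM splice rule. This is an illustration under assumptions, not gen-acl.sh's actual logic or output. The helper names and the exact ACL line layout are hypothetical; only the vmN_src / vmN_domains names and the ssl_bump splice form come from the text above.

```python
def covers(wildcard: str, domain: str) -> bool:
    """True if a leading-dot entry such as .docker.com matches domain."""
    base = wildcard.lstrip(".")
    bare = domain.lstrip(".")
    return bare == base or bare.endswith("." + base)

def drop_covered(domains):
    """Remove exact duplicates and entries covered by another leading-dot entry."""
    unique = list(dict.fromkeys(d.strip().lower() for d in domains if d.strip()))
    wildcards = [d for d in unique if d.startswith(".")]
    return [
        d for d in unique
        if not any(w != d and covers(w, d) for w in wildcards)
    ]

def render_vm_rules(vm_index: int, src_ip: str, domains) -> list:
    """Build Squid lines for one VM; they belong between the peek and terminate rules."""
    names = " ".join(drop_covered(domains))
    return [
        f"acl vm{vm_index}_src src {src_ip}",
        f"acl vm{vm_index}_domains ssl::server_name {names}",
        f"ssl_bump splice vm{vm_index}_src vm{vm_index}_domains",
    ]

allowlist = [".docker.com", "production.cloudflare.docker.com", "github.com"]
print("\n".join(render_vm_rules(0, "10.0.0.2", allowlist)))
```

Only leading-dot entries are treated as parents here, so two unrelated exact hostnames are never merged.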