Agent Sandbox

Isolated AI agent sandboxes using Firecracker microVMs. Each agent gets a full Ubuntu XFCE desktop with Chromium, scoped network access, and its own tool suite — all inside a hardware-isolated VM.

Why

AI agents execute arbitrary code. Containers share the host kernel. This project uses Firecracker (KVM) to give each agent its own kernel, its own filesystem, and network access restricted to only the domains it needs. If an agent gets prompt-injected or installs a compromised package, it can't phone home, can't reach other agents, and can't touch the host.

How It Works

Host
├── Firecracker (one VM per agent, KVM isolation)
├── nftables + Squid (per-VM domain filtering via TLS SNI)
├── OTel collector (trace every LLM call)
├── noVNC (watch agent desktops in your browser)
│
├── VM: debugger       10.0.0.2  → sentry.io, github.com
├── VM: feature-dev    10.0.1.2  → github.com, npmjs.org
├── VM: devops         10.0.2.2  → github.com, terraform, k8s
├── VM: researcher     10.0.3.2  → hn, reddit, arxiv
└── VM: security       10.0.4.2  → nvd.nist.gov, github.com
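The addresses in the diagram follow a 10.0.<n>.2 pattern, with the third octet stepping once per VM. The following is a minimal Python sketch of that pattern, assuming the mapping is exactly what the diagram shows. The helper name `vm_address` is hypothetical and not part of the project; the real allocation logic may differ.

```python
import ipaddress

# Subnet that the "firewalld coexistence" section adds as a trusted source.
VM_SUPERNET = ipaddress.ip_network("10.0.0.0/16")


def vm_address(index: int) -> ipaddress.IPv4Address:
    """Return the guest address for the VM at `index` (0 = debugger in the diagram).

    Assumption: the third octet is the VM index and the guest is always .2,
    as drawn in the diagram.
    """
    if not 0 <= index <= 255:
        raise ValueError(f"VM index out of range: {index}")
    return ipaddress.IPv4Address(f"10.0.{index}.2")


if __name__ == "__main__":
    agents = ["debugger", "feature-dev", "devops", "researcher", "security"]
    for i, name in enumerate(agents):
        addr = vm_address(i)
        # Every guest address should fall inside the /16 used for firewalld.
        print(f"{name:12} {addr}  inside {VM_SUPERNET}: {addr in VM_SUPERNET}")
```

Under this assumption, all five built-in agents sit inside 10.0.0.0/16, which is why a single source range is enough in the firewalld rule.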

Each agent is defined in a single YAML file with composable presets:

# config/agents/debugger.yaml
agent:
  type: debugger
  name: "Sentry Bug Investigator"

egress:
  presets: [github, google, stackoverflow]
  domains: [.sentry.io]

capabilities:
  presets: [debugging, python-dev]

prompt:
  role: |
    You are a senior debugging specialist...
  presets: [explore-tools, debugging-workflow, git-workflow, code-execution,
            browser-instructions, report-output]
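To illustrate what "composable presets" could mean for the egress section, here is a rough Python sketch that flattens preset names plus explicit domains into one allowlist. The preset contents and the function name `expand_egress` are assumptions made for illustration; the real presets live in config/presets/egress/*.yaml and the real compiler is `sandbox-ctl config compile`.

```python
# Hypothetical preset contents, NOT copied from the project's preset files.
EGRESS_PRESETS = {
    "github": [".github.com", ".githubusercontent.com"],
    "google": [".google.com"],
    "stackoverflow": [".stackoverflow.com"],
}


def expand_egress(egress: dict, presets: dict) -> list:
    """Flatten `presets` and `domains` into one ordered, duplicate-free list."""
    domains = []
    for name in egress.get("presets", []):
        if name not in presets:
            raise KeyError(f"unknown egress preset: {name}")
        domains.extend(presets[name])
    domains.extend(egress.get("domains", []))
    # dict.fromkeys keeps first-seen order while dropping exact duplicates.
    return list(dict.fromkeys(domains))


if __name__ == "__main__":
    # Mirrors the egress section of config/agents/debugger.yaml shown above.
    debugger = {
        "presets": ["github", "google", "stackoverflow"],
        "domains": [".sentry.io"],
    }
    for domain in expand_egress(debugger, EGRESS_PRESETS):
        print(domain)
```

Failing loudly on an unknown preset name is a design choice of this sketch: a typo in an allowlist is safer as a build error than as a silently missing domain.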

Quick Start

# 1. Configure
cp config/sandbox.yaml.example config/sandbox.yaml
vim config/sandbox.yaml   # set llm.api_base, llm.api_key, network.host_iface

# 2. Setup (Firecracker, kernel, Squid, nftables, OTel, host hardening)
sudo bin/sandbox-ctl setup

# 3. Build
sudo bin/sandbox-ctl build-base    # Ubuntu + XFCE + Chrome + Python 3.12
sudo bin/sandbox-ctl build-all     # per-agent tool customization

# 4. Launch
sudo bin/sandbox-ctl launch debugger

# 5. Observe
bin/sandbox-ctl vnc debugger       # open desktop in browser
bin/sandbox-ctl status             # list all VMs
bin/sandbox-ctl ssh debugger       # SSH in (password: agent)

Agents

Five built-in agents. Create your own by adding a YAML file — see Creating Agents.

Agent	Role	Allowed Domains
debugger	Sentry traces → root cause analysis	sentry.io, github, stackoverflow
feature-dev	GitHub issues → pull requests	github, npm, pypi
devops	Deployments, feature flags, rollbacks	github, terraform, k8s, cloud
researcher	HN, Reddit, arxiv trend monitoring	news sites, arxiv, reddit
security	CVE scanning, dependency auditing	nvd.nist.gov, github, cve.org

Configuration

All config is YAML with composable presets:

config/
  sandbox.yaml                 # LLM endpoint, network, VM defaults
  agents/*.yaml                # one per agent type
  presets/
    egress/*.yaml              # domain groups (github, npm, pypi, ...)
    capabilities/*.yaml        # tool groups (python-dev, debugging, ...)
    prompts/*.yaml             # rulebook presets (git-workflow, ...)
  install-scripts/*.sh         # complex tool installers
  secrets/github-tokens/       # fine-grained PATs (gitignored)

bin/sandbox-ctl config list-presets   # browse available presets
bin/sandbox-ctl config validate X    # check an agent YAML
bin/sandbox-ctl config compile       # YAML → flat build files

Security

Tested against real supply chain attacks (litellm .pth harvester, axios npm RAT) and 21 escape techniques across 7 categories. All exfiltration attempts blocked.

sudo bin/security-test.sh            # 34 tests, 7 attack categories
sudo bin/supply-chain-test.sh        # litellm + axios attack emulation
sudo bin/advanced-escape-test.sh     # domain fronting, DNS tunneling, ICMP, ...
sudo bin/novel-escape-test.sh        # IPv6 bypass, LLM exfil, GitHub C2
sudo bin/harden-host.sh audit        # Firecracker production compliance

See Security for the full threat model and docs/operations.md for troubleshooting.

GitHub Token Security

Agents use fine-grained personal access tokens scoped to specific repos and permissions. Classic tokens and SSH keys are rejected.

bin/setup-github-tokens.sh show       # see requirements per agent
bin/setup-github-tokens.sh            # interactive setup
bin/setup-github-tokens.sh validate   # check all tokens
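One cheap first check for the "fine-grained only" rule is the token prefix: GitHub issues fine-grained personal access tokens with a `github_pat_` prefix and classic ones with `ghp_`. The Python sketch below shows only that prefix check. The function name `token_kind` is hypothetical, and setup-github-tokens.sh may validate tokens differently (for example by querying the GitHub API for scopes).

```python
def token_kind(token: str) -> str:
    """Classify a GitHub token by its documented prefix.

    Returns "fine-grained", "classic", or "unknown". A prefix match says
    nothing about which repos or permissions the token actually has.
    """
    token = token.strip()
    if token.startswith("github_pat_"):
        return "fine-grained"
    if token.startswith("ghp_"):
        return "classic"
    return "unknown"


def is_acceptable(token: str) -> bool:
    """Accept only fine-grained tokens, matching the policy described above."""
    return token_kind(token) == "fine-grained"


if __name__ == "__main__":
    # Placeholder strings, not real tokens.
    for sample in ["github_pat_EXAMPLE", "ghp_EXAMPLE", "ssh-ed25519 AAAA"]:
        print(f"{sample!r}: {token_kind(sample)} accepted={is_acceptable(sample)}")
```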

CLI Reference

Setup:      setup, build-base, build-agent, build-all
VMs:        launch, stop, stop-all, status, cleanup
Access:     vnc, logs, ssh
Config:     config compile, config validate, config list-presets, config docs
Info:       list-agents, network-status, help
Testing:    integration-test.sh, security-test.sh, supply-chain-test.sh,
            advanced-escape-test.sh, novel-escape-test.sh, harden-host.sh

Documentation

Doc	Contents
Creating Agents	How to define custom agents with YAML + presets
Architecture	System design, config pipeline, network model
Operations	Running, monitoring, troubleshooting, base tools
Security	Threat model, defense layers, accepted risks
Presets Reference	All egress, capability, and prompt presets

Requirements

Linux x86_64 with KVM (/dev/kvm)
~60GB RAM for 5 VMs (configurable per-agent)
Python 3 + PyYAML on host

Host packages

bin/sandbox-ctl setup installs most dependencies, but the base-image build and host networking need these present first:

Tool	Debian/Ubuntu	Fedora/RHEL	Arch
debootstrap (build rootfs)	`debootstrap`	`debootstrap`	`debootstrap` + `ubuntu-keyring`
squid / dnsmasq / nftables	`squid dnsmasq nftables`	same	`squid dnsmasq nftables`
websockify (noVNC)	`python3-websockify`	`python3-websockify`	AUR, or a venv: `python -m venv /opt/novnc-venv && /opt/novnc-venv/bin/pip install websockify`
ssh client w/ password (testing)	`sshpass`	`sshpass`	`sshpass`

Also required on the host: rsync, mkfs.ext4 (e2fsprogs), curl, openssl, jq.

firewalld coexistence

If the host runs firewalld (default on Fedora/RHEL, common on Arch), its input chain rejects VM→gateway traffic (DNS, Squid, OTel) before vm_filter's allow rules are evaluated. nftables evaluates every table hooked at the same point, and a reject in any one of them is final, so an accept in vm_filter cannot override firewalld's verdict. Stop firewalld from rejecting the VM subnet so that the sandbox's own vm_filter table is the authority:

sudo firewall-cmd --permanent --zone=trusted --add-source=10.0.0.0/16
sudo firewall-cmd --reload

This does not expose the host. vm_filter's input chain (set up by setup-host-network.sh) is the real VM→host control: it permits a VM to reach only Squid (3128/3129), DNS (53) and OTel (4317/4318) on the gateway and drops everything else, so an agent inside a VM cannot reach host SSH or any other local service. To verify, run echo > /dev/tcp/<gateway>/22 from inside a VM (should hang or fail) and compare with …/3129 (should connect).
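The same check can be scripted. The sketch below is a small Python equivalent of the /dev/tcp probe, intended to be run from inside a VM. The function name `can_connect` is hypothetical and the gateway address is passed in by the caller, since it is not stated here.

```python
import socket
import sys


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts (dropped packets) and
        # unreachable hosts; all of them mean "not reachable" here.
        return False


if __name__ == "__main__" and len(sys.argv) > 1:
    gateway = sys.argv[1]  # gateway address as seen from the VM
    # Expected results per the paragraph above: SSH blocked, Squid reachable.
    for port, expected in [(22, False), (3129, True)]:
        actual = can_connect(gateway, port)
        verdict = "ok" if actual == expected else "UNEXPECTED"
        print(f"port {port}: reachable={actual} expected={expected} [{verdict}]")
```

A dropped packet shows up as a timeout rather than an immediate refusal, so the probe for port 22 may take the full timeout before reporting.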

jailer vs. raw firecracker

Launches use the Firecracker jailer by default (chroot + dropped privileges + cgroup v2). launch.sh stages the kernel, rootfs, and config into the per-VM chroot (/srv/jailer/firecracker/<id>/root/) with chroot-relative paths, so this works out of the box. For a quick dev launch without the jailer:

sudo env NO_JAILER=1 bin/sandbox-ctl launch <agent> --no-agent

Reboot persistence & optional hardening

bin/sandbox-ctl setup applies host networking at runtime; it does not survive a reboot on its own. Enable the bundled service so nftables/Squid/dnsmasq (and any running VMs) are restored on boot:

sudo cp bin/agent-sandbox.service /etc/systemd/system/
sudo systemctl daemon-reload && sudo systemctl enable agent-sandbox.service

harden-host.sh also flags SMT/hyperthreading as a side-channel risk for multi-tenant isolation. Disabling it persistently is a kernel-cmdline change (reboot required). With systemd-boot + kernel-install (BLS entries), add the options to /etc/kernel/cmdline so they survive kernel updates, then to the active boot entry. Tip: leave the -fallback entry unmodified so it remains a clean recovery path (SMT on, verbose):

# persist for future kernel-install regens
# (read the file first: reading and truncating it in one pipeline is racy)
cmdline="$(cat /etc/kernel/cmdline) nosmt quiet loglevel=1"
echo "$cmdline" | sudo tee /etc/kernel/cmdline
# apply to the current main entry (not the fallback)
sudo sed -i '/^options/ s/$/ nosmt quiet loglevel=1/' \
  /efi/loader/entries/<machine-id>-<version>.conf

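nosmt disables sibling hardware threads, which halves the logical CPUs available for vCPUs on a typical two-threads-per-core host. It is defense-in-depth and is not required for the sandbox or Docker to function.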
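After rebooting, the SMT state can be confirmed from sysfs: Linux exposes it in /sys/devices/system/cpu/smt/control. The Python sketch below reads that file. The function names are hypothetical, and this is not necessarily how harden-host.sh performs its own check.

```python
from pathlib import Path

# Kernel-provided SMT control file; typical values include "on", "off",
# "forceoff" and "notsupported".
SMT_CONTROL = Path("/sys/devices/system/cpu/smt/control")


def smt_state(control_file: Path = SMT_CONTROL) -> str:
    """Return the kernel's SMT control state, or "unknown" if unreadable."""
    try:
        return control_file.read_text().strip()
    except OSError:
        return "unknown"


def smt_enabled(control_file: Path = SMT_CONTROL) -> bool:
    """True only when the kernel reports SMT as switched on."""
    return smt_state(control_file) == "on"


if __name__ == "__main__":
    state = smt_state()
    print(f"SMT control state: {state}")
    if smt_enabled():
        print("SMT is on; sibling threads may be shared between VMs.")
```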

Running Docker inside a sandbox

The docker capability installs Docker Engine + Compose v2. Docker + Compose run inside the VM (overlay2, cgroup v2). Networking depends on the guest kernel:

Use kernel/build-kernel.sh for full Docker networking. Docker ≥28's default bridge driver needs the iptables raw table (CONFIG_IP_NF_RAW), which the stock CI kernel (fetch-kernel.sh) omits. build-kernel.sh rebuilds the Firecracker guest kernel with IP_NF_RAW + NF_TABLES (plus NFT_COMPAT for iptables-nft) on top of the CI config, giving a working docker0 bridge, port publishing, and container egress NAT (verified end-to-end).
The build uses clang/LLVM (Arch's bleeding-edge gcc miscompiles the 6.1 kernel) and bakes acpi=off + VIRTIO_MMIO_CMDLINE_DEVICES into the kernel. A from-source vanilla kernel can't parse Firecracker's ACPI tables (the stock CI kernel carries FC patches), so it discovers devices from the virtio_mmio.device= boot args instead. This is transparent to config-template.json, which stays compatible with both kernels.
On the stock CI kernel, the bridge fails ("can't initialize iptables table raw"); set "iptables": false in /etc/docker/daemon.json to run containers without bridge NAT/port-publishing (use --network host/none).
Per-VM HTTPS filtering is enforced in ssl_bump, not http_access. Squid peeks at the ClientHello at step1, and the splice/terminate decision happens at step2, where the SNI is reliably available. Putting the ssl::server_name allow rule in http_access (as earlier versions did) is racy: on some connections http_access runs before the peek completes, evaluates the rule against the destination IP, and denies. Squid then bumps the connection client-first, so allowlisted HTTPS hosts fail with TLS unknown CA. gen-acl.sh therefore emits ssl_bump splice vmN_src vmN_domains rules, included between peek step1 and terminate all.
Allowlists are de-duplicated. Squid rejects overlapping ssl::server_name entries (both .docker.com and production.cloudflare.docker.com): 6.x fails fatally, 7.x mis-matches the parent. gen-acl.sh drops covered entries.
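The two Squid points above (dropping covered entries, then emitting splice rules between the peek and the terminate) can be sketched in Python as follows. This is an illustration only, not a port of gen-acl.sh: the function names, the exact acl definitions, and the input shape are assumptions, while the vmN_src / vmN_domains names and the peek, splice, terminate order come from the text.

```python
def covers(parent: str, child: str) -> bool:
    """True if `parent` is a leading-dot entry that already matches `child`."""
    if not parent.startswith(".") or parent == child:
        return False
    child_bare = child.lstrip(".")
    # ".docker.com" covers "docker.com" itself and any name ending in ".docker.com".
    return child_bare == parent[1:] or child_bare.endswith(parent)


def dedupe(domains: list) -> list:
    """Drop exact duplicates and entries covered by a broader leading-dot entry."""
    unique = list(dict.fromkeys(domains))
    return [d for d in unique if not any(covers(p, d) for p in unique)]


def ssl_bump_rules(vms: list) -> list:
    """Build Squid lines for a list of (vm_index, source_ip, domains) tuples."""
    lines = ["acl step1 at_step SslBump1"]
    for index, src, domains in vms:
        lines.append(f"acl vm{index}_src src {src}")
        lines.append(f"acl vm{index}_domains ssl::server_name {' '.join(dedupe(domains))}")
    # Order matters: peek first, per-VM splices next, terminate everything else.
    lines.append("ssl_bump peek step1")
    for index, _src, _domains in vms:
        lines.append(f"ssl_bump splice vm{index}_src vm{index}_domains")
    lines.append("ssl_bump terminate all")
    return lines


if __name__ == "__main__":
    example = [(0, "10.0.0.2", [".docker.com", "production.cloudflare.docker.com", ".sentry.io"])]
    print("\n".join(ssl_bump_rules(example)))
```

The `covers` check requires the dot boundary, so a lookalike such as notdocker.com is not treated as covered by .docker.com.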
