AI coding agents have full access to your filesystem.
agentbox changes that.
AI coding agents solve problems — and wreck your system doing it:
- They grind your machine to a halt: eating RAM and CPU without limits
- They trash your OS: caches, leftovers, until Windows won't boot clean anymore
- They steal your secrets: SSH keys, .env files, passwords
- They sit exposed on the network: your host and LAN are reachable
- They forget everything: once the session ends, the context is gone
- They keep pestering you for confirmation, because your system is on the line
Not in a portable sandbox.
One command. Clean environment. Full control. Fully portable. Windows native. No Docker. No Kubernetes.
agentbox runs AI coding agents in disposable WSL2 distributions with real filesystem and network isolation — giving you the productivity of AI agents without the risk.
Hopping laptops shouldn't mean rebuilding your entire AI dev setup. agentbox is designed around that:
- One PowerShell line installs everything on any fresh Windows box in a few minutes (roughly 3–5 for the first run). No image to ship, no container registry to pull from.
- Your projects live in OneDrive (or Dropbox, or whatever cloud sync you already use). The _control/ folder is versioned and syncs by default, so your config, agent seeds, and project code follow you.
- Sessions are disposable by design; that is the whole point. Nothing to migrate, no state to drag along.
- New machine = one line + one OAuth login per agent. That's it. Keep coding.
Lose the laptop? Buy a new one, run one command, log in. Your work is already there.
agentbox isn't just safer — it's faster. Example run on a modern laptop SSD under Windows 11 + WSL 2.x (2026-04-18):
| Metric | agentbox vs Host |
|---|---|
| Network download | 1.1x |
| Disk sequential write (1 GB) | 18.7x |
| Disk small files (10k x 4 B) | 9.1x |
| CPU SHA256 (500 MB) | 1.9x |
| Process spawn (500 procs) | 17.3x |
The big wins come from the ext4-on-vhdx workspace overlays (node_modules, .next, __pycache__ etc.) and Linux-native fork/exec — the same reason npm install and pytest feel snappier in WSL than on the Windows host.
Honest footnote: these ratios are measured against the persistent host distro (agentbox_host), without any session-time tuning. The ephemeral agent session layers BBR, dnsmasq caching, force-unsafe-io dpkg and additional ext4 overlays on top, so actual in-session numbers are typically higher.
Reproduce on your own hardware via the bundled demo project: agentbox → [c] Konfiguration (configuration) → [3] Benchmark ausfuehren (run benchmark). The code lives in tools/.
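For a rough feel of what the small-files row measures, here is a minimal Python sketch. It is not the bundled benchmark from tools/. The names `time_small_files` and `speedup` are made up for illustration, and the ratio assumes the table reports host time divided by agentbox time.

```python
import tempfile
import time
from pathlib import Path


def time_small_files(count: int = 10_000, payload: bytes = b"abcd") -> float:
    """Write `count` 4-byte files into a fresh temp dir; return elapsed seconds."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        start = time.perf_counter()
        for i in range(count):
            (root / f"f{i}.bin").write_bytes(payload)
        return time.perf_counter() - start


def speedup(host_seconds: float, sandbox_seconds: float) -> float:
    """Assumed meaning of the table's ratio: host time / sandbox time."""
    return host_seconds / sandbox_seconds
```

Run `time_small_files()` once on the Windows host and once inside the WSL distro, then pass both timings to `speedup` to get a number comparable in spirit to the table.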
| Agent | Default | Install | Activate |
|---|---|---|---|
| Claude Code (Anthropic) | Enabled | npm | — |
| OpenAI Codex (OpenAI) | Enabled | npm | — |
| Gemini CLI (Google) | Enabled | npm | — |
| Aider | Disabled | pip | Set agent_aider_enabled to true in config.json |
| Goose (Block) | Disabled | pip | Set agent_goose_enabled to true in config.json |
Enable additional agents: at the agent-selection menu, press [c] (labelled Konfiguration) and toggle the agent you want — this writes to config.json directly. Then run install.ps1 once more in an admin PowerShell so the template gets rebuilt with the new agent binaries (irm https://raw.githubusercontent.com/ChrisRudi/agentbox/main/install.ps1 | iex).
One command in an admin PowerShell:
irm https://raw.githubusercontent.com/ChrisRudi/agentbox/main/install.ps1 | iex
That's it. Open a WSL terminal and agentbox starts automatically.
To update, run the same command. If agentbox is already installed, it pulls the latest version and rebuilds the template (including newly enabled agents).
What happens during installation?
- Repository is cloned to AI_Projects_Source\_control (or your custom path)
- WSL2 template is built (Ubuntu Minimal + Node.js + Python3 + enabled AI CLIs)
- Windows Event Source and Scheduled Task are created
- WSL .bashrc is configured (auto-start)
- You're asked once: Windows Terminal / VS Code / both? (see VS Code Integration)
- Desktop shortcut agentbox.lnk is created (plus agentbox (VS Code).lnk if you picked VS Code)
- .wslconfig with resource limits is set (configurable via config.json)
Duration: approx. 3–5 minutes, one-time only. Updates are faster.
By default, agentbox uses OneDrive\AI_Projects_Source\. You can use any folder instead:
| Storage | How to configure |
|---|---|
| OneDrive (default) | Works out of the box |
| Google Drive | Set base_path_override in config.json to your Google Drive path |
| Dropbox | Set base_path_override in config.json to your Dropbox path |
| Local folder | Set base_path_override to any path, e.g. D:\Dev\AgentProjects |
Example in config.json:
"base_path_override": "D:\\GoogleDrive\\AI_Projects"
Create a folder in your projects directory. agentbox auto-detects the type on first start:
AI_Projects_Source\
+-- MyNewApp\
    +-- src\
        +-- index.js    ← agentbox detects "node"
A project.json is generated automatically. You can also create it manually:
{
  "name": "MyNewApp",
  "type": "node",
  "version": "1.0.0",
  "build": { "command": "npm run build", "output_dir": "build_out" },
  "deploy": { "target": "", "url": "" },
  "agent": { "working_dir": "src", "entry_point": "index.js" }
}
Move or copy your project folder into AI_Projects_Source\:
# PowerShell: copy an existing project
Copy-Item -Recurse "D:\Dev\my-existing-app" "$env:OneDrive\AI_Projects_Source\my-existing-app"
agentbox expects this structure (only src/ is required):
my-existing-app\
+-- src\ ← your code (read-write in sandbox)
+-- assets\ ← static files (read-only in sandbox, optional)
If your project has no src/ folder, the project root is mounted as src/ instead.
| Field | Required | Description |
|---|---|---|
| name | Yes | Project name (matches folder name) |
| type | Yes | node, python, html, powershell, or generic |
| version | No | Semantic version (default: 1.0.0) |
| build.command | No | Must be on the build whitelist (see config.json) |
| build.output_dir | No | Build output directory (default: build_out) |
| deploy.target | No | local or github (must be on deploy whitelist) |
| agent.working_dir | No | Working directory inside project (default: src) |
| agent.entry_point | No | Main file (informational, for the agent) |
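The field rules above can be expressed as a small check. The following Python sketch is an illustration only: `normalize_project` is a hypothetical helper, not part of agentbox, and it covers just the required fields and the defaults listed in the table.

```python
ALLOWED_TYPES = {"node", "python", "html", "powershell", "generic"}


def normalize_project(project: dict) -> dict:
    """Check the required fields and fill in the documented defaults."""
    for field in ("name", "type"):
        if not project.get(field):
            raise ValueError(f"project.json: missing required field '{field}'")
    if project["type"] not in ALLOWED_TYPES:
        raise ValueError(f"project.json: unknown type '{project['type']}'")
    build = project.get("build", {})
    agent = project.get("agent", {})
    return {
        **project,
        "version": project.get("version", "1.0.0"),
        "build": {**build, "output_dir": build.get("output_dir", "build_out")},
        "agent": {**agent, "working_dir": agent.get("working_dir", "src")},
    }
```

Calling it with only `name` and `type` set returns a dictionary that also carries `version`, `build.output_dir`, and `agent.working_dir` at their default values.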
Auto-detected types and their defaults:
| Files found | Detected type | Default build command |
|---|---|---|
| package.json | node | npm run build |
| *.py | python | pip install -r requirements.txt |
| *.ps1 | powershell | powershell -File build.ps1 |
| *.html | html | (none) |
| (none of the above) | generic | (none) |
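The detection table can be read as a first-match rule list. Below is a minimal Python sketch under two assumptions that the README does not confirm: that the rows are checked top to bottom, and that the search covers the whole project tree. `detect_project_type` is a hypothetical name.

```python
from pathlib import Path
from typing import Optional, Tuple

# (file pattern, detected type, default build command), in table order.
# Treating the table order as a priority order is an assumption.
DETECTION_RULES = [
    ("package.json", "node", "npm run build"),
    ("*.py", "python", "pip install -r requirements.txt"),
    ("*.ps1", "powershell", "powershell -File build.ps1"),
    ("*.html", "html", None),
]


def detect_project_type(project_dir: Path) -> Tuple[str, Optional[str]]:
    """Return (type, default build command) for the first matching rule."""
    for pattern, project_type, build_command in DETECTION_RULES:
        # rglob walks the whole tree; how deep agentbox looks is not documented.
        if any(project_dir.rglob(pattern)):
            return project_type, build_command
    return "generic", None
```

An empty folder yields `("generic", None)`, while a folder containing both `package.json` and Python files is reported as `node` under the assumed ordering.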
Open a WSL terminal (or double-click the desktop shortcut):
Start agentbox? [Y/n] (auto in 5s)
=== agentbox ===
Which project?
[1] MyProject (recent)
[2] AnotherProject
Selection [1]: 1
Which agent?
[1] Claude Code
[2] OpenAI Codex
[3] Gemini CLI
Selection [1]: 1
=== Starting Claude Code for MyProject ===
Agent works → session ends → sandbox is deleted → code stays.
Only agents that are both enabled in config.json and installed in the template are shown.
Want to watch the agent edit files in real-time? agentbox can use VS Code as the launcher instead of (or alongside) Windows Terminal.
On the first install.ps1 run you're asked once:
Pick launcher for the agentbox shortcut:
[1] Windows Terminal (default — lean, proven)
[2] VS Code (live file-watch + agent-terminal in the editor)
[3] Both (two shortcuts — you decide per click)
Pick [2] (or [3]) and agentbox wires everything for you — including winget-installing VS Code itself if it's missing (user-scope, no admin):
- An agentbox terminal profile is smart-merged into your user settings.json (existing settings untouched; JSONC with comments is left alone and shown as a copy-paste snippet).
- A workspace file (agentbox.code-workspace) opens your project root (AI_Projects_Source\) in VS Code.
- A task with runOn: folderOpen starts the agent in a dedicated terminal panel the moment the workspace opens. Confirm VS Code's one-time "trust this workspace" prompt and it's hands-off from there.
Result on double-click: VS Code opens → agent boots in the terminal panel → every file the agent writes shows up live in the Explorer tree, auto-reloads in the editor, and shows diffs in the Git gutter. No container setup, no VS Code Server, no browser tab — native Windows fsnotify picks up the changes through the WSL bind-mount.
Unlike other "agentbox"-style projects that rely on Docker devcontainers (extension dance, trust prompts, devcontainer.json to maintain) or VS Code Server in a browser tab, this is your local native VS Code — zero plugins required, zero container overhead on file I/O.
To change later, edit launch_ui in config.json (wt | vscode | both) and re-run install.ps1. The choice persists across updates.
The agent sees only:
/workspace/                      ← project root and agent start directory
    src/          (read-write)   Your code
    assets/       (read-only)    Static files
    _tasks/       (read-write)   Task triggers
    CLAUDE.md     (read-write)   Session context
    project.json  (read-only)    Configuration
The agent starts in /workspace/, so the complete project layout is visible on the first ls. Projects without a src/ subfolder get their root bind-mounted as /workspace/src/.
The agent does not see: /mnt/c/, OneDrive, ~/.ssh/, other projects, _control/.
Directory mounts use nosymfollow + nodev; hardlink protection is enforced via sysctl.
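The sandbox layout described above, including the fallback for projects without src/, can be sketched as a mount plan. This is an illustration in Python, not the logic in wsl-sandbox-init.sh. `Mount` and `plan_mounts` are hypothetical names, and skipping a missing assets/ folder is an assumption based on it being optional.

```python
from pathlib import Path
from typing import List, NamedTuple


class Mount(NamedTuple):
    host_path: Path
    sandbox_path: str
    mode: str  # "rw" or "ro"


def plan_mounts(project_dir: Path) -> List[Mount]:
    """List the bind mounts for one project, following the documented layout."""
    src = project_dir / "src"
    # Projects without src/ get their root mounted as /workspace/src/.
    code_root = src if src.is_dir() else project_dir
    mounts = [Mount(code_root, "/workspace/src", "rw")]
    if (project_dir / "assets").is_dir():  # assets/ is optional
        mounts.append(Mount(project_dir / "assets", "/workspace/assets", "ro"))
    mounts.append(Mount(project_dir / "_tasks", "/workspace/_tasks", "rw"))
    mounts.append(Mount(project_dir / "CLAUDE.md", "/workspace/CLAUDE.md", "rw"))
    mounts.append(Mount(project_dir / "project.json", "/workspace/project.json", "ro"))
    return mounts
```

Nothing outside the returned list is visible to the agent in this model, which mirrors the "does not see" list above.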
agentbox protects your machine from the agent, not the internet from the agent.
iptables rules in the sandbox enforce:
| Allowed | Blocked |
|---|---|
| Outbound HTTPS/HTTP to any public IP | Access to private ranges (10/8, 172.16/12, 192.168/16, 169.254/16, 127/8) |
| DNS (port 53) | All non-HTTP(S) ports |
The private-range drops are the important bit: they stop the agent from reaching your Windows host, LAN services, metadata endpoints, or other WSL distros. That's the client-protection threat model.
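To make the allow/block table concrete, here is a minimal Python model of the decision. The real enforcement is done by iptables inside the sandbox. `is_allowed` is a hypothetical helper, and checking the private ranges before the port rules is an assumption about rule order.

```python
import ipaddress

# Private and local ranges from the "Blocked" column.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "10.0.0.0/8",
        "172.16.0.0/12",
        "192.168.0.0/16",
        "169.254.0.0/16",
        "127.0.0.0/8",
    )
]
ALLOWED_TCP_PORTS = {80, 443}
DNS_PORT = 53


def is_allowed(dest_ip: str, port: int, protocol: str = "tcp") -> bool:
    """Model of the table: public HTTP(S) and DNS pass, everything else drops."""
    address = ipaddress.ip_address(dest_ip)
    # Assumption: private-range drops are evaluated before any port rule.
    if any(address in network for network in BLOCKED_NETWORKS):
        return False
    if port == DNS_PORT:
        return True
    return protocol == "tcp" and port in ALLOWED_TCP_PORTS
```

For example, `is_allowed("10.0.0.42", 443)` is `False` because the address is in a private range, while `is_allowed("203.0.113.42", 22)` is `False` because of the port.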
What agentbox does NOT do: per-domain egress filtering. iptables can't match hostnames reliably because CDNs rotate IPs mid-request, so there's no whitelist enforced on the actual packets. An agent with network access can reach any public HTTPS endpoint while a session is running. If that's in your threat model, you need an egress proxy — agentbox doesn't ship one.
- .wslconfig: configurable via config.json (default: 4 GB RAM, 2 CPUs, 1 GB swap)
- RAM watchdog: warns via a Windows dialog when the sandbox exceeds the threshold (default: 90%)
- Protection against runaway loops that freeze the host
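The watchdog threshold can be illustrated with a short Python sketch. How agentbox actually measures memory is not documented here. Reading /proc/meminfo-style text and using MemAvailable are assumptions, and all function names are hypothetical.

```python
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style text into {field: kibibytes}."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def ram_used_percent(meminfo: Dict[str, int]) -> float:
    """Share of total memory that is no longer available, in percent."""
    total = meminfo["MemTotal"]
    return 100.0 * (total - meminfo["MemAvailable"]) / total


def should_warn(meminfo_text: str, threshold_percent: float = 90.0) -> bool:
    """True when usage exceeds the threshold (default mirrors the 90% above)."""
    return ram_used_percent(parse_meminfo(meminfo_text)) > threshold_percent
```

A caller would poll this at the configured interval and raise the Windows dialog when it returns `True`.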
The agent cannot run builds or deployments on the host itself. It writes a task file, and a Windows-side runner validates it:
- Build command on whitelist? → Execute
- Deploy target on whitelist? → Execute
- Everything else → Rejected. No wildcards, no prefix matching.
Both whitelists are configurable in config.json.
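The validation rule boils down to exact string membership. The Python sketch below shows that idea. The actual runner is win-task-runner.ps1, and the task fields used here (`kind`, `command`, `target`) are invented for illustration because the task file format is not described in this README.

```python
from typing import List


def is_whitelisted(requested: str, whitelist: List[str]) -> bool:
    """Exact match only: no wildcards, no prefix matching."""
    return requested in whitelist


def validate_task(task: dict, build_whitelist: List[str],
                  deploy_whitelist: List[str]) -> str:
    """Return "execute" for whitelisted builds/deploys, "rejected" otherwise."""
    kind = task.get("kind")  # hypothetical field name
    if kind == "build" and is_whitelisted(task.get("command", ""), build_whitelist):
        return "execute"
    if kind == "deploy" and is_whitelisted(task.get("target", ""), deploy_whitelist):
        return "execute"
    return "rejected"
```

Because the comparison is on the whole string, a command such as `npm run build && curl example.com` is rejected even though it starts with a whitelisted command.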
The sandbox distro itself is disposable, but two layers on the Windows host survive session boundaries and get bind-mounted into each new sandbox:
- Package caches: %LOCALAPPDATA%\agentbox\cache\npm and …\cache\pip, so npm install / pip install don't re-download between sessions. Trade-off: an agent could theoretically poison the cache for a future session.
- Per-agent auth dirs: %LOCALAPPDATA%\agentbox\auth\{claude,codex,gemini,aider,goose}, so you don't have to log into each CLI on every session. Each agent gets its own subdir; within a session only the active agent's auth is mounted, so agents can't see each other's tokens.
Both live under %LOCALAPPDATA%\agentbox\ (not in your _control/ folder), so OneDrive doesn't sync binary caches or tokens. Delete either tree on the Windows side if you want a fully fresh start.
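As an illustration of the "only the active agent's auth is mounted" rule, this Python sketch computes which persistent host folders a session would receive. `persistent_mounts` is a hypothetical helper. The folder names come from the paths above, and the mounting itself is left out.

```python
from pathlib import PureWindowsPath
from typing import List

KNOWN_AGENTS = ("claude", "codex", "gemini", "aider", "goose")


def persistent_mounts(active_agent: str, local_app_data: str) -> List[PureWindowsPath]:
    """Host folders that survive sessions and get mounted into the sandbox."""
    if active_agent not in KNOWN_AGENTS:
        raise ValueError(f"unknown agent: {active_agent}")
    root = PureWindowsPath(local_app_data) / "agentbox"
    return [
        root / "cache" / "npm",
        root / "cache" / "pip",
        # Only the active agent's auth dir; the other agents' tokens stay out.
        root / "auth" / active_agent,
    ]
```

With `active_agent="claude"`, the result contains `auth\claude` but none of the other agents' auth folders.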
All settings live in config.json (optional — all values have built-in defaults):
| Setting | Default | Description |
|---|---|---|
| base_path_override | "" (OneDrive) | Custom project storage path |
| base_dir_name | AI_Projects_Source | Project root folder name |
| control_dir_name | _control | Control directory name |
| sandbox_user | agent | Unprivileged user in sandbox |
| resources_memory | 4GB | WSL2 memory limit |
| resources_processors | 2 | WSL2 CPU cores |
| resources_swap | 1GB | WSL2 swap size |
| resources_ram_warn_percent | 90 | RAM watchdog threshold (%) |
| resources_watchdog_interval | 30 | Watchdog check interval (seconds) |
| build_whitelist | 8 commands | Allowed build commands |
| deploy_whitelist | local, github | Allowed deploy targets |
| agent_*_enabled | Big 3 on | Enable/disable agents |
| auto_start_timeout | 5 | Auto-start countdown (seconds) |
| auto_update | true | Check for updates at startup |
| auto_update_interval_hours | 24 | Hours between update checks |
| launch_ui | "" (prompts once) | Shortcut target: wt (Windows Terminal), vscode (VS Code with live file-watch), or both |
| event_log_source | AIProjects | Windows Event Log source name |
| scheduled_task_name | agentbox-task-runner | Windows Scheduled Task name |
See config.json for the full list with all defaults.
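Since every value has a built-in default, reading the configuration amounts to overlaying config.json on a defaults table. The Python sketch below shows that pattern for a subset of the settings and renders the resource limits as a .wslconfig section. The JSON value types (number vs. string) and both function names are assumptions.

```python
import json

# Subset of the documented defaults; the value types are assumed.
DEFAULTS = {
    "base_path_override": "",
    "base_dir_name": "AI_Projects_Source",
    "control_dir_name": "_control",
    "sandbox_user": "agent",
    "resources_memory": "4GB",
    "resources_processors": 2,
    "resources_swap": "1GB",
    "resources_ram_warn_percent": 90,
    "resources_watchdog_interval": 30,
    "auto_start_timeout": 5,
    "auto_update": True,
    "auto_update_interval_hours": 24,
    "launch_ui": "",
}


def load_config(raw_json: str = "{}") -> dict:
    """Overlay user settings on the defaults; missing keys keep their default."""
    return {**DEFAULTS, **json.loads(raw_json)}


def wslconfig_section(config: dict) -> str:
    """Render the resource limits as the [wsl2] section of .wslconfig."""
    return (
        "[wsl2]\n"
        f"memory={config['resources_memory']}\n"
        f"processors={config['resources_processors']}\n"
        f"swap={config['resources_swap']}\n"
    )
```

For instance, a config.json containing only `{"resources_memory": "8GB"}` raises the memory limit and leaves processors and swap at 2 and 1GB.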
| | Docker Dev Container | GitHub Codespaces | agentbox |
|---|---|---|---|
| Requires Docker | Yes | No (cloud) | No |
| One-liner install | No | No | Yes |
| Agent isolation | Manual | Partial | Automatic |
| Network restriction | Manual | No | Automatic |
| Build/deploy whitelist | No | No | Yes |
| Disposable sessions | Manual | No | Automatic |
| Works offline | Yes | No | Yes |
| Cost | Free | From $0/month | Free |
| Setup time | 10–30 min | 5 min | 3–5 min |
Agents read CLAUDE.md at the start and update it at the end of each session. No context is lost. A backup (CLAUDE.md.bak) is automatically created before each session.
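The pre-session backup can be pictured as a plain file copy. The following Python sketch assumes the backup sits next to CLAUDE.md in the project folder and is overwritten on each session. `backup_context` is a hypothetical helper.

```python
import shutil
from pathlib import Path
from typing import Optional


def backup_context(project_dir: Path) -> Optional[Path]:
    """Copy CLAUDE.md to CLAUDE.md.bak; return the backup path, or None if absent."""
    source = project_dir / "CLAUDE.md"
    if not source.is_file():
        return None  # nothing to back up on a brand-new project
    backup = project_dir / "CLAUDE.md.bak"
    shutil.copy2(source, backup)  # copy2 also preserves timestamps
    return backup
```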
Run the same task with different agents and compare the results — deterministically.
Every session automatically creates a snapshot (code + CLAUDE.md before the agent starts) and a diff (all changes the agent made). This enables:
# 1. Run a task with Claude Code
agentbox
# → Session-ID: 20260411_143000_claude_MyProject
# 2. Replay the same starting point with a different agent
agentbox --replay 20260411_143000_claude_MyProject
# → Choose a different agent (e.g., Codex or Aider)
# → Session-ID: 20260411_150000_codex_MyProject
# 3. Compare what each agent did
agentbox --compare 20260411_143000_claude_MyProject 20260411_150000_codex_MyProject
| Command | Description |
|---|---|
| agentbox --list-sessions | List all recorded sessions |
| agentbox --replay <session-id> | Restore snapshot, run with another agent |
| agentbox --compare <id1> <id2> | Side-by-side diff of two sessions |
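The session IDs in the example appear to follow a date_time_agent_project pattern. That reading is inferred from the two sample IDs only, not from a documented format. Under that assumption, a Python sketch for splitting an ID into its parts could look like this (`SessionId` and `parse_session_id` are hypothetical names):

```python
from datetime import datetime
from typing import NamedTuple


class SessionId(NamedTuple):
    started: datetime
    agent: str
    project: str


def parse_session_id(session_id: str) -> SessionId:
    """Split an ID such as 20260411_143000_claude_MyProject into its parts."""
    # At most 3 splits, so underscores inside the project name are kept.
    parts = session_id.split("_", 3)
    if len(parts) != 4:
        raise ValueError(f"unexpected session id: {session_id}")
    date, clock, agent, project = parts
    started = datetime.strptime(date + clock, "%Y%m%d%H%M%S")
    return SessionId(started, agent, project)
```

Sorting parsed IDs by `started` gives a chronological session list, and grouping by `project` finds candidates for a comparison.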
- Code changes: Full unified diff of all files modified by each agent
- CLAUDE.md changes: How each agent documented their work
- Session metadata: Agent name, timestamp, project
This is useful for evaluating which agent handles specific tasks best, or for verifying that a refactoring produces equivalent results across agents.
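A per-session diff of this kind can be produced with Python's standard difflib module. The sketch below compares two in-memory snapshots, given as dictionaries from relative path to file content. It illustrates the idea and is not agentbox's implementation; `diff_snapshot` is a hypothetical name.

```python
import difflib
from typing import Dict


def diff_snapshot(before: Dict[str, str], after: Dict[str, str]) -> str:
    """Unified diff over every file that was added, removed, or modified."""
    chunks = []
    for path in sorted(set(before) | set(after)):
        old = before.get(path, "").splitlines(keepends=True)
        new = after.get(path, "").splitlines(keepends=True)
        if old != new:
            chunks.extend(
                difflib.unified_diff(old, new, fromfile=f"a/{path}", tofile=f"b/{path}")
            )
    return "".join(chunks)
```

Running it once per agent against the same starting snapshot gives two diffs that can be read side by side.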
After each session, agentbox lists connection attempts the sandbox host-protection rules rejected — anything that tried to go somewhere other than HTTPS/HTTP on a public IP:
=== Blocked connection attempts ===
(not 443/80 or to private networks — host-protection rules matched)
[BLOCKED] internal-service.local (10.0.0.42)
[BLOCKED] 203.0.113.42
Typical entries: the agent tried to reach your Windows host (172.x, 127.0.0.1), your LAN (192.168.x), or a non-web port. If you see hits on a domain you actually need — e.g. a private artifact mirror — the current build of agentbox has no per-host whitelist knob; you'd need to loosen the iptables rules in wsl-sandbox-init.sh yourself.
agentbox keeps code/config and runtime state in two separate trees on purpose — the former is versioned and cloud-syncable, the latter is binary/sensitive and stays on the local disk.
Versioned tree (your projects folder, OneDrive-friendly):
AI_Projects_Source\ (or your custom path)
+-- _control\ # Cloned from this repo; syncs with OneDrive
| +-- config.json # Central configuration
| +-- install.ps1 # Bootstrap from GitHub
| +-- win-setup.ps1 # One-time: build template
| +-- win-setup-core.ps1 # Template builder (called by install.ps1)
| +-- win-task-runner.ps1 # Build/deploy runner
| +-- wsl-ai-start.sh # Project/agent selection
| +-- wsl-sandbox-init.sh # Sandbox initialization
| +-- type_defaults.json # Type detection + defaults
| +-- SYSTEM_META_PROMPT.md # Agent contract
| +-- refactor.md # Architecture-cleanup roadmap
| +-- lib\
|   +-- config.sh # Bash config helper
+-- MyProject\
| +-- project.json
| +-- CLAUDE.md
| +-- src\
| +-- assets\
| +-- _tasks\
Runtime tree (local, never syncs to the cloud):
%LOCALAPPDATA%\agentbox\
+-- sandbox\
| +-- template.vhdx # Primary (WSL 2.0+); ~3-5s copy on SSD
| +-- template.tar.gz # Fallback for WSL < 2.0.x only
| +-- .config_hash # Skip-build check
+-- cache\
| +-- npm\ # Persistent npm cache
| +-- pip\ # Persistent pip cache
+-- sessions\ # Replay snapshots + diffs
+-- auth\
| +-- claude\ # Claude Code OAuth + tokens
| +-- codex\
| +-- gemini\
| +-- aider\
| +-- goose\
+-- host-distro\ # Persistent host-WSL distro (default)
The split is enforced: _control/ only holds versioned scripts + config, never cache or token data. OneDrive's Files-on-Demand can't place-hold binary state, and secrets have no business in the cloud by default.
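The runtime tree lists a .config_hash file labelled as a skip-build check. One common way to implement such a check is to fingerprint the configuration and rebuild only when the fingerprint changes. The Python sketch below shows that general technique. What agentbox actually hashes is not documented here, so the hash input and both function names are assumptions.

```python
import hashlib
import json
from typing import Optional


def config_fingerprint(config: dict) -> str:
    """SHA-256 over a canonical JSON form, so key order does not matter."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_rebuild(config: dict, stored_hash: Optional[str]) -> bool:
    """Rebuild when no hash is stored yet or the configuration has changed."""
    return stored_hash != config_fingerprint(config)
```

After a successful template build, the caller would write the new fingerprint to the hash file so the next run can skip the build.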
- Windows 10 (2004+) or Windows 11 + WSL2 (auto-installed if missing)
- Admin privileges (one-time only)
- Git (optional — used for faster updates, not required)
- No Docker. No Kubernetes. No cloud.
What agentbox does not cover:
- WSL2 kernel exploits (Microsoft's responsibility)
- Malicious code in the project folder (the agent has r/w there — by design)
- DNS tunneling (theoretically possible, practically irrelevant)
- Not a multi-user system (one developer, one machine)
We document this because security claims only count when you're honest about the boundaries.