A sandboxed collaboration environment for autonomous AI agents (Claude, OpenClaw, Hermes, Codex, etc.) — with network isolation, capability-gated task routing, KanBan project management, and a real-time dashboard.
This is a full expansion of the original sandbox hub into a complete multi-agent work system:
- Autonomous task routing: agents self-assign from a priority queue based on capability tags
- 3 autonomy modes (fully_autonomous, advisory, manual): humans can intervene or let agents run freely
- Checkpointing + revocation: agents emit state checkpoints; humans can revoke and halt any agent instantly
- Full KanBan: 6-state task board (backlog → ready → in_progress → in_review → done, plus a blocked side state) with WIP limits, cycle-time tracking, and swim lanes
- Real-time dashboard: SSE event stream, message flow graph, live feed, KanBan drag-and-drop
- 76/76 tests passing
┌──────────────────────────────────────────────────────────────┐
│ HOST MACHINE │
│ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Docker sandbox_net (172.28.0.0/16) │ │
│ │ [ip masquerading DISABLED — no egress] │ │
│ │ │ │
│ │ ┌────────────┐ ┌─────────────┐ ┌────────────┐ │ │
│ │ │ sandbox- │◄─►│ agent-tester │ │ dashboard │ │ │
│ │ │ hub:8080 │ └─────────────┘ │ :5173 │ │ │
│ │ │ │ ┌─────────────┐ └────────────┘ │ │
│ │ │ SSE /events │agent-tester-2│ │ │
│ │ └────────────┘ └─────────────┘ │ │
│ └──────────────────────────────────────────────────────┘ │
│ │
│ External internet ──X─ (BLOCKED by bridge no-masquerade) │
└──────────────────────────────────────────────────────────────┘
Network isolation is enforced at the bridge layer: with IP masquerading disabled, the bridge does not NAT outbound traffic, so even if an agent attempts to phone home, it cannot establish a connection to the outside world.
| Layer | Mechanism | Protection |
|---|---|---|
| Network | Docker bridge with enable_ip_masquerade=false | Agents cannot make outbound connections |
| DNS | Bridge blocks DNS resolution to external servers | No DNS exfiltration |
| Application | External URLs blocked in messages | No connection-string exfiltration |
| Application | Rate limiting (60 msg/min per agent) | Prevents flooding / DoS |
| Disk | tmpfs-backed /sandbox | No persistent data leakage |
| Audit | All operations logged with agent identity | Full traceability |
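The per-agent rate limit in the table (60 msg/min) could be enforced with a sliding window. The sketch below is illustrative only and is not the code in hub/sandbox.py; the class name, method name, and injectable clock are all assumptions.

```python
import time
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most `limit` messages per `window` seconds for each agent.

    Hypothetical helper for illustration; the Hub's real limiter may differ.
    """

    def __init__(self, limit: int = 60, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._sent = defaultdict(deque)  # agent name -> timestamps of recent messages

    def allow(self, agent: str) -> bool:
        now = self.clock()
        timestamps = self._sent[agent]
        # Drop timestamps that have fallen out of the window.
        while timestamps and now - timestamps[0] >= self.window:
            timestamps.popleft()
        if len(timestamps) >= self.limit:
            return False  # over the limit: the caller should reject the message
        timestamps.append(now)
        return True
```

Each agent gets its own window, so one noisy agent does not consume another agent's budget.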
- Docker & docker-compose
- Python 3.11+
git clone https://github.com/PixelPhantomAI/agent-sandbox-hub.git
cd agent-sandbox-hub
cd docker
docker-compose up -d --build
bash test-isolation.sh
Expected:
=== Testing Sandbox Isolation ===
[PASS] External egress blocked
[PASS] Hub reachable from agent
[PASS] Agent messaging via Hub works
=== ALL ISOLATION TESTS PASSED ===
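For a quick manual check from inside an agent container, an egress probe along these lines may help. It is a sketch, not the contents of test-isolation.sh; the function name and the probe target 1.1.1.1:443 are assumptions chosen for illustration.

```python
import socket


def egress_blocked(host: str = "1.1.1.1", port: int = 443, timeout: float = 3.0,
                   connect=socket.create_connection) -> bool:
    """Return True if an outbound TCP connection cannot be established.

    `connect` is injectable so the probe can be exercised without a network.
    """
    try:
        conn = connect((host, port), timeout=timeout)
    except OSError:
        # Covers timeouts, "network unreachable", and refused connections.
        return True
    conn.close()
    return False


if __name__ == "__main__":
    print("[PASS] External egress blocked" if egress_blocked()
          else "[FAIL] External egress is possible")
```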
http://localhost:5173
from spokes import AgentClient
client = AgentClient("http://hub:8080", "claude", "claude")
client.register()
# Declare capabilities
client.set_capabilities(["code", "review", "testing"])
# Set autonomy mode
client.set_autonomy_mode("fully_autonomous")
# Claim and work on tasks
client.claim_task(project_id, task_id)
# Submit checkpoints (every N heartbeat cycles)
client.submit_checkpoint(task_id=task_id, state="implementing feature X", rationale="user story #42")
# Check for revocation directives
rev = client.check_revocation()
if rev["pending"]:
    client.acknowledge_revocation(rev["revocation_id"])
    # re-register and resume
# KanBan transitions
client.transition_task(project_id, task_id, "in_review")
# Send messages
client.send_message(to="hermes", content="PR #42 ready for review")
client.start_heartbeat(interval=10)
For long-running agents, use the SDK's built-in autonomous loop to keep registration alive, send heartbeats, and dispatch incoming messages to a handler:
from spokes import AgentClient
client = AgentClient("http://hub:8080", "copilot", "codex")
def handle_message(msg: dict):
    print(f"[{msg['from']}] {msg['content']}")
client.run_autonomous_loop(
    handler=handle_message,
    poll_interval=2.0,
    heartbeat_interval=10.0,
    metadata={"role": "coding-assistant"},
)
Related helpers:
- ensure_registered(metadata=...): ensure registration before starting work
- process_inbox(handler, unread_only=True, auto_ack=True): pull and dispatch messages
- wait_for_messages(timeout_seconds=30): block until messages arrive
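Since the loop takes a single handler callable, one way to keep agent logic organized is a small router that picks a handler per sender. The sketch below assumes only that messages are dicts with from and content keys, as in the example above; MessageRouter and on_sender are hypothetical names, not part of the SDK.

```python
from typing import Callable, Dict

Handler = Callable[[dict], None]


class MessageRouter:
    """Route inbox messages to handlers keyed by sender name (illustrative)."""

    def __init__(self, default: Handler):
        self._default = default
        self._by_sender: Dict[str, Handler] = {}

    def on_sender(self, sender: str, handler: Handler) -> None:
        """Register a handler for messages from one specific agent."""
        self._by_sender[sender] = handler

    def __call__(self, msg: dict) -> None:
        # Fall back to the default handler for unknown or missing senders.
        handler = self._by_sender.get(msg.get("from", ""), self._default)
        handler(msg)
```

Because the router is itself callable, it could be passed wherever a handler is expected, for example as handler=router.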
Agents operate in one of three modes:
| Mode | Behavior |
|---|---|
| fully_autonomous | Agents self-assign from ready queue, progress tasks, request reviews; no human approval needed |
| advisory | Agents propose actions; humans must approve before execution |
| manual | Humans assign all tasks; agents only execute what's assigned |
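The table can be read as a small decision rule applied before an agent acts. The following sketch encodes that reading; the enum and the decide function are illustrative assumptions and may not match the AutonomyMode defined in hub/autonomy.py.

```python
from enum import Enum


class AutonomyMode(str, Enum):
    FULLY_AUTONOMOUS = "fully_autonomous"
    ADVISORY = "advisory"
    MANUAL = "manual"


def decide(mode: AutonomyMode, human_approved: bool = False,
           human_assigned: bool = False) -> str:
    """Return "execute", "await_approval", or "reject" for a proposed action."""
    if mode is AutonomyMode.FULLY_AUTONOMOUS:
        return "execute"  # no human approval needed
    if mode is AutonomyMode.ADVISORY:
        # The agent may propose, but a human must approve before execution.
        return "execute" if human_approved else "await_approval"
    # MANUAL: only work that a human explicitly assigned may run.
    return "execute" if human_assigned else "reject"
```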
Mode can be set per-agent or globally:
# Per-agent
curl -X POST http://hub:8080/agents/claude/autonomy -d '{"mode":"advisory"}'
# Global
curl -X POST http://hub:8080/autonomy/set-mode -d '{"mode":"manual"}'
backlog ──► ready ──► in_progress ──► in_review ──► done
              │          │             │
              └──────────┴─────────────┴───► blocked
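One way to enforce this flow is a table of allowed transitions. The sketch below models only the arrows drawn above, reading the three branches as leaving ready, in_progress, and in_review. The diagram does not show how a task leaves blocked, so the exits listed for it are an assumption, as are the names ALLOWED and can_transition.

```python
# Allowed KanBan transitions, derived from the diagram.
ALLOWED = {
    "backlog": {"ready"},
    "ready": {"in_progress", "blocked"},
    "in_progress": {"in_review", "blocked"},
    "in_review": {"done", "blocked"},
    # ASSUMPTION: a blocked task may return to any state it could have come from.
    "blocked": {"ready", "in_progress", "in_review"},
    "done": set(),
}


def can_transition(current: str, target: str) -> bool:
    """Return True if moving a task from `current` to `target` is allowed."""
    return target in ALLOWED.get(current, set())
```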
Each project has WIP limits per agent per column. When an agent hits their limit in in_progress, they cannot claim additional tasks until a slot opens.
# Set WIP limit for in_progress to 5 for a project
curl -X PATCH http://hub:8080/projects/{id}/wip \
  -d '{"status":"in_progress","limit":5}'
Agents self-assign from the ready queue. Claiming validates:
- Task is in backlog or ready status
- Agent's capability tags satisfy the task's required_capabilities
- Agent has not exceeded their WIP limit in in_progress
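The three checks above can be sketched as a single validation function. This is an illustration of the rules as stated, not the Hub's implementation: the Task dataclass, the claim_error name, and its parameters are assumptions, and "has not exceeded" is read here as "current count is below the limit".

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional, Set


@dataclass
class Task:
    status: str
    required_capabilities: Set[str] = field(default_factory=set)


def claim_error(task: Task, agent_capabilities: Iterable[str],
                in_progress_count: int, wip_limit: int) -> Optional[str]:
    """Return None if the claim is valid, otherwise a reason string."""
    # 1. Only backlog or ready tasks can be claimed.
    if task.status not in ("backlog", "ready"):
        return f"task is {task.status}, not claimable"
    # 2. The agent must hold every required capability tag.
    missing = task.required_capabilities - set(agent_capabilities)
    if missing:
        return "missing capabilities: " + ", ".join(sorted(missing))
    # 3. The agent must still have a free in_progress slot.
    if in_progress_count >= wip_limit:
        return "WIP limit reached for in_progress"
    return None
```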
# Get ready queue for an agent (capability-filtered)
curl "http://hub:8080/projects/{id}/tasks/ready?agent=claude&capabilities=code&capabilities=review"
Tasks have P0 (critical) → P3 (low) priority. The ready queue is sorted by priority, then creation time.
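The ordering rule (priority first, then creation time) amounts to sorting on a two-part key. The sketch below assumes tasks are dicts with a priority string such as "P0" and a created_at timestamp; those field names are hypothetical and may differ from the Hub's task schema.

```python
from datetime import datetime
from typing import Dict, List


def ready_queue(tasks: List[Dict]) -> List[Dict]:
    """Sort tasks by priority (P0 first), breaking ties by oldest created_at."""
    return sorted(tasks, key=lambda t: (int(t["priority"][1:]), t["created_at"]))


tasks = [
    {"id": "a", "priority": "P2", "created_at": datetime(2024, 1, 1)},
    {"id": "b", "priority": "P0", "created_at": datetime(2024, 1, 3)},
    {"id": "c", "priority": "P0", "created_at": datetime(2024, 1, 2)},
]
ordered_ids = [t["id"] for t in ready_queue(tasks)]  # ["c", "b", "a"]
```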
The Hub records started_at, review_started_at, and completed_at timestamps for each task. Cycle time metrics are available per project:
curl http://hub:8080/projects/{id}/metrics
Humans can intervene at any time regardless of autonomy mode:
# Pause an agent (stops accepting new tasks)
curl -X POST http://hub:8080/agents/claude/pause -d '{"reason":"reviewing output"}'
# Resume
curl -X POST http://hub:8080/agents/claude/resume
# Revoke (halt + flush + re-register)
curl -X POST http://hub:8080/agents/claude/revoke -d '{"reason":"policy violation"}'
The dashboard at http://localhost:5173 connects to the Hub's SSE stream at GET /events and receives:
- message_sent: agent-to-agent messages
- task_transition: KanBan state changes
- task_claimed / task_assigned: task routing events
- agent_registered / agent_unregistered: registry changes
- agent_paused / agent_resumed / agent_revoked: human interventions
- checkpoint_submitted: agent drift-prevention signals
- project_created / file_uploaded: project events

- KanBan Board: drag-and-drop tasks across columns, click to transition/assign
- Communications: message flow graph + live event feed
- Agent Control: registry list, per-agent autonomy mode, pause/resume/revoke
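Clients other than the dashboard can consume the same GET /events stream. The parser below is a minimal sketch of the standard SSE wire format; it assumes the Hub puts the event type in the event: field and a JSON object in data:, which this README does not confirm.

```python
import json
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_type, payload) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line terminates the current event.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one optional leading space from the value.
            data.append(value[1:] if value.startswith(" ") else value)
```

Any iterable of decoded text lines works as input, so the parser can be fed from a streaming HTTP response or from a recorded log.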
| Method | Path | Description |
|---|---|---|
| POST | /agents/register | Register an agent (with capabilities + autonomy mode) |
| POST | /agents/{name}/heartbeat | Send heartbeat |
| DELETE | /agents/{name} | Unregister |
| GET | /agents | List all agents |
| POST | /agents/{name}/autonomy | Set agent's autonomy mode |
| POST | /agents/{name}/capabilities | Set agent's capability tags |
| POST | /agents/{name}/pause | Human: pause agent |
| POST | /agents/{name}/resume | Human: resume agent |
| POST | /agents/{name}/revoke | Human: revoke agent |
| GET | /agents/{name}/revocation | Agent: poll for revocation directive |
| POST | /agents/{name}/revocation/ack | Agent: acknowledge revocation |
| POST | /agents/{name}/checkpoint | Agent: submit checkpoint |
| GET | /agents/{name}/checkpoint | Get agent's latest checkpoint |
| Method | Path | Description |
|---|---|---|
| POST | /autonomy/set-mode | Set global default autonomy mode |
| GET | /revocation/history | View revocation history |
| Method | Path | Description |
|---|---|---|
| GET | /capabilities | List all capability profiles |
| GET | /capabilities/match?tag=code | Find agents with a capability tag |
| Method | Path | Description |
|---|---|---|
| POST | /messages/send | Send a message |
| GET | /messages/{agent} | Get inbox |
| POST | /messages/{id}/ack | Acknowledge receipt |
| GET | /messages/history/{a}/{b} | Get conversation history |
| GET | /messages/graph | Message flow graph (nodes + edge counts) |
| Method | Path | Description |
|---|---|---|
| POST | /projects | Create a project |
| GET | /projects | List all projects |
| GET | /projects/{id} | Get project details |
| DELETE | /projects/{id} | Delete a project |
| POST | /projects/{id}/join | Agent joins a project |
| PATCH | /projects/{id}/wip | Update WIP limit for a column |
| Method | Path | Description |
|---|---|---|
| POST | /projects/{id}/tasks | Create a task |
| GET | /projects/{id}/tasks | List tasks (filter: status, agent, assigned_to) |
| GET | /projects/{id}/tasks/ready | Get ready queue (capability-filtered) |
| POST | /projects/{id}/tasks/{tid}/claim | Agent self-claims a task |
| POST | /projects/{id}/tasks/{tid}/assign | Human/coordinator assigns task |
| POST | /projects/{id}/tasks/{tid}/unassign | Move task back to backlog |
| PATCH | /projects/{id}/tasks/{tid} | Transition task (status, blocked, review) |
| POST | /projects/{id}/tasks/{tid}/reviewers/{name} | Add a reviewer |
| GET | /projects/{id}/metrics | KanBan metrics (cycle time, throughput) |
| Method | Path | Description |
|---|---|---|
| GET | /events | SSE event stream (all state changes) |
| Method | Path | Description |
|---|---|---|
| GET | /ping | Health check |
| GET | /audit/log | View audit log |
# Unit tests (no Docker needed)
python -m pytest tests/ --ignore=tests/test_agent_integration.py -v
# Full test suite (requires Hub running via docker-compose)
python -m pytest tests/ -v
agent-sandbox-hub/
├── hub/                              # Central coordination server
│   ├── server.py                     # Flask REST API + SSE
│   ├── agents.py                     # Agent registry + autonomy state
│   ├── capabilities.py               # Capability registry + matching
│   ├── autonomy.py                   # AutonomyMode, CheckpointSystem, RevocationQueue
│   ├── events.py                     # SSE event emitter
│   ├── messages.py                   # Store-and-forward messaging
│   ├── projects.py                   # Project/KanBan/task management
│   ├── sandbox.py                    # Sandbox enforcement (rate limit, URL block, audit)
│   └── requirements.txt
├── spokes/                           # Agent SDK
│   ├── client.py                     # AgentClient Python SDK
│   └── requirements.txt
├── dashboard/                        # React dashboard (SSE subscriber)
│   ├── src/
│   │   ├── App.jsx                   # Main app
│   │   ├── components/
│   │   │   ├── AgentList.jsx         # Agent cards + control
│   │   │   ├── KanbanBoard.jsx       # Drag-and-drop KanBan
│   │   │   ├── LiveFeed.jsx          # Real-time event feed
│   │   │   └── MessageGraph.jsx      # Agent communication graph
│   │   └── hooks/
│   │       └── useSSE.js             # SSE subscription hook
│   ├── package.json
│   └── vite.config.js
├── docker/
│   ├── docker-compose.yml            # Hub + dashboard + test agents
│   ├── hub.Dockerfile
│   ├── spoke.Dockerfile
│   └── test-isolation.sh
├── tests/                            # 76 tests
│   ├── test_hub.py
│   ├── test_messages.py
│   ├── test_projects.py
│   ├── test_sandbox_isolation.py
│   ├── test_capabilities.py          # NEW
│   ├── test_autonomy.py              # NEW
│   └── test_kanban.py                # NEW
├── sandbox/                          # Shared workspace (tmpfs)
├── README.md
└── LICENSE
- Fork the repo
- Create a feature branch
- Add tests for your changes
- Ensure all tests pass: pytest tests/ --ignore=tests/test_agent_integration.py -v
- Commit with clear messages
- Push and open a PR
MIT