"I'm not a bot. I'm a bee. There's a difference."
bee coding agent harness. Pure Go, single static binary, requires Go 1.26+ to build.
# install (curl | sh)
curl -fsSL https://raw.githubusercontent.com/elhenro/bee/main/install.sh | sh
# or via go install
go install github.com/elhenro/bee/cmd/bee@latest
# or build to your local bin
go build -o ~/.local/bin/bee ./cmd/bee
# use
export OPENROUTER_API_KEY=<REDACTED>
bee

I've used claude code, codex, hermes, opencode, openclaw and pi. Each nails something. None felt configurable, minimal, or flexible enough for how I work. bee is the harness that just does it.
Three gaps bee closes:
- Tiny-context friendly. Caveman-compressed system prompt, three tools, top-k memory. Scales from a 4k-context local Ollama up through small fine-tunes to million-token frontier models. Native omlx (Apple Silicon MLX server) and OpenRouter support out of the box. Shrinks itself when context gets tight.
- Skills are bee <name> subcommands. Write a markdown file, get a command. No shell shims. bee criticize plan.md just works, from any directory, in any shell.
- Skills are agent endpoints. A prompt, an external command, an MCP server, or an HTTP endpoint. All four are equally callable tools the model can invoke mid-task. Plug a personal-life agent in as a sub-agent (bundled hermes.md is a template). No IPC dance.
It just works. bee nudges models toward useful output, detects and breaks loops before they waste tokens, and handles the boring plumbing so you don't have to.
bee runs against any OpenAI-compatible local server. Confirmed working:
- omlx (Apple Silicon MLX server, localhost:8000/v1). MLX-quantized coder models, strong tool-calling, low RAM.
- Ollama (localhost:11434/v1). llama3.1:8b, qwen2.5-coder:7b, qwen3.6:35b, gemma4:12b.
- LM Studio (localhost:1234/v1).
For sub-8k-context models, switch to the tiny profile. --profile is not a CLI flag. Set it via env or ~/.bee/config.toml:
BEE_PROFILE=tiny bee run --provider omlx --model Qwen3.6-35B-A3B-4bit -- "..."
# or persist in ~/.bee/config.toml
profile = "tiny"
default_provider = "omlx"
default_model = "Qwen3.6-35B-A3B-4bit"
Three roles control how the agent thinks and acts. Cycle with shift+tab in the TUI, or pin one in ~/.bee/config.toml (role = "scout").
| Role | What it does | Reasoning budget |
|---|---|---|
| worker (default) | Full tool surface. Per-turn classifier picks read-only vs act. Small models don't reflex into shell on a greeting. | auto |
| scout | Read-only research + web. Proposes a plan, never mutates. Uses web_search/web_fetch by default. | high |
| queen | Spawns a hive: decomposes → workers execute → critic → synthesizes. | max |
Worker is the default workhorse. Scout is for "tell me what to do" before you commit. Queen for big multi-step tasks that benefit from parallel execution.
When a small model gets stuck, bee can hand the task off to a bigger model. Call bee handoff to generate a rescue brief: the original goal, a terse summary of what was tried, where it got stuck, and the last few turns verbatim. The bigger model takes over and finishes the job without re-reading the confused history.
bee run for headless, bee for TUI. Other subcommands:
$ bee criticize plan.md # one binary, every skill a subcommand
$ bee run "lint cmd/" # headless, pipeable
$ bee swarm "migrate auth to jwt" # queen + workers
$ bee fan "audit internal/ for cleanup" # parallel fan-out
$ bee zzz "tighten error messages" # overnight autonomous loop
$ bee agents # parallel detached agents
$ bee remote-control # LAN web relay for remote control

~/.bee/skills/*.md is your library. Add one, it shows up. First run seeds defaults. Edit one, it lives.
Configuration lives in ~/.bee/config.toml. It ships with sane defaults; edit it to set an API key or change models.
bee watches its own read-only tool use (read, search, glob, ls) and, when it repeats the same route, crystallizes it into a runnable shortcut called a waggle. Next time that route comes up bee replays the known steps directly instead of paying for the same round-trips again. This matters most for small local models, where re-deriving a workflow from scratch is most of the cost.
It runs automatically and stores routes under ~/.bee/waggle/. Inspect and curate the library:
bee waggle ls # routes ranked by estimated tokens saved
bee waggle gc # prune stale routes, demote ones that stopped working, promote cross-project ones
On by default for the worker, scout, and queen roles (including queen-spawned hive workers). Turn it off in ~/.bee/config.toml:
[waggle]
enabled = false
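As a rough illustration of how a repeated read-only route could be detected and crystallized, here is a minimal Go sketch. It is not bee's actual algorithm: the Call and Recorder types, the threshold parameter, and the exact-match keying are assumptions made for the example. Only the four read-only tool names come from the description above.

```go
package main

import (
	"fmt"
	"strings"
)

// Call is one tool invocation in a route (hypothetical type).
type Call struct {
	Tool string
	Arg  string
}

// readOnly lists the read-only tools named in the waggle description.
var readOnly = map[string]bool{"read": true, "search": true, "glob": true, "ls": true}

// Recorder counts how often an identical read-only route has been observed.
type Recorder struct {
	seen      map[string]int
	threshold int // assumption: repeats needed before a route is crystallized
}

func newRecorder(threshold int) *Recorder {
	return &Recorder{seen: make(map[string]int), threshold: threshold}
}

// Observe records a route and reports true exactly once, when the route
// reaches the threshold. Routes containing a non-read-only tool are ignored.
func (r *Recorder) Observe(route []Call) bool {
	if len(route) == 0 {
		return false
	}
	parts := make([]string, len(route))
	for i, c := range route {
		if !readOnly[c.Tool] {
			return false
		}
		parts[i] = c.Tool + "\x00" + c.Arg
	}
	key := strings.Join(parts, "\x01")
	r.seen[key]++
	return r.seen[key] == r.threshold
}

func main() {
	rec := newRecorder(2)
	route := []Call{{Tool: "glob", Arg: "cmd/**/*.go"}, {Tool: "read", Arg: "cmd/bee/main.go"}}
	fmt.Println(rec.Observe(route)) // false: first sighting
	fmt.Println(rec.Observe(route)) // true: second sighting reaches the threshold
}
```

Exact-match keying is the simplest choice; it would miss routes that differ only in an argument, which a real implementation might normalize.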
Enable browser support in ~/.bee/config.toml:
[browser]
enabled = true
Or pass --browser to any bee run invocation, or use the dedicated subcommand:
bee browse https://example.com
Inside a running TUI session you can toggle browser support live with /browser on and /browser off. Chrome/Chromium is auto-detected from standard install paths.
Available tools the agent can call:
| Tool | What it does |
|---|---|
| browser_open | Navigate to a URL, return the page title and accessibility snapshot |
| browser_snapshot | Re-snapshot the current page (interactive elements with [ref] labels) |
| browser_console | Drain buffered console messages |
| browser_click | Click an element by its ref from a snapshot |
| browser_type | Type text into an element by ref |
| browser_screenshot | Capture a screenshot and describe it via a local vision model (opt-in) |
The snapshot assigns a short [eN] ref to every interactive element so the agent can click or type by ref without XPath.
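The ref scheme can be illustrated with a short Go sketch. The Node type, the set of roles treated as interactive, and the output line format are assumptions for the example, not bee's real snapshot format. Only the [eN] labelling comes from the description above.

```go
package main

import "fmt"

// Node is a simplified accessibility node (hypothetical type).
type Node struct {
	Role string
	Name string
}

// interactive is an assumed set of roles that receive a ref.
var interactive = map[string]bool{"button": true, "link": true, "textbox": true}

// assignRefs renders one line per node and gives each interactive node a
// sequential ref (e1, e2, ...). It also returns a ref -> node index lookup
// that a click or type tool could use to resolve a ref.
func assignRefs(nodes []Node) ([]string, map[string]int) {
	lines := make([]string, 0, len(nodes))
	refs := make(map[string]int)
	next := 1
	for i, n := range nodes {
		if !interactive[n.Role] {
			lines = append(lines, fmt.Sprintf("%s %q", n.Role, n.Name))
			continue
		}
		ref := fmt.Sprintf("e%d", next)
		next++
		refs[ref] = i
		lines = append(lines, fmt.Sprintf("[%s] %s %q", ref, n.Role, n.Name))
	}
	return lines, refs
}

func main() {
	lines, _ := assignRefs([]Node{
		{Role: "heading", Name: "Sign in"},
		{Role: "textbox", Name: "Email"},
		{Role: "button", Name: "Continue"},
	})
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

With refs like these, a call such as browser_click only needs the short label from the latest snapshot rather than a selector.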
browser_screenshot is only registered when a vision model is configured. Point it at a local Ollama server running a vision model such as llava:
[browser.vision]
model = "llava"
endpoint = "http://localhost:11434"
Caveman is a set of token-compression rules injected into the system prompt. It is on by default. caveman = "auto" resolves per profile: full on tiny and normal, lite on large.
Force a level regardless of profile:
bee --caveman full
bee run --caveman full -- "..."
# or set caveman = "full" in ~/.bee/config.toml
Disable:
bee --caveman off
# or set caveman = "off" in ~/.bee/config.toml
Explicit value beats profile.
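The resolution rule described above fits in a small Go function. This is a sketch of the stated behaviour, not bee's source: the resolveCaveman name is hypothetical, and treating an empty setting as "auto" and defaulting unknown profiles to full are assumptions.

```go
package main

import "fmt"

// resolveCaveman returns the effective caveman level for a configured
// setting and the active profile. Hypothetical helper for illustration.
func resolveCaveman(setting, profile string) string {
	// An explicit value ("full", "lite", "off") beats the profile.
	if setting != "" && setting != "auto" {
		return setting
	}
	// "auto" (or, by assumption, an unset value) resolves per profile.
	switch profile {
	case "tiny", "normal":
		return "full"
	case "large":
		return "lite"
	default:
		return "full" // assumption: unknown profiles get full compression
	}
}

func main() {
	fmt.Println(resolveCaveman("auto", "tiny"))  // full
	fmt.Println(resolveCaveman("auto", "large")) // lite
	fmt.Println(resolveCaveman("off", "tiny"))   // off
}
```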
Hand bee an objective and a budget, walk away, wake up to a branch full of small individually-revertable commits. Each iteration makes one focused change, then either commits it or, on failure, rolls back with git reset --hard, so the working tree never stays dirty. A live TUI shows the iteration ledger (🐝 foraging · 🌼 committed · 🍃 noop · 🥀 reset · 💥 failed), token cost, and a sleeping bee at the bottom. Type to nudge the run mid-flight; /stop exits gracefully after the current iteration, /abort cancels immediately.
bee zzz "tighten error messages across internal/tools" --max-iterations 30
bee zzz --list # past runs
bee zzz --resume # pick up the most recent
bee zzz --worktree "..." # isolate in ~/.bee/zzz/worktrees/<id>
bee zzz --plain # disable the TUI (stdin steering still works)
bee zzz --max-consecutive-fails 5 # tolerance for transient agent stalls
bee zzz --hard-error-retries 5 # retries per iter on provider 5xx etc
bee zzz --notes-tail 5 # cap prior-iter sections echoed into prompts
bee zzz --gc --gc-worktrees # prune old runs and their worktrees
bee zzz --gc --gc-stale-running 168h # also reap runs stuck "running" >7d
Artifacts live in ~/.bee/zzz/runs/<id>/: notes.md per-iter summaries, events.jsonl raw timeline, meta.json run state, blocked-<iter>.patch for any iteration that emitted BLOCKED:. Inspired by gnhf.
Trust model. Commits land unsigned by default; use --sign to opt back in. Pre-commit hooks run unless --no-verify is set. Pushing (--push) is opt-in, and failures are recorded in meta.json so you can tell what reached the remote. --yes/--yolo auto-approve dangerous shell commands, so pair them with --worktree to contain the blast radius. The CLI prints a warning when --yes is used without --worktree.
Spawn many bees at once, each on its own git worktree. The overview reuses the chat input. Every submitted message starts a fresh detached agent under ~/.bee/agents/worktrees/<id> on branch agents/<short>. Rows show initial prompt, elapsed, tokens up/down, last thought, locked model. j/k/arrows navigate, l/→/enter opens an agent fullscreen (existing session view), m retries a merge.
bee agents
When an agent ends its turn with DONE: <summary> the coordinator rebases its branch onto main and fast-forwards. Conflicts post a resolution prompt back to the agent via the inbox and flip its row to needs input; auto-retry every 10s or hit m to force a retry. Unmerged worktrees stay highlighted in red until they land.
/model <name> and /provider <name> set the model used for the next spawn (sticky until changed). Mix and match across agents. Killing bee agents does not kill the running children; relaunching reconstructs the overview from ~/.bee/sessions/bg/.
Mac M3 Max (64 GB) -> omlx with an MLX-quantized coder model. Runs fast, handles small tasks reliably, doesn't choke on context. Good enough for day-to-day. Local, private, free once the hardware is paid for.
- macOS / Linux. First-class. Static binaries published for darwin/{amd64,arm64} and linux/{amd64,arm64}.
- Windows. Untested.
Caveman prompt-compression rules adapted from JuliusBrussee/caveman.
