Theo Code

Autonomous coding agent with deep code understanding

Terminal-native. Desktop-ready. Knows your codebase like you do.

What it does

Theo Code is an autonomous coding agent that reads, plans, edits, and verifies code changes inside large repositories. It packages four things into one workspace:

Code intelligence — Tree-Sitter parser (14 languages) with agentic-search retrieval: the agent reaches for code via grep, codesearch, and glob tools.
Agent runtime — state machine (Plan → Act → Observe → Reflect), sub-agent fan-out, budget enforcer, sandboxed tool execution.
Provider abstraction — 26 LLM provider specs (Anthropic, OpenAI, xAI, Mistral, Groq, Cohere, Vertex, Bedrock, Ollama, vLLM, …) sharing one streaming/retry/converter pipeline.
Surfaces — theo CLI (14 subcommands), Tauri desktop, Vite UI, and a Python benchmark harness in apps/theo-benchmark/.
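The Plan → Act → Observe → Reflect cycle named in the agent runtime bullet can be pictured as a small state machine running under a step budget. The sketch below is only an illustration: `Phase`, `Verdict`, `run`, and the `finish_after_cycles` parameter are hypothetical names, not types taken from theo-domain or theo-agent-runtime.

```rust
/// Phases of the loop. The enum and its transition rules are illustrative
/// assumptions, not types from the Theo Code crates.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Plan,
    Act,
    Observe,
    Reflect,
    Done,
}

/// What the reflection step concluded (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Continue,
    Finished,
}

impl Phase {
    /// Advance one step. Only `Reflect` consults the verdict.
    pub fn next(self, verdict: Verdict) -> Phase {
        match (self, verdict) {
            (Phase::Plan, _) => Phase::Act,
            (Phase::Act, _) => Phase::Observe,
            (Phase::Observe, _) => Phase::Reflect,
            (Phase::Reflect, Verdict::Continue) => Phase::Plan,
            (Phase::Reflect, Verdict::Finished) => Phase::Done,
            (Phase::Done, _) => Phase::Done,
        }
    }
}

/// Drive the loop until it finishes or the step budget runs out.
/// `finish_after_cycles` stands in for whatever signals completion in a real run.
/// Returns the phase the loop stopped in and the number of steps taken.
pub fn run(budget_steps: u32, finish_after_cycles: u32) -> (Phase, u32) {
    let mut phase = Phase::Plan;
    let mut steps = 0;
    let mut cycles = 0;
    while phase != Phase::Done && steps < budget_steps {
        let verdict = if phase == Phase::Reflect {
            cycles += 1;
            if cycles >= finish_after_cycles {
                Verdict::Finished
            } else {
                Verdict::Continue
            }
        } else {
            Verdict::Continue
        };
        phase = phase.next(verdict);
        steps += 1;
    }
    (phase, steps)
}
```

With a generous budget, `run(100, 2)` walks two full cycles and stops in `Phase::Done`; with a budget of 5 steps it stops mid-cycle, which is the situation a budget enforcer exists to handle.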

Every claim on this page is backed by a gate (a test or an audit script) that you can run yourself.

Quickstart

Build from source

git clone https://github.com/usetheodev/theo-code
cd theo-code
cargo build --workspace --exclude theo-code-desktop --release
./target/release/theo --help

System requirements: Rust 1.85+ (the first stable release that supports the 2024 edition), pkg-config. The desktop app additionally needs libgtk-3-dev and Tauri prerequisites on Linux; everything else builds without system deps.

First run

# Authenticate with a provider (OAuth device flow or API key)
theo login                                # interactive picker
theo login --provider anthropic           # pin a provider

# Single-shot task
theo "find every place that constructs a session token"

# Autonomous loop (Plan → Act → Observe → Reflect until done)
theo pilot "remove the panic on stale tool name and add a regression test"

# Interactive TUI
theo

Useful one-liners

theo memory lint                        # memory subsystem hygiene
theo dashboard                          # observability HTTP server
theo subagent ls                        # persisted sub-agent runs
theo checkpoints ls                     # workdir shadow-git checkpoints
theo skill ls                           # installed skills
theo mcp list                           # cached MCP servers
theo trajectory export-rlhf --out f.jsonl   # export rated runs

CLI surface

theo --help lists 14 top-level subcommands, pinned by the smoke test apps/theo-cli/tests/cli_help_smoke.rs::test_top_level_subcommand_count_is_fourteen:

agent        Interactive REPL or single-shot task execution
pilot        Autonomous loop until promise is fulfilled
memory       Memory subsystem utilities (lint, inspect)
login        Authenticate with a provider (OAuth device flow or API key)
logout       Remove saved credentials
dashboard    Start the observability dashboard HTTP server
subagent     Manage persisted sub-agent runs
checkpoints  Manage workdir checkpoints (shadow git repos)
agents       Manage project agents approval
mcp          Manage MCP discovery cache
skill        Skill catalog: list / view / delete user-installed skills
trajectory   Trajectory export tooling
evals        Context Engineering evals (CDLC L2-L4)
help         Print help for any subcommand
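A count pin like the smoke test above can be approximated by extracting the first column of the listing and comparing its length with the expected number. This is a minimal sketch, not the code in cli_help_smoke.rs: `subcommand_names` and `check_subcommand_count` are hypothetical helpers, and the input is assumed to contain only the two-column command rows (no `Usage:` or `Commands:` headers).

```rust
/// Extract subcommand names from a two-column listing: the first
/// whitespace-delimited token of every non-empty line.
pub fn subcommand_names(listing: &str) -> Vec<&str> {
    listing
        .lines()
        .filter_map(|line| line.split_whitespace().next())
        .collect()
}

/// Fail with a readable message when the count drifts from the pinned value.
pub fn check_subcommand_count(listing: &str, expected: usize) -> Result<(), String> {
    let names = subcommand_names(listing);
    if names.len() == expected {
        Ok(())
    } else {
        Err(format!(
            "expected {expected} subcommands, found {}: {names:?}",
            names.len()
        ))
    }
}
```

Pinning the count this way means that adding or removing a subcommand forces a deliberate update of the expected value rather than a silent change to the CLI surface.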

Architecture

Cargo workspace with 14 lib crates + 3 binary apps under one Rust 2024 edition tree.

crates/
├── theo-domain                  pure types, state machines, zero deps
├── theo-engine-parser           Tree-Sitter extraction (14 langs)
├── theo-engine-retrieval        search + context assembly
├── theo-governance              policy engine, sandbox cascade
├── theo-isolation               bwrap / landlock / noop fallback
├── theo-infra-llm               26 provider specs, streaming, retry
├── theo-infra-auth              OAuth PKCE, device flow, env keys
├── theo-infra-mcp               Model Context Protocol client
├── theo-infra-memory            memory persistence (ADR-008 pending)
├── theo-test-memory-fixtures    fixtures for memory tests
├── theo-tooling                 49 registered tools + registry
├── theo-agent-runtime           agent loop, sub-agents, observability
├── theo-api-contracts           serializable DTOs for IPC
└── theo-application             use-cases, facade, CLI runtime re-exports

apps/
├── theo-cli         (pkg `theo`)        the CLI binary
├── theo-marklive                        markdown live renderer
├── theo-desktop                         Tauri shell (excluded from cargo test — GTK)
├── theo-benchmark                       Python harness (outside Rust workspace)
└── theo-ui                              Vite/TS UI (outside Rust workspace)

Dependency direction (ADR-010, enforced)

theo-domain              → (nothing)
theo-engine-parser       → theo-domain
theo-engine-retrieval    → theo-domain, theo-engine-parser
theo-governance          → theo-domain
theo-infra-*             → theo-domain
theo-tooling             → theo-domain
theo-agent-runtime       → theo-domain, theo-governance,
                           theo-infra-llm, theo-infra-auth, theo-tooling,
                           theo-isolation, theo-infra-mcp
theo-api-contracts       → theo-domain
theo-application         → all crates above
apps/*                   → theo-application, theo-api-contracts

scripts/check-arch-contract.sh enforces this on every PR.
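To make the contract concrete, the table can be encoded as data and each dependency edge checked against it. The following is a sketch under stated assumptions: it is not what check-arch-contract.sh does internally, `ALLOWED` and `edge_allowed` are hypothetical names, and only the rows from theo-domain through theo-api-contracts are transcribed (theo-application and apps/* are left out for brevity).

```rust
/// Allowed dependency edges, transcribed from the table above.
/// A trailing `*` in the left-hand pattern matches any suffix (e.g. `theo-infra-*`).
const ALLOWED: &[(&str, &[&str])] = &[
    ("theo-domain", &[]),
    ("theo-engine-parser", &["theo-domain"]),
    ("theo-engine-retrieval", &["theo-domain", "theo-engine-parser"]),
    ("theo-governance", &["theo-domain"]),
    ("theo-infra-*", &["theo-domain"]),
    ("theo-tooling", &["theo-domain"]),
    (
        "theo-agent-runtime",
        &[
            "theo-domain",
            "theo-governance",
            "theo-infra-llm",
            "theo-infra-auth",
            "theo-tooling",
            "theo-isolation",
            "theo-infra-mcp",
        ],
    ),
    ("theo-api-contracts", &["theo-domain"]),
];

/// Match a crate name against a pattern with an optional trailing `*`.
fn matches(pattern: &str, name: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => name.starts_with(prefix),
        None => pattern == name,
    }
}

/// `Some(true)` if the edge is allowed, `Some(false)` if it violates the
/// contract, and `None` if `from` has no row in the transcribed table.
pub fn edge_allowed(from: &str, to: &str) -> Option<bool> {
    ALLOWED
        .iter()
        .find(|(pattern, _)| matches(pattern, from))
        .map(|(_, deps)| deps.iter().any(|dep| *dep == to))
}
```

For example, `edge_allowed("theo-tooling", "theo-agent-runtime")` returns `Some(false)`, the kind of upward edge the contract is meant to reject, while a crate without a row, such as theo-isolation, yields `None` instead of a guess.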

Capabilities

LLM providers (26)

Every provider lives in crates/theo-infra-llm/src/provider/catalog/ as a ProviderSpec const. Adding a provider means defining a new const and wiring its auth strategy.

amazon-bedrock         azure                  azure-cognitive-services
anthropic              cerebras               chatgpt-codex
cloudflare-ai-gateway  cloudflare-workers-ai  cohere
deepinfra              github-copilot         gitlab
google-vertex          google-vertex-anthropic groq
lm-studio              mistral                ollama
openai                 openrouter             perplexity
sap-ai-core            togetherai             vercel
vllm                   xai

OAuth device flow is supported for anthropic and chatgpt-codex. The rest use API keys (env or config).
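The "one const per provider" idea might look roughly like the sketch below. Everything in it is an assumption made for illustration: the real `ProviderSpec` almost certainly has more fields, and the `AuthStrategy` variants, the `CATALOG` slice, the `find` helper, and the `GROQ_API_KEY` variable name are hypothetical.

```rust
/// How a provider authenticates. Variant names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuthStrategy {
    OAuthDeviceFlow,
    ApiKey { env_var: &'static str },
}

/// Hypothetical shape of a catalog entry; the real struct may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ProviderSpec {
    pub id: &'static str,
    pub auth: AuthStrategy,
}

pub const ANTHROPIC: ProviderSpec = ProviderSpec {
    id: "anthropic",
    auth: AuthStrategy::OAuthDeviceFlow,
};

pub const GROQ: ProviderSpec = ProviderSpec {
    id: "groq",
    // Assumed variable name, used only for this example.
    auth: AuthStrategy::ApiKey { env_var: "GROQ_API_KEY" },
};

/// Adding a provider would mean adding one const and listing it here.
pub const CATALOG: &[ProviderSpec] = &[ANTHROPIC, GROQ];

/// Look a provider up by its catalog id.
pub fn find(id: &str) -> Option<&'static ProviderSpec> {
    CATALOG.iter().find(|spec| spec.id == id)
}
```

Keeping specs as plain consts makes the catalog checkable at compile time and easy to pin in a test, at the cost of requiring a rebuild to add a provider.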

Languages parsed (14)

C, C++, C#, Go, Java, JavaScript, Kotlin, PHP, Python, Ruby, Rust, Scala, Swift, TypeScript.

Agent tools (58 available to the agent)

49 tools registered in DefaultRegistry (pinned by snapshot test default_registry_tool_id_snapshot_is_pinned) plus 9 meta-tools injected by theo-agent-runtime at dispatch time.

Registry tools (49):

Category	Tool IDs
Filesystem	`read`, `write`, `edit`, `apply_patch`, `glob`, `grep`
Shell & process	`bash`, `env_info`
Git	`git_status`, `git_diff`, `git_log`, `git_commit`
HTTP	`http_get`, `http_post`, `webfetch`
Cognitive	`think`, `reflect`, `memory`, `task_create`, `task_update`
Memory	`store_memory`, `recall_memory`
Planning	`plan_create`, `plan_update_task`, `plan_advance_phase`, `plan_log`, `plan_summary`, `plan_next_task`, `plan_replan`, `plan_failure_status`
Multimodal	`read_image`, `screenshot`
Code intelligence	`codebase_context`, `docs_search`
Test generation	`gen_property_test`, `gen_mutation_test`
LSP sidecar	`lsp_status`, `lsp_definition`, `lsp_references`, `lsp_hover`, `lsp_rename`
Browser sidecar	`browser_status`, `browser_open`, `browser_click`, `browser_screenshot`, `browser_type`, `browser_eval`, `browser_wait_for_selector`, `browser_close`

Meta-tools (9, injected by runtime):

Tool	Purpose
`done`	Signal task completion
`skill`	Invoke auto-discovered skills
`delegate_task_single`	Spawn a sub-agent
`delegate_task_parallel`	Fan-out multiple sub-agents
`delegate_task_legacy`	Legacy delegation format
`batch`	Run up to 25 independent tool calls in parallel
`batch_execute`	Programmatic tool calling
`batch_for_subagent`	Batch variant for sub-agents
`tool_search`	Keyword lookup over deferred tools
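The 25-call ceiling on `batch` suggests a size check before any work is dispatched. The sketch below shows one way to enforce such a cap and run independent calls in parallel with scoped threads from the standard library. It is an assumption-laden illustration, not the runtime's dispatcher: `run_batch`, `BatchError`, and `MAX_BATCH` are hypothetical names, and tool calls are modelled as plain closures.

```rust
use std::thread;

/// Cap taken from the meta-tool table above.
pub const MAX_BATCH: usize = 25;

#[derive(Debug, PartialEq, Eq)]
pub enum BatchError {
    Empty,
    TooLarge { requested: usize },
    Panicked,
}

/// Run independent calls in parallel and return their results in input order.
/// The batch is rejected up front if it is empty or exceeds `MAX_BATCH`.
pub fn run_batch<T, F>(calls: Vec<F>) -> Result<Vec<T>, BatchError>
where
    T: Send,
    F: FnOnce() -> T + Send,
{
    if calls.is_empty() {
        return Err(BatchError::Empty);
    }
    if calls.len() > MAX_BATCH {
        return Err(BatchError::TooLarge { requested: calls.len() });
    }
    thread::scope(|scope| {
        // Spawn one scoped thread per call.
        let handles: Vec<_> = calls.into_iter().map(|call| scope.spawn(call)).collect();
        // Join every thread first so none is left unjoined when one has panicked.
        let joined: Vec<_> = handles.into_iter().map(|handle| handle.join()).collect();
        joined
            .into_iter()
            .map(|result| result.map_err(|_| BatchError::Panicked))
            .collect::<Result<Vec<T>, BatchError>>()
    })
}
```

Rejecting an oversized batch before spawning anything keeps the failure cheap and deterministic; a real dispatcher would more likely use a bounded worker pool than one thread per call.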

Not registered (code exists, not wired to runtime):

25 additional tool implementations exist in crates/theo-tooling/src/ but are not in DefaultRegistry. These include: 11 DAP/debug tools, 3 wiki tools, codesearch, websearch, computer_action, ls, lsp (umbrella), multiedit, plan_exit, question, task, skill (registry version).

Sidecar status

Sidecar	Status	Notes
LSP	validated	E2E with rust-analyzer; `LspSessionManager::from_path()` discovers servers. 5 tools registered.
Browser	partial	Playwright sidecar at `crates/theo-tooling/scripts/playwright_sidecar.js`; requires Node + chromium. 8 tools registered.
DAP	implemented, not wired	11 `debug_*` tools with 140+ unit tests and `DapSessionManager` (415 LOC). Not registered in `DefaultRegistry`; no E2E smoke test.
Computer Use	implemented, not wired	`computer_action` tool (384 LOC) + platform driver (503 LOC, xdotool/cliclick). Not registered in `DefaultRegistry`.

Quality model

CI gates

make audit runs the composite suite. Each technique is independently runnable:

Technique	Command	What it enforces
Architecture contract	`make check-arch`	ADR-010 dep direction
File / function size	`make check-sizes`	800 LOC / file ceiling, allowlist with sunsets
Unwrap / expect	`make check-unwrap`	No `unwrap`/`expect` in production paths
Panic / todo	`make check-panic`	No `panic!()`/`todo!()`/`unimplemented!()` in production paths
Unsafe SAFETY comment	`make check-unsafe`	Every `unsafe` block has `// SAFETY:` within 8 lines above
Inline I/O tests	`make check-io-tests`	I/O tests live in `tests/`, not inline
Secrets scan	`make check-secrets`	`gitleaks` (or grep fallback)
Composite SOTA DoD	`make check-sota-dod`	Every Tier 1 + Tier 2 DoD criterion

CI workflow .github/workflows/audit.yml runs every gate on every PR.
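As an illustration of what a gate such as `make check-unsafe` has to decide, here is a text-based sketch of the "`// SAFETY:` within 8 lines above" rule. It is not the project's script: `missing_safety_comments` is a hypothetical function, and matching on the literal text `unsafe {` is a simplifying assumption that ignores `unsafe fn`, `unsafe impl`, and string literals.

```rust
/// Return the 1-based line numbers of `unsafe {` blocks that have no
/// `// SAFETY:` comment within `window` lines above them.
pub fn missing_safety_comments(source: &str, window: usize) -> Vec<usize> {
    let lines: Vec<&str> = source.lines().collect();
    let mut missing = Vec::new();
    for (idx, line) in lines.iter().enumerate() {
        // Ignore anything after a line comment so commented-out code is skipped.
        let code = line.split("//").next().unwrap_or("");
        if !code.contains("unsafe {") {
            continue;
        }
        let start = idx.saturating_sub(window);
        let documented = lines[start..idx]
            .iter()
            .any(|above| above.trim_start().starts_with("// SAFETY:"));
        if !documented {
            missing.push(idx + 1);
        }
    }
    missing
}
```

Calling `missing_safety_comments(source, 8)` on a file returns an empty list when every block is documented. Note that a purely line-based window also accepts a comment that belongs to a different block nearby, which is a known limitation of this kind of check.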

Allowlists (with sunsets)

Pre-existing debt is tracked, not forgiven. Each .claude/rules/*-allowlist.txt file has one entry per violation with a sunset date column, and the check-* scripts fail once a sunset has elapsed.
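The sunset rule can be sketched as a small function that reports allowlist entries whose date has passed. The file format here is an assumption, since the README does not spell it out: each non-comment line is taken to end with a `YYYY-MM-DD` sunset date, and an entry counts as elapsed when its date is strictly before today. `Expired` and `expired_entries` are hypothetical names.

```rust
/// An allowlist entry whose sunset has passed.
#[derive(Debug, PartialEq, Eq)]
pub struct Expired {
    pub entry: String,
    pub sunset: String,
}

/// Return entries whose sunset date is strictly before `today`.
/// Both dates are `YYYY-MM-DD`, so string order equals chronological order.
pub fn expired_entries(allowlist: &str, today: &str) -> Vec<Expired> {
    allowlist
        .lines()
        .map(str::trim)
        // Skip blank lines and `#` comments (assumed comment syntax).
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .filter_map(|line| {
            // The last whitespace-separated column is taken as the sunset date.
            let (entry, sunset) = line.rsplit_once(char::is_whitespace)?;
            (sunset < today).then(|| Expired {
                entry: entry.trim_end().to_string(),
                sunset: sunset.to_string(),
            })
        })
        .collect()
}
```

A gate built on this would exit non-zero whenever the returned list is non-empty, which is what turns a sunset date from a note into a deadline.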

Testing

Run the suite

cargo test --workspace --exclude theo-code-desktop --lib --tests --no-fail-fast

Contract tests (structural invariants)

Tests that pin invariants of the production surface:

test_top_level_subcommand_count_is_fourteen — pins the 14 subcommand count
every_subcommand_responds_to_help_with_exit_zero — every subcommand responds to --help
build_registry — every DefaultRegistry entry is reachable
agent_loop_new — the agent loop builder accepts only valid configs
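Pinning tests of this kind, like the tool-ID snapshot test mentioned under Agent tools, generally compare a checked-in list against what the code reports and fail on any drift. The sketch below shows that comparison in isolation; `SnapshotDiff` and `diff_snapshot` are hypothetical names, and the real tests may be structured differently.

```rust
use std::collections::BTreeSet;

/// IDs that appeared or disappeared relative to the pinned snapshot.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct SnapshotDiff {
    pub added: Vec<String>,
    pub removed: Vec<String>,
}

/// Compare a pinned ID list with the IDs actually registered.
/// Order is ignored; results are sorted because `BTreeSet` iterates in order.
pub fn diff_snapshot(pinned: &[&str], actual: &[&str]) -> SnapshotDiff {
    let pinned: BTreeSet<&str> = pinned.iter().copied().collect();
    let actual: BTreeSet<&str> = actual.iter().copied().collect();
    SnapshotDiff {
        added: actual.difference(&pinned).map(|id| (*id).to_owned()).collect(),
        removed: pinned.difference(&actual).map(|id| (*id).to_owned()).collect(),
    }
}
```

A test would assert that the diff equals `SnapshotDiff::default()`, so a failure message names exactly which IDs were added or removed instead of only reporting a changed count.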

Audit scripts (CI-enforced)

scripts/check-*.sh run as part of the SOTA DoD composite. See Quality model.

Benchmark

make check-bench-preflight                 # validate scenarios + harness
cd apps/theo-benchmark
python run_benchmark.py --suite smoke

Project rules

The repo carries rule files in .claude/rules/ that block CI when violated:

architecture.md — crate dep direction (ADR-010), prohibited imports
testing.md — TDD (regression test before fix), AAA, deterministic, independent
rust-conventions.md — thiserror, no unwrap in prod, newtypes
integration-first.md — features must be wired and tested end-to-end
domain-boundary.md — Memory/Context separation (D1, D4)

Together with .claude/rules/*-allowlist.txt files (each with sunsets), they form the project's hygiene contract.

Contributing

Read CLAUDE.md before changing anything.
TDD is non-negotiable. Bug fixes need a regression test before the fix.
Update the changelog. Every PR adds an entry under [Unreleased] in CHANGELOG.md with a (#PR) reference.
Don't break the dependency contract. make check-arch must pass.
Don't widen allowlists without an ADR. Each entry has an ADR pointer and a sunset date.

License

Apache-2.0
