usetheodev/theo-code

Theo Code

Autonomous coding agent with deep code understanding

Terminal-native. Desktop-ready. Knows your codebase like you do.


What it does

Theo Code is an autonomous coding agent that reads, plans, edits, and verifies code changes inside large repositories. It packages four things into one workspace:

  • Code intelligence — Tree-Sitter parser (14 languages) with agentic-search retrieval: the agent reaches for code via grep, codesearch, and glob tools.
  • Agent runtime — state machine (Plan → Act → Observe → Reflect), sub-agent fan-out, budget enforcer, sandboxed tool execution.
  • Provider abstraction — 26 LLM provider specs (Anthropic, OpenAI, xAI, Mistral, Groq, Cohere, Vertex, Bedrock, Ollama, vLLM, …) sharing one streaming/retry/converter pipeline.
  • Surfaces: theo CLI (14 subcommands), Tauri desktop, Vite UI, and a Python benchmark harness in apps/theo-benchmark/.
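The Plan → Act → Observe → Reflect loop and the budget enforcer from the agent runtime bullet can be pictured as a small state machine. The sketch below is illustrative only: `Phase`, `Verdict`, `Budget`, and `run` are hypothetical names, not the types in theo-agent-runtime, and the budget is assumed to be a simple cap on phase transitions.

```rust
/// Phases of the loop named above. Hypothetical type, not the real runtime enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Plan,
    Act,
    Observe,
    Reflect,
    Done,
}

/// Outcome of a Reflect step: iterate again or stop.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Continue,
    Finished,
}

impl Phase {
    /// Advance one step. Only Reflect consults the verdict.
    pub fn next(self, verdict: Verdict) -> Phase {
        match self {
            Phase::Plan => Phase::Act,
            Phase::Act => Phase::Observe,
            Phase::Observe => Phase::Reflect,
            Phase::Reflect => match verdict {
                Verdict::Continue => Phase::Plan,
                Verdict::Finished => Phase::Done,
            },
            Phase::Done => Phase::Done,
        }
    }
}

/// Assumed budget model: a cap on the number of phase transitions.
pub struct Budget {
    remaining_steps: u32,
}

impl Budget {
    pub fn new(max_steps: u32) -> Self {
        Budget { remaining_steps: max_steps }
    }

    /// Consume one step; returns false once the budget is exhausted.
    pub fn spend(&mut self) -> bool {
        if self.remaining_steps == 0 {
            return false;
        }
        self.remaining_steps -= 1;
        true
    }
}

/// Drive the loop until Done or until the budget runs out.
/// `finish_after_cycles` stands in for the model deciding the task is complete.
pub fn run(mut budget: Budget, finish_after_cycles: u32) -> Vec<Phase> {
    let mut phase = Phase::Plan;
    let mut cycles = 0;
    let mut trace = vec![phase];
    while phase != Phase::Done && budget.spend() {
        let verdict = if phase == Phase::Reflect {
            cycles += 1;
            if cycles >= finish_after_cycles {
                Verdict::Finished
            } else {
                Verdict::Continue
            }
        } else {
            Verdict::Continue
        };
        phase = phase.next(verdict);
        trace.push(phase);
    }
    trace
}
```

With a generous budget, `run(Budget::new(100), 1)` walks one full cycle and ends in `Done`; with `Budget::new(2)` the trace stops at `Observe` because the enforcer cuts the loop short.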

Every claim on this page is backed by a gate you can run yourself.


Quickstart

Build from source

git clone https://github.com/usetheodev/theo-code
cd theo-code
cargo build --workspace --exclude theo-code-desktop --release
./target/release/theo --help

System requirements: Rust 1.85+ (the first stable release that supports the 2024 edition), pkg-config. The desktop app additionally needs libgtk-3-dev and the Tauri prerequisites on Linux; everything else builds without system deps.

First run

# Authenticate with a provider (OAuth device flow or API key)
theo login                                # interactive picker
theo login --provider anthropic           # pin a provider

# Single-shot task
theo "find every place that constructs a session token"

# Autonomous loop (Plan → Act → Observe → Reflect until done)
theo pilot "remove the panic on stale tool name and add a regression test"

# Interactive TUI
theo

Useful one-liners

theo memory lint                        # memory subsystem hygiene
theo dashboard                          # observability HTTP server
theo subagent ls                        # persisted sub-agent runs
theo checkpoints ls                     # workdir shadow-git checkpoints
theo skill ls                           # installed skills
theo mcp list                           # cached MCP servers
theo trajectory export-rlhf --out f.jsonl   # export rated runs

CLI surface

theo --help lists 14 top-level subcommands, pinned by the smoke test apps/theo-cli/tests/cli_help_smoke.rs::test_top_level_subcommand_count_is_fourteen:

agent        Interactive REPL or single-shot task execution
pilot        Autonomous loop until promise is fulfilled
memory       Memory subsystem utilities (lint, inspect)
login        Authenticate with a provider (OAuth device flow or API key)
logout       Remove saved credentials
dashboard    Start the observability dashboard HTTP server
subagent     Manage persisted sub-agent runs
checkpoints  Manage workdir checkpoints (shadow git repos)
agents       Manage project agents approval
mcp          Manage MCP discovery cache
skill        Skill catalog: list / view / delete user-installed skills
trajectory   Trajectory export tooling
evals        Context Engineering evals (CDLC L2-L4)
help         Print help for any subcommand

Architecture

Cargo workspace with 14 lib crates + 3 binary apps under one Rust 2024 edition tree.

crates/
├── theo-domain                  pure types, state machines, zero deps
├── theo-engine-parser           Tree-Sitter extraction (14 langs)
├── theo-engine-retrieval        search + context assembly
├── theo-governance              policy engine, sandbox cascade
├── theo-isolation               bwrap / landlock / noop fallback
├── theo-infra-llm               26 provider specs, streaming, retry
├── theo-infra-auth              OAuth PKCE, device flow, env keys
├── theo-infra-mcp               Model Context Protocol client
├── theo-infra-memory            memory persistence (ADR-008 pending)
├── theo-test-memory-fixtures    fixtures for memory tests
├── theo-tooling                 49 registered tools + registry
├── theo-agent-runtime           agent loop, sub-agents, observability
├── theo-api-contracts           serializable DTOs for IPC
└── theo-application             use-cases, facade, CLI runtime re-exports

apps/
├── theo-cli         (pkg `theo`)        the CLI binary
├── theo-marklive                        markdown live renderer
├── theo-desktop                         Tauri shell (excluded from cargo test — GTK)
├── theo-benchmark                       Python harness (outside Rust workspace)
└── theo-ui                              Vite/TS UI (outside Rust workspace)

Dependency direction (ADR-010, enforced)

theo-domain              → (nothing)
theo-engine-parser       → theo-domain
theo-engine-retrieval    → theo-domain, theo-engine-parser
theo-governance          → theo-domain
theo-infra-*             → theo-domain
theo-tooling             → theo-domain
theo-agent-runtime       → theo-domain, theo-governance,
                           theo-infra-llm, theo-infra-auth, theo-tooling,
                           theo-isolation, theo-infra-mcp
theo-api-contracts       → theo-domain
theo-application         → all crates above
apps/*                   → theo-application, theo-api-contracts

scripts/check-arch-contract.sh enforces this on every PR.
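The rule the script enforces boils down to "every dependency edge must appear in the table above". A minimal Rust sketch of that check follows. It is not the shell script itself: `allowed_deps` and `violations` are hypothetical helpers, and the sketch leaves out theo-application and apps/*, returning no findings for crates it does not know.

```rust
/// Allowed dependencies per crate, transcribed from the table above.
/// Returns None for crates this sketch does not cover.
fn allowed_deps(krate: &str) -> Option<&'static [&'static str]> {
    let deps: &'static [&'static str] = match krate {
        "theo-domain" => &[],
        "theo-engine-parser" => &["theo-domain"],
        "theo-engine-retrieval" => &["theo-domain", "theo-engine-parser"],
        "theo-governance" => &["theo-domain"],
        "theo-tooling" => &["theo-domain"],
        "theo-api-contracts" => &["theo-domain"],
        "theo-agent-runtime" => &[
            "theo-domain",
            "theo-governance",
            "theo-infra-llm",
            "theo-infra-auth",
            "theo-tooling",
            "theo-isolation",
            "theo-infra-mcp",
        ],
        // The `theo-infra-*` row applies to every infra crate.
        k if k.starts_with("theo-infra-") => &["theo-domain"],
        _ => return None,
    };
    Some(deps)
}

/// List every edge `krate -> dep` that the table does not allow.
pub fn violations(krate: &str, actual_deps: &[&str]) -> Vec<String> {
    let mut found = Vec::new();
    if let Some(allowed) = allowed_deps(krate) {
        for dep in actual_deps {
            if !allowed.contains(dep) {
                found.push(format!("{} -> {}", krate, dep));
            }
        }
    }
    found
}
```

For example, `violations("theo-domain", &["theo-tooling"])` reports `theo-domain -> theo-tooling`, since theo-domain may depend on nothing.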


Capabilities

LLM providers (26)

Every provider lives in crates/theo-infra-llm/src/provider/catalog/ as a ProviderSpec const. Adding one means dropping a new const and wiring its auth strategy.

amazon-bedrock         azure                  azure-cognitive-services
anthropic              cerebras               chatgpt-codex
cloudflare-ai-gateway  cloudflare-workers-ai  cohere
deepinfra              github-copilot         gitlab
google-vertex          google-vertex-anthropic groq
lm-studio              mistral                ollama
openai                 openrouter             perplexity
sap-ai-core            togetherai             vercel
vllm                   xai

OAuth device flow is supported for anthropic and chatgpt-codex. The rest use API keys (env or config).
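To make "a ProviderSpec const plus an auth strategy" concrete, here is a rough sketch of what such a catalog could look like. The real `ProviderSpec` fields are not shown on this page, so the struct layout, the `AuthStrategy` enum, the `find` helper, and the `GROQ_API_KEY` variable name are all assumptions made for illustration.

```rust
/// Assumed auth strategies, based on the two mechanisms mentioned above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuthStrategy {
    OAuthDeviceFlow,
    /// The env var name is a guess, not taken from the codebase.
    ApiKey { env_var: &'static str },
}

/// Hypothetical shape; the real ProviderSpec very likely carries more fields.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ProviderSpec {
    pub id: &'static str,
    pub auth: AuthStrategy,
}

pub const ANTHROPIC: ProviderSpec = ProviderSpec {
    id: "anthropic",
    auth: AuthStrategy::OAuthDeviceFlow,
};

pub const GROQ: ProviderSpec = ProviderSpec {
    id: "groq",
    auth: AuthStrategy::ApiKey { env_var: "GROQ_API_KEY" },
};

/// Adding a provider in this sketch means one new const and one new entry here.
pub const CATALOG: &[ProviderSpec] = &[ANTHROPIC, GROQ];

/// Look a provider up by its ID.
pub fn find(id: &str) -> Option<&'static ProviderSpec> {
    CATALOG.iter().find(|spec| spec.id == id)
}
```

Keeping each spec as a `const` means the catalog is plain data: nothing has to be constructed at startup, and a lookup is a linear scan over a short slice.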

Languages parsed (14)

C, C++, C#, Go, Java, JavaScript, Kotlin, PHP, Python, Ruby, Rust, Scala, Swift, TypeScript.

Agent tools (58 available to the agent)

49 tools registered in DefaultRegistry (pinned by snapshot test default_registry_tool_id_snapshot_is_pinned) plus 9 meta-tools injected by theo-agent-runtime at dispatch time.

Registry tools (49):

| Category | Tool IDs |
|---|---|
| Filesystem | read, write, edit, apply_patch, glob, grep |
| Shell & process | bash, env_info |
| Git | git_status, git_diff, git_log, git_commit |
| HTTP | http_get, http_post, webfetch |
| Cognitive | think, reflect, memory, task_create, task_update |
| Memory | store_memory, recall_memory |
| Planning | plan_create, plan_update_task, plan_advance_phase, plan_log, plan_summary, plan_next_task, plan_replan, plan_failure_status |
| Multimodal | read_image, screenshot |
| Code intelligence | codebase_context, docs_search |
| Test generation | gen_property_test, gen_mutation_test |
| LSP sidecar | lsp_status, lsp_definition, lsp_references, lsp_hover, lsp_rename |
| Browser sidecar | browser_status, browser_open, browser_click, browser_screenshot, browser_type, browser_eval, browser_wait_for_selector, browser_close |

Meta-tools (9, injected by runtime):

| Tool | Purpose |
|---|---|
| done | Signal task completion |
| skill | Invoke auto-discovered skills |
| delegate_task_single | Spawn a sub-agent |
| delegate_task_parallel | Fan-out multiple sub-agents |
| delegate_task_legacy | Legacy delegation format |
| batch | Run up to 25 independent tool calls in parallel |
| batch_execute | Programmatic tool calling |
| batch_for_subagent | Batch variant for sub-agents |
| tool_search | Keyword lookup over deferred tools |
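The split between registry tools and runtime-injected meta-tools, and the idea of pinning the registry with a snapshot, can be sketched as follows. This is not the DefaultRegistry implementation: `Registry`, `snapshot`, `available_to_agent`, and `filesystem_registry` are hypothetical, and only the six filesystem tools are registered to keep the example short.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the tool registry: a sorted set of tool IDs.
pub struct Registry {
    ids: BTreeSet<&'static str>,
}

impl Registry {
    pub fn new() -> Self {
        Registry { ids: BTreeSet::new() }
    }

    /// Returns false if the ID was already registered.
    pub fn register(&mut self, id: &'static str) -> bool {
        self.ids.insert(id)
    }

    pub fn len(&self) -> usize {
        self.ids.len()
    }

    /// Sorted, comma-joined IDs: a stable string that a snapshot test can pin.
    pub fn snapshot(&self) -> String {
        self.ids.iter().copied().collect::<Vec<_>>().join(",")
    }
}

/// Meta-tool names taken from the table above.
pub const META_TOOLS: [&str; 9] = [
    "done",
    "skill",
    "delegate_task_single",
    "delegate_task_parallel",
    "delegate_task_legacy",
    "batch",
    "batch_execute",
    "batch_for_subagent",
    "tool_search",
];

/// Tool IDs visible to the agent: registry entries followed by meta-tools.
pub fn available_to_agent(registry: &Registry) -> Vec<&'static str> {
    let mut all: Vec<&'static str> = registry.ids.iter().copied().collect();
    all.extend(META_TOOLS.iter().copied());
    all
}

/// Registers only the filesystem category from the registry table.
pub fn filesystem_registry() -> Registry {
    let mut registry = Registry::new();
    for id in ["read", "write", "edit", "apply_patch", "glob", "grep"] {
        registry.register(id);
    }
    registry
}
```

A snapshot test would compare `snapshot()` against a checked-in string, so adding or removing a tool forces a deliberate update of the pinned value. In this reduced example the agent sees 6 + 9 = 15 tools; with the full registry the same arithmetic gives 49 + 9 = 58.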

Not registered (code exists, not wired to runtime):

25 additional tool implementations exist in crates/theo-tooling/src/ but are not in DefaultRegistry. These include: 11 DAP/debug tools, 3 wiki tools, codesearch, websearch, computer_action, ls, lsp (umbrella), multiedit, plan_exit, question, task, skill (registry version).

Sidecar status

| Sidecar | Status | Notes |
|---|---|---|
| LSP | validated | E2E with rust-analyzer; LspSessionManager::from_path() discovers servers. 5 tools registered. |
| Browser | partial | Playwright sidecar at crates/theo-tooling/scripts/playwright_sidecar.js; requires Node + chromium. 8 tools registered. |
| DAP | implemented, not wired | 11 debug_* tools with 140+ unit tests and DapSessionManager (415 LOC). Not registered in DefaultRegistry; no E2E smoke test. |
| Computer Use | implemented, not wired | computer_action tool (384 LOC) + platform driver (503 LOC, xdotool/cliclick). Not registered in DefaultRegistry. |

Quality model

CI gates

make audit runs the composite suite. Each technique is independently runnable:

| Technique | Command | What it enforces |
|---|---|---|
| Architecture contract | make check-arch | ADR-010 dep direction |
| File / function size | make check-sizes | 800 LOC / file ceiling, allowlist with sunsets |
| Unwrap / expect | make check-unwrap | No unwrap/expect in production paths |
| Panic / todo | make check-panic | No panic!()/todo!()/unimplemented!() in production paths |
| Unsafe SAFETY comment | make check-unsafe | Every unsafe block has // SAFETY: within 8 lines above |
| Inline I/O tests | make check-io-tests | I/O tests live in tests/, not inline |
| Secrets scan | make check-secrets | gitleaks (or grep fallback) |
| Composite SOTA DoD | make check-sota-dod | Every Tier 1 + Tier 2 DoD criterion |

CI workflow .github/workflows/audit.yml runs every gate on every PR.
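As an illustration of how a gate such as the unsafe SAFETY check can work, the sketch below scans source text for `unsafe {` and looks for a `// SAFETY:` comment in the preceding lines. It is a naive text match written for this README, not the logic behind `make check-unsafe`; the function name and the pattern matching are assumptions.

```rust
/// Returns the 1-based line numbers of `unsafe {` blocks that have no
/// `// SAFETY:` comment within the `window` lines directly above them.
/// Plain substring matching: it does not parse Rust, so strings or macros
/// containing "unsafe {" would be flagged too.
pub fn missing_safety_comments(source: &str, window: usize) -> Vec<usize> {
    let lines: Vec<&str> = source.lines().collect();
    let mut missing = Vec::new();
    for (i, line) in lines.iter().enumerate() {
        // Ignore anything after a line comment marker.
        let code = line.split("//").next().unwrap_or("");
        if !code.contains("unsafe {") {
            continue;
        }
        let start = i.saturating_sub(window);
        let documented = lines[start..i]
            .iter()
            .any(|l| l.trim_start().starts_with("// SAFETY:"));
        if !documented {
            missing.push(i + 1);
        }
    }
    missing
}
```

Calling `missing_safety_comments(src, 8)` mirrors the 8-line rule from the table. One caveat of any window-based check: a comment that belongs to an earlier unsafe block can also satisfy a later block that falls inside the same window.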

Allowlists (with sunsets)

Pre-existing debt is tracked, not amnestied. Each .claude/rules/*-allowlist.txt file holds one entry per violation, with a sunset date column. The check-* scripts fail once a sunset has elapsed.
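The sunset mechanism can be sketched in a few lines. The allowlist file format is not shown on this page, so the `path | ADR | YYYY-MM-DD` layout, the `AllowlistEntry` struct, and both functions below are hypothetical; the real check-* scripts may work differently.

```rust
/// One allowlisted violation. Field layout is an assumption for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub struct AllowlistEntry {
    pub path: String,
    pub adr: String,
    /// ISO 8601 date (YYYY-MM-DD), kept as text.
    pub sunset: String,
}

/// Parse a line of the assumed form `path | ADR | YYYY-MM-DD`.
/// Blank lines, `#` comments, and malformed lines yield None.
pub fn parse_entry(line: &str) -> Option<AllowlistEntry> {
    let line = line.trim();
    if line.is_empty() || line.starts_with('#') {
        return None;
    }
    let fields: Vec<&str> = line.split('|').map(|f| f.trim()).collect();
    if fields.len() != 3 || fields[2].len() != 10 {
        return None;
    }
    Some(AllowlistEntry {
        path: fields[0].to_string(),
        adr: fields[1].to_string(),
        sunset: fields[2].to_string(),
    })
}

/// Entries whose sunset lies strictly before `today`.
/// ISO dates sort chronologically as plain strings, so no date library is needed.
pub fn expired<'a>(entries: &'a [AllowlistEntry], today: &str) -> Vec<&'a AllowlistEntry> {
    entries
        .iter()
        .filter(|entry| entry.sunset.as_str() < today)
        .collect()
}
```

A gate built on this would exit non-zero whenever `expired(...)` is non-empty, which turns an overdue allowlist entry into a failing build instead of permanent debt.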


Testing

Run the suite

cargo test --workspace --exclude theo-code-desktop --lib --tests --no-fail-fast

Contract tests (structural invariants)

Tests that pin invariants of the production surface:

  • test_top_level_subcommand_count_is_fourteen — pins the 14 subcommand count
  • every_subcommand_responds_to_help_with_exit_zero — every subcommand responds to --help
  • build_registry — every DefaultRegistry entry is reachable
  • agent_loop_new — the agent loop builder accepts only valid configs
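A count-pinning contract test like the first one in the list can be pictured as "extract the subcommand names, then assert on the count". The sketch below works on the listing from the CLI surface section pasted in as text; `subcommand_names` and `HELP_LISTING` are hypothetical, and the real smoke test may inspect the CLI definition directly instead of parsing help output.

```rust
/// The subcommand listing from the "CLI surface" section, one entry per line.
pub const HELP_LISTING: &str = "\
agent        Interactive REPL or single-shot task execution
pilot        Autonomous loop until promise is fulfilled
memory       Memory subsystem utilities (lint, inspect)
login        Authenticate with a provider (OAuth device flow or API key)
logout       Remove saved credentials
dashboard    Start the observability dashboard HTTP server
subagent     Manage persisted sub-agent runs
checkpoints  Manage workdir checkpoints (shadow git repos)
agents       Manage project agents approval
mcp          Manage MCP discovery cache
skill        Skill catalog: list / view / delete user-installed skills
trajectory   Trajectory export tooling
evals        Context Engineering evals (CDLC L2-L4)
help         Print help for any subcommand
";

/// First whitespace-separated token of every non-empty line.
/// Assumes the input contains only the subcommand listing.
pub fn subcommand_names(listing: &str) -> Vec<&str> {
    listing
        .lines()
        .filter_map(|line| line.split_whitespace().next())
        .collect()
}
```

Pinning the count this way makes adding or removing a subcommand a visible, reviewed change: the assertion on `subcommand_names(HELP_LISTING).len()` has to be edited in the same PR.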

Audit scripts (CI-enforced)

scripts/check-*.sh run as part of the SOTA DoD composite. See Quality model.

Benchmark

make check-bench-preflight                 # validate scenarios + harness
cd apps/theo-benchmark
python run_benchmark.py --suite smoke

Project rules

The repo carries rule files in .claude/rules/ that block CI when violated:

  • architecture.md — crate dep direction (ADR-010), prohibited imports
  • testing.md — TDD (regression test before fix), AAA, deterministic, independent
  • rust-conventions.md: thiserror, no unwrap in prod, newtypes
  • integration-first.md — features must be wired and tested end-to-end
  • domain-boundary.md — Memory/Context separation (D1, D4)

Together with .claude/rules/*-allowlist.txt files (each with sunsets), they form the project's hygiene contract.


Contributing

  1. Read CLAUDE.md before changing anything.
  2. TDD is non-negotiable. Bug fixes need a regression test before the fix.
  3. Update the changelog. Every PR adds an entry under [Unreleased] in CHANGELOG.md with a (#PR) reference.
  4. Don't break the dependency contract. make check-arch must pass.
  5. Don't widen allowlists without an ADR. Each entry has an ADR pointer and a sunset date.

License

Apache-2.0

About

Theo is an open-source, terminal-native coding agent built in Rust. Unlike tools that bet everything on a single model, Theo focuses on what actually determines success: the harness — the sandbox, tools, context management, safety, and feedback loops that surround the model.
