Autonomous AI coding pipeline for OpenCode.
Idea to shipped code • 13-agent code review • Adversarial model diversity • Background task management • Session recovery
A plugin for the OpenCode AI coding CLI that turns it into a fully autonomous software development system. Give it an idea — it researches, designs architecture, plans tasks, implements code, runs multi-agent code review, writes documentation, and extracts lessons for next time.
The core insight: adversarial model diversity. Agents that review each other's work use different model families. Your architect designs in Claude, your critic challenges in GPT, your red team attacks from Gemini — three different reasoning patterns catching three different classes of bugs.
Paste this into your AI session:
Install and configure opencode-autopilot by following the instructions here:
https://raw.githubusercontent.com/kodrunhq/opencode-autopilot/main/docs/guide/installation.md
The AI walks you through installation, model assignment for each agent group, and verification.
bunx @kodrunhq/opencode-autopilot installThis registers the plugin in .opencode.json and creates a starter config. Then run bunx @kodrunhq/opencode-autopilot configure to set up your model assignments.
This path requires Bun on PATH. The published CLI entrypoints use a Bun shebang and the plugin uses Bun-specific APIs.
Install:
npm install -g @kodrunhq/opencode-autopilotAdd to your OpenCode config (.opencode.json):
{
"plugin": ["@kodrunhq/opencode-autopilot"]
}Launch OpenCode. The plugin auto-installs agents, skills, and commands on first load and shows a welcome toast.
If you cannot or do not want to install Bun globally, use the GitHub Release local bundle path below instead of the npm global CLI path.
For environments that cannot access npm registries (e.g. corporate firewalls). Requires GitHub access.
Automated:
curl -fsSL https://github.com/kodrunhq/opencode-autopilot/releases/latest/download/install-local.sh | bashOr pin a specific version:
curl -fsSL https://github.com/kodrunhq/opencode-autopilot/releases/download/v1.20.0/install-local.sh | bash -s -- --version=1.20.0Manual:
- Download the latest
opencode-autopilot-local-vX.Y.Z.tar.gzand its.sha256file from GitHub Releases - Verify the checksum:
sha256sum -c opencode-autopilot-local-v*.tar.gz.sha256 - Extract and install:
mkdir -p ~/.config/opencode/plugins STAGING="$(mktemp -d ~/.config/opencode/plugins/.oca-staging-XXXXXX)" tar -xzf opencode-autopilot-local-v*.tar.gz -C "$STAGING" # Verify the bundle looks correct test -f "$STAGING/src/index.ts" && test -d "$STAGING/assets" && test -d "$STAGING/node_modules" # Rollback-safe swap (backup old, move staging in, remove backup) DEST=~/.config/opencode/plugins/opencode-autopilot BACKUP="" if [ -d "$DEST" ]; then BACKUP="$(mktemp -d ~/.config/opencode/plugins/.oca-backup-XXXXXX)" mv "$DEST" "$BACKUP/old" || { rm -rf "$STAGING" "$BACKUP"; exit 1; } fi mv "$STAGING" "$DEST" || { [ -n "$BACKUP" ] && mv "$BACKUP/old" "$DEST" 2>/dev/null rm -rf "$STAGING" "$BACKUP" exit 1 } [ -n "$BACKUP" ] && rm -rf "$BACKUP"
- Create a shim for auto-discovery:
echo 'export { default } from "./opencode-autopilot/src/index";' > ~/.config/opencode/plugins/opencode-autopilot.ts
Alternatively, add to your .opencode.json with an absolute path:
{
"plugin": ["file:///home/YOU/.config/opencode/plugins/opencode-autopilot/src/index.ts"]
}Replace /home/YOU with your actual home directory.
To verify a local installation, open OpenCode and run /oc-doctor in the chat.
Primary Tab-cycle agents provided by this plugin are:
autopilotcoderdebuggerplannerresearcherreviewer
OpenCode native plan and build are suppressed by the plugin config hook to avoid
duplicate planning/building entries in the primary Tab menu.
bunx @kodrunhq/opencode-autopilot doctorThis checks config health, model assignments, and adversarial diversity between agent groups.
For detailed guides on every subsystem, see the full documentation.
| Topic | Description |
|---|---|
| Installation Guide | Step-by-step setup with AI-guided or manual options |
| Architecture | System architecture and internal module map |
| The Pipeline | 8-phase autonomous development pipeline |
| Agents | Complete catalog of all agents and model groups |
| Code Review | 13-agent adversarial review pipeline |
| Configuration | v7 configuration schema reference |
| Tools Reference | Reference hub for the 39 registered oc_* tools |
| CLI Reference | install, configure, doctor, and inspect commands |
| Memory System | Dual-scope project and user preference memory |
| Model Fallback | Automatic fallback chains and session recovery |
| Background & Routing | Background tasks, category routing, and autonomy loop |
| Skills & Commands | Adaptive skills and slash commands |
| Observability | Event tracking, structured logging, and diagnostics |
Agents are organized into 8 groups by the type of thinking they do. Each group gets a primary model and fallback chain. Run bunx @kodrunhq/opencode-autopilot configure to assign models interactively. For advanced use, the oc_configure tool is also available in-session.
| Group | Agents | What they do | Model recommendation |
|---|---|---|---|
| Architects | oc-architect, oc-planner, autopilot | System design, planning, orchestration | Most powerful available |
| Challengers | oc-critic, oc-challenger | Challenge architecture, find design flaws | Strong model, different family from Architects |
| Builders | oc-implementer | Write production code | Strong coding model |
| Reviewers | oc-reviewer + 11 review agents | Find bugs, security issues, logic errors | Strong model, different family from Builders |
| Red Team | red-team, product-thinker | Final adversarial pass, hunt exploits | Different family from both Builders and Reviewers |
| Researchers | oc-researcher, researcher | Domain research, feasibility analysis | Good comprehension, any family |
| Communicators | oc-shipper, documenter, oc-retrospector | Docs, changelogs, lesson extraction | Mid-tier model |
| Utilities | oc-explorer, metaprompter, pr-reviewer | Fast lookups, prompt tuning, PR scanning | Fastest/cheapest available |
The installer warns when adversarial pairs share a model family:
| Pair | Relationship | Why diversity matters |
|---|---|---|
| Architects / Challengers | Challengers critique architect output | Same model = confirmation bias |
| Builders / Reviewers | Reviewers find bugs in builder code | Same model = shared blind spots |
| Red Team / Builders+Reviewers | Final adversarial perspective | Most effective as a third family |
{
"version": 7,
"configured": true,
"groups": {
"architects": { "primary": "anthropic/claude-opus-4-6", "fallbacks": ["openai/gpt-5.4"] },
"challengers": { "primary": "openai/gpt-5.4", "fallbacks": ["google/gemini-3.1-pro"] },
"builders": { "primary": "anthropic/claude-opus-4-6", "fallbacks": ["anthropic/claude-sonnet-4-6"] },
"reviewers": { "primary": "openai/gpt-5.4", "fallbacks": ["google/gemini-3.1-pro"] },
"red-team": { "primary": "google/gemini-3.1-pro", "fallbacks": ["openai/gpt-5.4"] },
"researchers": { "primary": "anthropic/claude-sonnet-4-6", "fallbacks": ["openai/gpt-5.4"] },
"communicators": { "primary": "anthropic/claude-sonnet-4-6", "fallbacks": ["anthropic/claude-haiku-4-5"] },
"utilities": { "primary": "anthropic/claude-haiku-4-5", "fallbacks": ["google/gemini-3-flash"] }
},
"overrides": {},
"background": {
"enabled": false,
"maxConcurrent": 5,
"persistence": true
},
"routing": {
"enabled": false,
"categories": {}
},
"recovery": {
"enabled": true,
"maxRetries": 3
},
"mcp": {
"enabled": false,
"skills": {}
}
}Per-agent overrides are supported for fine-grained control:
{
"overrides": {
"oc-planner": { "primary": "openai/gpt-5.4", "fallbacks": ["anthropic/claude-opus-4-6"] }
}
}opencode-autopilot runs an 8-phase autonomous pipeline, each driven by a specialized agent:
IDEA --> RECON --> CHALLENGE --> ARCHITECT --> EXPLORE --> PLAN --> BUILD --> SHIP --> RETROSPECTIVE
| | | | | | |
Research Enhance Multi-proposal Task plan Code + Docs & Extract
& assess the idea design arena with waves review changelog lessons
| Phase | Agent | What happens |
|---|---|---|
| RECON | oc-researcher | Domain research, feasibility assessment, technology landscape |
| CHALLENGE | oc-challenger | Proposes ambitious enhancements to the original idea |
| ARCHITECT | oc-architect x N | Multiple design proposals debated by oc-critic; depth scales with confidence |
| EXPLORE | local analysis | Scans the repository structure to surface risk areas, tech debt indicators, and implementation patterns |
| PLAN | oc-planner | Decomposes architecture into wave-scheduled tasks (max 300-line diffs each) |
| BUILD | oc-implementer | Implements code with inline review; 3-strike limit on CRITICAL findings |
| SHIP | oc-shipper | Generates walkthrough, architectural decisions, and changelog |
| RETROSPECTIVE | oc-retrospector | Extracts lessons, injected into future runs |
The ARCHITECT phase uses a multi-proposal arena. Based on confidence signals from RECON:
- High confidence -> 1 architecture proposal
- Medium confidence -> 2 competing proposals (simplicity vs. extensibility)
- Low confidence -> 3 competing proposals + critic evaluation
BUILD tracks implementation quality with a strike system:
- Each CRITICAL review finding = 1 strike
- 3 strikes -> pipeline stops (prevents infinite retry loops)
- Non-critical findings generate fix instructions without strikes
- Waves execute in order; a wave advances only after passing review
The oc_review tool provides a 4-stage multi-agent review pipeline:
Stage 1 -- Specialist dispatch: Auto-detects your stack from changed files and dispatches relevant agents in parallel.
Stage 2 -- Cross-verification: Each agent reviews other agents' findings to filter noise and catch missed issues.
Stage 3 -- Adversarial review: Red-team agent hunts for exploits; product-thinker checks for UX gaps.
Stage 4 -- Report or fix cycle: CRITICAL findings with actionable fixes trigger an automatic fix cycle; everything else lands in the final report.
| Category | Agents | When selected |
|---|---|---|
| Universal (always run) | logic-auditor, security-auditor, code-quality-auditor, test-interrogator, code-hygiene-auditor, contract-verifier | Every review |
| Stack-aware (auto-selected) | architecture-verifier, database-auditor, correctness-auditor, frontend-auditor, language-idioms-auditor | Based on changed file types |
| Sequenced (run last) | red-team, product-thinker | After all findings collected |
Review memory persists per project -- false positives are tracked and suppressed in future reviews (auto-pruned after 30 days).
When a model is rate-limited, unavailable, or returns an error, the fallback system automatically:
- Classifies the error (rate limit, auth, quota, service unavailable, etc.)
- Selects the next model from the group's fallback chain
- Replays the conversation on the new model
- Shows a toast notification (configurable)
- Recovers to the primary model after a cooldown period
The plugin ships with production-ready assets that auto-install to ~/.config/opencode/:
| Type | Assets | Purpose |
|---|---|---|
| Agents | researcher, metaprompter, documenter, pr-reviewer, autopilot + 10 pipeline agents | Research, docs, PR review, full autonomous pipeline |
| Commands | /oc-brainstorm, /oc-tdd, /oc-quick, /oc-write-plan, /oc-stocktake, /oc-review-pr, /oc-update-docs, /oc-new-agent, /oc-new-skill, /oc-new-command |
Brainstorming, TDD, quick tasks, planning, auditing, PR reviews, docs, extend plugin |
| Skills | coding-standards | Universal best practices (naming, error handling, immutability, validation) |
Extend the plugin without leaving OpenCode:
/oc-new-agent-- Create a custom agent with YAML frontmatter + system prompt/oc-new-skill-- Create a skill directory with domain knowledge/oc-new-command-- Create a slash command (validates against built-in names)
All created assets write to ~/.config/opencode/ and are available immediately after restarting the OpenCode session.
Config lives at ~/.config/opencode/opencode-autopilot.json. Run bunx @kodrunhq/opencode-autopilot configure for interactive setup, or edit manually.
| Setting | Options | Default |
|---|---|---|
orchestrator.autonomy |
full, supervised, manual |
full |
orchestrator.strictness |
strict, normal, lenient |
normal |
orchestrator.phases.* |
true / false |
all true |
confidence.thresholds.proceed |
HIGH, MEDIUM, LOW |
MEDIUM |
confidence.thresholds.abort |
HIGH, MEDIUM, LOW |
LOW |
fallback.enabled |
true / false |
true |
background.enabled |
true / false |
false |
background.maxConcurrent |
1-50 |
5 |
background.persistence |
true / false |
true |
routing.enabled |
true / false |
false |
routing.categories |
category override map | {} |
recovery.enabled |
true / false |
true |
recovery.maxRetries |
0-10 |
3 |
mcp.enabled |
true / false |
false |
mcp.skills |
skill-to-MCP config map | {} |
Config auto-migrates across schema versions (v1 -> v2 -> v3 -> v4 -> v5 -> v6 -> v7).
The plugin registers 39 tools, all prefixed with oc_ to avoid conflicts with OpenCode built-ins:
| Tool | Purpose |
|---|---|
oc_orchestrate |
Core pipeline state machine -- start a run or advance phases |
oc_review |
Multi-agent code review (4-stage pipeline) |
oc_configure |
Interactive model group assignment (start/assign/commit/doctor/reset) |
oc_state |
Query and patch pipeline state |
oc_phase |
Phase transitions and validation |
oc_confidence |
Confidence ledger management |
oc_plan |
Query task waves and status counts |
oc_forensics |
Diagnose pipeline failures (recoverable vs. terminal) |
oc_graph_index |
Index a local code graph for symbol and dependency lookup |
oc_graph_query |
Query the local code graph for definitions, imports, dependents, and outlines |
oc_create_agent |
Create custom agents in-session |
oc_create_skill |
Create custom skills in-session |
oc_create_command |
Create custom commands in-session |
oc_background |
Manage background tasks (spawn, monitor, cancel) |
oc_loop |
Start/stop autonomy loop with verification checkpoints |
oc_delegate |
Category-based task routing with skill injection |
oc_recover |
Session recovery with strategy selection |
oc_doctor |
Run plugin health diagnostics |
oc_quick |
Quick-mode pipeline bypass for trivial tasks |
oc_hashline_edit |
Hash-anchored line edits with FNV-1a verification |
oc_logs |
Query structured session logs |
oc_route |
Validate intent classification and return routing instructions |
oc_session_stats |
Session statistics and token usage |
oc_pipeline_report |
Generate pipeline execution report |
oc_summary |
Session summary generation |
oc_mock_fallback |
Fallback chain testing with mock providers |
oc_stocktake |
Audit installed assets (agents, skills, commands) |
oc_update_docs |
Detect docs affected by code changes |
oc_memory_status |
Memory system status and statistics |
oc_memory_preferences |
Manage user preference observations |
oc_memory_save |
Save durable project or user memories for later sessions |
oc_memory_search |
Search saved memories by text, kind, or topic |
oc_memory_forget |
Soft-delete obsolete or incorrect memories |
oc_lsp_goto_definition |
Jump to symbol definition via LSP |
oc_lsp_find_references |
Find all references to a symbol via LSP |
oc_lsp_symbols |
Document or workspace symbol search via LSP |
oc_lsp_diagnostics |
Get errors/warnings from language server |
oc_lsp_prepare_rename |
Check if a symbol can be renamed via LSP |
oc_lsp_rename |
Rename a symbol across the workspace via LSP |
src/
+-- index.ts Plugin entry -- registers tools, hooks, fallback handlers
+-- config.ts Zod-validated config with v1->v7 migration chain
+-- installer.ts Self-healing asset copier (COPYFILE_EXCL, overwrites installer-managed files only)
+-- registry/
| +-- types.ts GroupId, AgentEntry, GroupDefinition, DiversityRule, ...
| +-- model-groups.ts AGENT_REGISTRY, GROUP_DEFINITIONS, DIVERSITY_RULES
| +-- resolver.ts Model resolution: override > group > default
| +-- diversity.ts Adversarial diversity checker
| +-- doctor.ts Shared diagnosis logic (CLI + tool)
+-- tools/ Tool definitions (thin wrappers calling *Core functions)
+-- templates/ Pure functions: input -> markdown string
+-- review/ 13-agent review engine, stack gate, memory, severity
+-- orchestrator/
| +-- handlers/ Per-phase state machine handlers
| +-- fallback/ Model fallback: classifier, manager, state, chain resolver
| +-- artifacts.ts Phase artifact path management
| +-- lesson-memory.ts Cross-run lesson persistence
| +-- schemas.ts Pipeline state Zod schemas
+-- background/ Background task management with slot-based concurrency
| +-- database.ts SQLite persistence for task state
| +-- state-machine.ts Task lifecycle (queued -> running -> completed/failed)
| +-- slot-manager.ts Concurrent slot allocation and limits
| +-- executor.ts Task execution with timeout handling
| +-- manager.ts High-level API combining all background components
+-- autonomy/ Autonomy loop with verification checkpoints
| +-- state.ts Loop state tracking (iterations, context accumulation)
| +-- completion.ts Completion detection via positive/negative signals
| +-- verification.ts Post-iteration verification (tests, lint, artifacts)
| +-- controller.ts Loop lifecycle management (start, iterate, stop)
+-- routing/ Category-based task routing
| +-- categories.ts Category definitions with model and skill mappings
| +-- classifier.ts Intent classification from task descriptions
| +-- engine.ts Routing engine combining classification + delegation
+-- recovery/ Session recovery and failure resilience
| +-- classifier.ts Failure classification (transient, permanent, partial)
| +-- strategies.ts Recovery strategies (retry, fallback, checkpoint)
| +-- orchestrator.ts Strategy selection and execution
| +-- persistence.ts Checkpoint save/restore via SQLite
+-- context/ Context window management and injection
| +-- discovery.ts Active context discovery from session state
| +-- budget.ts Token budget allocation across injection sources
| +-- injector.ts System prompt injection orchestrator
| +-- compaction-handler.ts Context compaction when approaching limits
+-- ux/ User experience surfaces
| +-- notifications.ts Toast and inline notification system
| +-- progress.ts Progress tracking for multi-step operations
| +-- task-status.ts Task status formatting and display
| +-- context-warnings.ts Context usage warnings and suggestions
| +-- error-hints.ts Actionable error hints with fix suggestions
| +-- session-summary.ts End-of-session summary generation
+-- mcp/ MCP (Model Context Protocol) skill integration
| +-- types.ts MCP server and tool type definitions
| +-- manager.ts MCP config inventory and state tracking (no process lifecycle yet)
| +-- scope-filter.ts Scope-based MCP server filtering
+-- kernel/ Database primitives and concurrency
| +-- transaction.ts SQLite transactions with retry on SQLITE_BUSY
| +-- retry.ts Exponential backoff retry for busy errors
+-- logging/ Structured logging with sinks and rotation
+-- memory/ Smart dual-scope memory (project patterns + user preferences)
+-- observability/ Session observability and structured event logging
+-- skills/ Adaptive skill loading and injection
+-- health/ Plugin self-diagnostics
+-- utils/ Validators, paths, fs-helpers, gitignore management
bin/
+-- cli.ts CLI: install + doctor subcommands
Dependency flow (strictly top-down, no cycles):
index.ts -> tools/* -> registry/* + templates/* + utils/* + kernel/* -> Node built-ins + yaml
Key patterns:
- Declarative registry -- adding an agent = one line in AGENT_REGISTRY, everything derives from it
- Atomic writes -- all file operations use tmp + rename with crypto-random suffixes
- Immutable state -- deep-frozen data, spread operators, readonly arrays
- Bidirectional validation -- Zod parse on read AND write
- Self-healing assets -- bundled files copy on load, overwrite installer-managed files, never overwrite user customizations
git clone https://github.com/kodrunhq/opencode-autopilot.git
cd opencode-autopilot
bun install
bun test && bun run lint1790+ tests across 190 files. No build step -- Bun runs TypeScript natively.