Skip to content

unohee/OpenSwarm

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

331 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

OpenSwarm

npm version npm downloads license SWE-bench Lite GitHub Discussions

Autonomous AI agent orchestrator — Codex, GPT, OpenRouter (any model), local models (Ollama/LM Studio), and Claude Code (claude -p)

💬 Help shape OpenSwarm. Share feature ideas, vote on the roadmap, and ask questions in GitHub Discussions. The roadmap is built in the open — your feedback decides what ships next.


OpenSwarm orchestrates multiple AI agents as autonomous code workers. It picks up issues from Linear or a built-in local tracker, runs Worker/Reviewer pair pipelines, reports through a pluggable notifier (Discord, Slack, Telegram, webhook), and retains long-term memory via LanceDB. Workers run on OpenAI Codex/GPT, any OpenRouter model, local open-source models (Ollama, LM Studio), or Claude Code (claude -p, opt-in) — with cost-aware routing measured on an L0–L6 benchmark ladder.

Verified on real GitHub issues: the agentic harness solves SWE-bench Lite instances graded by the official harness. Hybrid mode — a frontier model diagnoses read-only, a lightweight model implements with a verification loop — resolved 3/3 attempted instances that every single lightweight model had failed, at a fraction of frontier-only cost. Workers also learn each repository over time: task outcomes are stored as per-repo knowledge and recalled into future prompts. (benchmark rubric & results)

Quick Start

npm install -g @intrect/openswarm
openswarm init         # interactive setup wizard — provider auth + Linear OAuth + config
openswarm doctor       # verify your environment (runtime, native deps, providers, ports)
openswarm              # launches the TUI chat

openswarm init walks you through provider authentication, optional Linear OAuth (team/project picker), and writes a validated config.yaml. Prefer wiring a provider by hand? You need one first: openswarm auth login (ChatGPT OAuth, used by codex/gpt), openswarm auth login --provider openrouter (or export OPENROUTER_API_KEY=…), or just have an authenticated claude on PATH. Check what's wired with openswarm auth status, and diagnose any gaps with openswarm doctor.

What openswarm init sets up

The wizard asks three questions, detects what you already have, and writes the config for you:

  1. AI provider (worker/reviewer) — it auto-detects existing auth and offers inline login:
    • codex-responses — ChatGPT subscription via OAuth (Codex models, native loop) — easiest start
    • codex — external codex CLI · openrouter — any model (API key/OAuth) · gpt — OpenAI OAuth
    • lmstudio / local — local servers, no account · claudeclaude -p CLI (opt-in fallback)
  2. Task backendlocal SQLite issue store (no account) or linear (OAuth browser login or API key, then an arrow-key team → project picker for this repo)
  3. Notification channel (optional) — none / discord / slack / telegram / webhook

It then writes .env (secrets, chmod 600), config.yaml (validated), and — if you mapped a Linear project — openswarm.json (this repo → Linear team/project). Finally it prints next steps and can launch browser OAuth.

Re-running in a repo that already has config.yaml is refused unless you pass --force, and init refuses to overwrite a config.yaml that symlinks into the daemon's global config. For CI / non-interactive use, openswarm init --yes writes a sample config only.

TUI Chat Interface

TUI keyboard shortcuts

Key Action
Tab Switch tabs (Chat / Projects / Tasks / Stuck / Issues / Logs)
Enter Send message
Shift+Enter Newline
i Focus input
Esc Exit input focus
Ctrl+C Quit

Status bar shows: provider · model · message count · cumulative cost


CLI Commands

openswarm                        # TUI chat (default)
openswarm chat [session]         # Simple readline chat
openswarm resume                 # Reopen the most recent chat session (conversation + goal)
openswarm start                  # Start full daemon (requires config.yaml)
openswarm run "Fix the bug" -p ~/my-project   # Run a single task
openswarm exec "Run tests" --local --pipeline # Execute via daemon
openswarm init                   # Interactive setup wizard (provider auth, Linear OAuth, config)
openswarm doctor                 # Diagnose environment (runtime, native deps, providers, ports)
openswarm validate               # Validate config.yaml

# Code review
openswarm review                 # Review the working-tree changes
openswarm review --max           # Full-codebase audit: fan reviewer subagents over areas
                                 #   → report at .openswarm/audit/ + PM-synthesized Linear
                                 #   issues by default (≤10 cohesive, master + sub-issues)
openswarm review --max --fix     # after the audit, a worker per flagged area applies the
                                 #   fixes in the working tree (review the diff, then commit)
openswarm review --max --concurrency 8   # widen the fan-out — areas auto-split to fill the pool
                                 # more --max flags: --no-linear (report only) · --issues-per-area
                                 #   (legacy spray) · --issues <id> (set parent) · --fallback
                                 #   <adapter> · --out <file> · --dry-run (print the plan)

# CI / test gate auto-fix
openswarm fix                    # Run the checks (lint/typecheck/build/test from package.json),
                                 #   fan a fix-worker out over the failures, re-run until green
openswarm fix --checks lint,test # only these checks · --concurrency <n> · --rounds <n> (default 3)

# Code Registry & BS Detector
openswarm check --scan           # Scan repo → register all entities
openswarm check src/foo.ts       # File brief (entities, tests, risk)
openswarm check --bs             # BS pattern scan (bad code smells)
openswarm check --stats          # Registry statistics
openswarm check --high-risk      # High-risk entities
openswarm check --search "name"  # Full-text search
openswarm annotate "funcName" --deprecate "reason"
openswarm annotate "funcName" --tag "needs-refactor"
openswarm annotate "funcName" --warn "error/security: SQL injection"

openswarm exec options

Option Description
--path <path> Project path (default: cwd)
--timeout <seconds> Timeout in seconds (default: 600)
--local Execute locally without daemon
--pipeline Full pipeline: worker + reviewer + tester + documenter
--worker-only Worker only, no review
-m, --model <model> Model override for worker

Exit codes: 0 success · 1 failure · 2 timeout


Full Daemon Setup

For autonomous operation (Linear issue processing, Discord control, PR auto-improvement), you need a full config:

Prerequisites

  • Node.js >= 22
  • At least one LLM provider:
    • OpenAI Codexcodex-responses (ChatGPT OAuth, native loop, no extra binary) is the smoothest start; codex delegates to the external Codex CLI. openswarm auth login handles the ChatGPT OAuth
    • OpenRouter — any model; OPENROUTER_API_KEY or openswarm auth login --provider openrouter
    • OpenAI GPTopenswarm auth login --provider gpt
    • Local — LM Studio (lmstudio, :1234) or Ollama (local, :11434), auto-detected, no auth
    • Claude Code CLI (claude -p) — opt-in fallback; an authenticated claude on PATH
  • Native build toolchainbetter-sqlite3 and @lancedb/lancedb are native modules. Prebuilt binaries cover common platforms; if yours lacks one, npm install builds from source and needs python3 + a C/C++ toolchain (build-essential on Linux, Xcode Command Line Tools on macOS)
  • For autonomous mode only (optional): Linear — sign in with openswarm auth login --provider linear (OAuth PKCE) or use an API key + team ID; Discord bot token (message content intent); GitHub CLI (gh) for CI monitoring

Configuration

After the global install, run the wizard in the directory you want the daemon to manage — it writes everything for you:

openswarm init      # writes config.yaml + .env (provider, task backend, notifications)
openswarm doctor    # verify providers, native deps, ports

See What openswarm init sets up for the prompts. Prefer to edit by hand? config.yaml supports ${VAR} / ${VAR:-default} substitution (resolved from .env) and is validated with Zod. A minimal .env (the wizard writes only what your choices need):

LINEAR_API_KEY=your-linear-api-key      # or: openswarm auth login --provider linear
LINEAR_TEAM_ID=your-linear-team-id
DISCORD_TOKEN=your-discord-bot-token    # only if you chose the discord notifier
DISCORD_CHANNEL_ID=your-channel-id

Key configuration sections

Section Description
discord Bot token, channel ID, webhook URL
linear API key, team ID
github Repos list for CI monitoring
agents Agent definitions (name, projectPath, heartbeat interval)
autonomous Schedule, pair mode, role models, decomposition settings
prProcessor PR auto-improvement schedule, retry limits, conflict resolver config

CLI Adapter (Provider)

adapter: codex   # one of: codex · codex-responses · gpt · openrouter · lmstudio · local  (default: codex)

adapter accepts one of the six values below (validated by Zod). For a ChatGPT subscription, codex-responses is the smoothest first-run choice — it runs OpenSwarm's native loop over the Responses API with no extra binary. Switch at runtime via Discord, e.g. !provider codex-responses / !provider openrouter.

Adapter Backend Models Auth
codex-responses OpenAI Responses API (native loop, no CLI binary) gpt-5-codex (default), o3, o4-mini ChatGPT OAuth
codex OpenAI Codex CLI (delegated) gpt-5-codex (default), o3, o4-mini ChatGPT OAuth / codex CLI auth
gpt OpenAI Chat API gpt-4o (default), o3, … OAuth PKCE
openrouter OpenRouter API (native agentic loop) any OpenRouter model — gpt-5, gemini-2.5, deepseek, glm, qwen, … OPENROUTER_API_KEY or OAuth PKCE
lmstudio LM Studio (OpenAI-compatible, local) loaded LM Studio model (LMSTUDIO_MODEL) None
local Ollama (local, auto-detected) gemma, llama, qwen, mistral, … None

Claude Code (claude -p) is supported as an opt-in fallback (and powers the claude -p chat path) — install the claude CLI and authenticate it; openswarm init and openswarm doctor detect it. It is not a selectable adapter: value.

The openrouter adapter runs OpenSwarm's own agentic tool loop (read/search/edit/bash with verification guards), enables ZDR (data_collection: deny) for non-OpenAI models, and applies Anthropic prompt caching automatically. Local backends are auto-detected on standard ports (Ollama :11434, LM Studio :1234); use lmstudio for a dedicated LM Studio endpoint (LMSTUDIO_BASE_URL, default http://localhost:1234).

Per-role adapter overrides (each role may pick its own valid adapter + model):

autonomous:
  defaultRoles:
    worker:
      adapter: codex-responses
      model: gpt-5-codex
    reviewer:
      adapter: openrouter
      model: anthropic/claude-sonnet-4

Agent Roles

autonomous:
  defaultRoles:
    worker:
      model: gpt-5-codex
      escalateModel: openai/gpt-5     # escalate after repeated review failures
      escalateAfterIteration: 3
      timeoutMs: 1800000
    reviewer:
      model: gpt-5-codex
      timeoutMs: 600000
    tester:
      enabled: false
    documenter:
      enabled: false
    auditor:
      enabled: false

Running the daemon

With the global install, the openswarm CLI manages the daemon directly — no repo or npm run scripts needed:

openswarm start               # start the daemon in the background
openswarm start --foreground  # run attached (logs stream to the terminal)
openswarm status              # pid, uptime, log path
openswarm stop                # stop the daemon
openswarm dash                # open the web dashboard (:3847)

From source / development (contributors): clone the repo and use the npm run … scripts (npm run dev, npm start, npm run service:install for a macOS launchd service, docker compose up -d). See CONTRIBUTING.md.


Architecture

                         ┌──────────────────────────┐
                         │       Linear API          │
                         │   (issues, state, memory) │
                         └─────────────┬────────────┘
                                       │
                 ┌─────────────────────┼─────────────────────┐
                 │                     │                     │
                 v                     v                     v
  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐
  │ AutonomousRunner │  │  DecisionEngine  │  │  TaskScheduler   │
  │ (heartbeat loop) │─>│  (scope guard)   │─>│  (queue + slots) │
  └────────┬─────────┘  └──────────────────┘  └────────┬─────────┘
           │                                            │
           v                                            v
  ┌──────────────────────────────────────────────────────────────┐
  │                      PairPipeline                            │
  │  ┌────────┐   ┌──────────┐   ┌────────┐   ┌─────────────┐  │
  │  │ Worker │──>│ Reviewer │──>│ Tester │──>│ Documenter  │  │
  │  │(Adapter│<──│(Adapter) │   │(Adapter│   │  (Adapter)  │  │
  │  └───┬────┘   └──────────┘   └────────┘   └─────────────┘  │
  │      │  ↕ StuckDetector                                      │
  │  ┌───┴────────────────────────────────────────────────────┐  │
  │  │ Adapters: Codex | GPT | OpenRouter | Local (Ollama)   │  │
  │  └────────────────────────────────────────────────────────┘  │
  └──────────────────────────────────────────────────────────────┘
           │                     │                     │
           v                     v                     v
  ┌──────────────┐  ┌──────────────────┐  ┌──────────────────┐
  │  Discord Bot │  │  Memory (LanceDB │  │  Knowledge Graph │
  │  (commands)  │  │  + Xenova E5)    │  │  (code analysis) │
  └──────────────┘  └──────────────────┘  └────────┬─────────┘
                                                    │
                                           ┌────────┴─────────┐
                                           │  Code Registry   │
                                           │  (SQLite + FTS5) │
                                           │  + BS Detector   │
                                           └──────────────────┘

Features

  • Multi-Provider Adapters — Pluggable adapter system: OpenAI Codex/GPT, OpenRouter (any model, native agentic loop), local models (Ollama, LM Studio), and Claude Code (claude -p, opt-in) with runtime provider switching
  • Code Registry — SQLite-backed entity registry tracking every function/class/type across 8 languages, with complexity scoring, test mapping, and risk assessment
  • BS Detector — Built-in static analysis engine that detects bad code patterns (empty catch, hardcoded secrets, as any, etc.) with pipeline guard integration
  • Autonomous Pipeline — Cron-driven heartbeat fetches Linear issues, runs Worker/Reviewer pair loops, and updates issue state automatically
  • Worker/Reviewer Pairs — Multi-iteration code generation with automated review, testing, and documentation stages
  • Codebase Audit (review --max) — fans reviewer subagents out over directory-shaped areas (auto-split to fill --concurrency), aggregates a deduped verdict into a markdown report, and synthesizes ≤10 cohesive Linear issues via a PM agent. --fix sends a worker per flagged area to apply the fixes in the working tree. Language-agnostic; codex usage-limit aware with automatic claude fallback
  • CI / test gate auto-fix (openswarm fix) — runs the project's objective checks (lint / typecheck / build / test), groups the failures by file into areas, fans a fix-worker out over each, then re-runs the checks and repeats until green (or the round budget). Deterministic convergence — unlike the review fix pass, it verifies its own work
  • Decision Engine — Scope validation, rate limiting, priority-based task selection, and workflow mapping
  • Cognitive Memory — LanceDB vector store with Xenova/multilingual-e5-base embeddings for long-term recall across sessions
  • Repo Knowledge Loop — workers learn each repository over time: task outcomes (success patterns, review-rejection pitfalls) are stored per-repo and recalled into the next worker prompt
  • SWE-bench Verified — the agentic harness solves real SWE-bench Lite issues, graded by the official harness; hybrid mode (frontier diagnosis + lightweight implementer) resolved 3/3 attempted instances (benchmarks/RUBRIC.md)
  • Knowledge Graph — Static code analysis, dependency mapping, impact analysis, and file-level conflict detection across concurrent tasks
  • Discord Control — Full command interface for monitoring, task dispatch, scheduling, provider switching, and pair session management
  • Rich TUI Chat — Claude Code inspired terminal interface with tabs, streaming responses, and geek-themed loading messages
  • Dynamic Scheduling — Cron-based job scheduler with Discord management commands
  • PR Auto-Improvement — Monitors open PRs, auto-fixes CI failures, auto-resolves merge conflicts, and retries until all checks pass
  • Long-Running Monitors — Track external processes (training jobs, batch tasks) and report completion
  • Web Dashboard — Real-time pipeline stages, cost tracking, worktree status, and live logs on port 3847
  • Pace Control — 5-hour rolling window task caps, per-project limits, turbo mode, exponential backoff on failures
  • i18n — English and Korean locale support

How It Works

Linear (Todo/In Progress)
  → Fetch assigned issues
  → DecisionEngine filters & prioritizes
  → Resolve project path via projectMapper
  → PairPipeline.run()
    → Worker generates code (via the configured adapter)
    → Reviewer evaluates (APPROVE/REVISE/REJECT)
    → Loop up to N iterations
    → Optional: Tester → Documenter stages
  → Update Linear issue state (Done/Blocked)
  → Report to Discord
  → Save to cognitive memory

Memory System

Hybrid retrieval: 0.55 × similarity + 0.20 × importance + 0.15 × recency + 0.10 × frequency

Memory types: belief · strategy · user_model · system_pattern · constraint

Background: decay, consolidation, contradiction detection, distillation.

Repo knowledge loop — every completed task writes repo-scoped knowledge (success → system_pattern with files changed + approach, review rejection → constraint pitfall), and the next task on the same repo recalls the most relevant entries into the worker prompt as a "Repository Knowledge" section. Workers get better at a codebase the more they work on it. Workers can also actively query accumulated repo knowledge mid-task via the search_memory tool.

Benchmarks (L0–L6)

benchmarks/ contains a difficulty ladder for routing models by measured capability — synthetic L0–L5 tasks with deterministic grading, and L6 = real GitHub issues (SWE-bench Lite) solved by the OpenSwarm harness and graded by the official swebench harness. Headline: hybrid mode (frontier read-only diagnosis + lightweight implementer with a verification loop) resolved 3/3 attempted instances that every single lightweight model had failed. See benchmarks/RUBRIC.md for the rubric, measured results, and the harness defects the benchmark uncovered.


Discord Commands

Task Dispatch

Command Description
!dev <repo> "<task>" Run a dev task on a repository
!dev list List known repositories
!tasks List running tasks
!cancel <taskId> Cancel a running task

Agent Management

Command Description
!status Agent and system status
!pause <session> Pause autonomous work
!resume <session> Resume autonomous work
!log <session> [lines] View recent output

Linear Integration

Command Description
!issues List Linear issues
!issue <id> View issue details
!limits Agent daily execution limits

Autonomous Execution

Command Description
!auto Execution status
!auto start [cron] [--pair] Start autonomous mode
!auto stop Stop autonomous mode
!auto run Trigger immediate heartbeat
!approve / !reject Approve or reject pending task

Worker/Reviewer Pair

Command Description
!pair Pair session status
!pair start [taskId] Start a pair session
!pair run <taskId> [project] Direct pair run
!pair stop [sessionId] Stop a pair session
!pair history [n] View session history
!pair stats View pair statistics

Scheduling

Command Description
!schedule List all schedules
!schedule run <name> Run a schedule immediately
!schedule toggle <name> Enable/disable a schedule
!schedule add <name> <path> <interval> "<prompt>" Add a schedule
!schedule remove <name> Remove a schedule

Other

Command Description
!ci GitHub CI failure status
!provider <codex|codex-responses|openrouter|gpt|lmstudio|local> Switch CLI provider at runtime
!codex Recent session records
!memory search "<query>" Search cognitive memory
!help Full command reference

Project Structure

src/
├── index.ts                 # Entry point
├── cli.ts                   # CLI entry point (run, exec, chat, init, validate, start)
├── cli/                     # CLI subcommand handlers
│   └── promptHandler.ts     # exec command: daemon submit, auto-start, polling
├── core/                    # Config, service lifecycle, types, event hub
├── adapters/                # Provider adapters (codex, codex-responses, gpt, openrouter, local, lmstudio), agentic loop
├── agents/                  # Worker, reviewer, tester, documenter, auditor
│   ├── pairPipeline.ts      # Worker → Reviewer → Tester → Documenter pipeline
│   ├── agentBus.ts          # Inter-agent message bus
│   └── cliStreamParser.ts   # Claude CLI output parser
├── orchestration/           # Decision engine, task parser, scheduler, workflow
├── automation/              # Autonomous runner, cron scheduler, PR processor
├── memory/                  # LanceDB + Xenova embeddings cognitive memory
├── knowledge/               # Code knowledge graph (scanner, analyzer, graph)
├── registry/                # Code entity registry, BS detector, entity scanner
├── issues/                  # Local issue tracker (SQLite + GraphQL + Kanban UI)
├── discord/                 # Bot core, command handlers, pair session UI
├── linear/                  # Linear SDK wrapper, project updater
├── github/                  # GitHub CLI wrapper for CI monitoring
├── support/                 # Web dashboard, planner, rollback, git tools
├── locale/                  # i18n (en/ko) with prompt templates
└── __tests__/               # Vitest test suite

State & Data

Path Description
~/.openswarm/ State directory (memory, codex, metrics, workflows)
~/.openswarm/registry.db Code entity registry (SQLite)
~/.openswarm/issues.db Local issue tracker (SQLite)
~/.claude/openswarm-*.json Pipeline history and task state
~/.config/openswarm/telemetry.json Anonymous install id + opt-out notice flag
config.yaml Main configuration
dist/ Compiled output

Privacy & Telemetry

OpenSwarm collects anonymous, opt-out usage telemetry to understand how it's actually used (npm downloads and GitHub stars don't tell us). A single event is sent per command invocation.

What is sent (and nothing else):

Field Example Why
Random install id V1StGXR8_Z5j... (nanoid) De-duplicate installs — anonymous, local-only
Command start, run, chat Which features are used
Version 0.9.3 Version adoption
OS / arch darwin / arm64 Platform support priorities
Node version 22.3.0 Runtime support
Adapter family codex Which providers are popular
Error flag 0 / 1 Failure rate (boolean only)

Never sent: source code, prompts, file paths, repo or issue names, Linear/Discord content, environment variables, API keys, or any personal data.

How to opt out (any one):

export OPENSWARM_TELEMETRY=0      # or DO_NOT_TRACK=1
# config.yaml
telemetry:
  enabled: false

CI environments (CI / GITHUB_ACTIONS) are excluded automatically. The collector is a Cloudflare Worker writing to a private D1 table; telemetry never blocks or slows the CLI (fire-and-forget with a short timeout, and failures are silently ignored).

Tech Stack

Category Technology
Runtime Node.js 22+ (ESM)
Language TypeScript (strict mode)
Build tsc
Agent Execution Claude Code, OpenAI GPT/Codex, Ollama/LMStudio/llama.cpp
Local DB better-sqlite3 (WAL mode, FTS5)
Task Management Linear SDK (@linear/sdk)
Communication Discord.js 14
Vector DB LanceDB + Apache Arrow
Embeddings Xenova/transformers (multilingual-e5-base, 768D)
Scheduling Croner
Config YAML + Zod validation
Linting oxlint
Testing Vitest

Changelog

Full version history lives in CHANGELOG.md and the GitHub Releases page.

Latest — v0.14.0: auto-release on version bump (merge → npm publish + tag + GitHub release) and a CLI update notifier. See CHANGELOG.md for the rest.


Troubleshooting

Korean / multibyte input doubles over mobile SSH (e.g. Termius)

If the chat TUI shows each Hangul (or other multibyte) character twice — 이이렇렇게 쓰쓰이는것 — while ASCII characters look fine, the cause is almost always client-side local / predictive echo in the mobile SSH app drawing an extra copy of wide characters. The keystroke reaches OpenSwarm once; the terminal paints it twice.

Fix it in the SSH client:

  • Termius → Host/Terminal settings → turn Local Echo (a.k.a. predictive echo) off, and ensure the encoding is UTF-8.

  • Confirm the server side is fine by running with diagnostics:

    OPENSWARM_DEBUG_INPUT=1 openswarm chat

    Type a few Korean characters, then inspect ~/.openswarm/input-debug.log. If a single keypress logs one code point (cp=[51060]) but you saw two glyphs, the doubling is terminal echo (client-side). If it logs the code point twice in one event, it's an app-level issue — please attach the log to a bug report.


Contributing

Contributions are welcome — OpenSwarm is MIT-licensed and accepts pull requests from anyone. See CONTRIBUTING.md for development setup, the local check gates, branch/commit conventions, and the PR process. By participating you agree to the Code of Conduct.

  • 🐛 Report a bug
  • 💡 Share an idea — the roadmap is built in the open
  • 🔧 Fork the repo, branch from main, and open a PR (CI runs lint → typecheck → build → test)
  • 🔒 Found a security issue? See SECURITY.md — please don't file it publicly

License

MIT © Heewon Oh

About

OpenSwarm — Autonomous AI dev team orchestrator powered by Claude Code CLI. Discord control, Linear integration, cognitive memory.

Topics

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Sponsor this project

Contributors