subagent-fleet

Run Claude Code-style subagents across your local model fleet.

subagent-fleet is a config-first Python CLI for mapping coding subagents to the best Ollama model and machine you own, then generating LiteLLM and Claude Code-style agent configuration.

Quickstart • Configuration • Generated Files • Security • Roadmap

Overview

Local model users often have more than one useful machine: a laptop, a Mac mini, a workstation, a home server, or a spare GPU box. Most coding harnesses still point at one model endpoint.

subagent-fleet turns that setup into a private local subagent fleet:

planner     -> small fast model on a lightweight node
implementer -> larger coding model on a bigger node
reviewer    -> larger coding model on a bigger node
summarizer  -> small local model on the controller

It does not replace Ollama, LiteLLM, or Claude Code. It generates the glue between them:

Claude Code / coding harness
        |
        v
LiteLLM gateway generated by subagent-fleet
        |
        +-- Ollama node: laptop
        +-- Ollama node: Mac mini 64GB
        +-- Ollama node: workstation

Features

Validate a declarative fleet.yaml.
Discover models from configured Ollama nodes via /api/tags.
Generate litellm_config.yaml with ollama_chat/ routes.
Generate Claude Code-style .claude/agents/*.md files.
Generate .env.subagent-fleet for Claude Code/LiteLLM environment variables.
Warm configured Ollama models with keep_alive.
Show node health and agent routing tables.
Keep unreachable nodes isolated so one offline machine does not crash the whole workflow.
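
The discovery and isolation behavior above can be sketched roughly as follows. This is an illustration, not the project's actual code: the function names `parse_tags` and `discover` and the result shape are assumptions. Only the `/api/tags` endpoint and its `models[].name` response field come from Ollama.

```python
import json
import urllib.request


def parse_tags(payload: dict) -> list:
    """Extract sorted model names from an Ollama /api/tags response body."""
    return sorted(m["name"] for m in payload.get("models", []) if "name" in m)


def discover(nodes: dict, timeout: float = 3.0) -> dict:
    """Query each node independently (hypothetical helper).

    A node that is offline, slow, or misconfigured is recorded as
    unreachable instead of raising, so the other nodes are still reported.
    """
    results = {}
    for name, endpoint in nodes.items():
        url = endpoint.rstrip("/") + "/api/tags"
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                models = parse_tags(json.load(resp))
            results[name] = {"reachable": True, "models": models}
        except (OSError, ValueError) as exc:
            # OSError covers URLError, refused connections, and timeouts;
            # ValueError covers malformed URLs and invalid JSON bodies.
            results[name] = {"reachable": False, "models": [], "error": str(exc)}
    return results
```

Calling `discover({"m5-local": "http://localhost:11434"})` would return one entry per node, each marked reachable or not.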

Status

MVP CLI implemented.

Available commands:

subagent-fleet init
subagent-fleet validate
subagent-fleet discover
subagent-fleet generate
subagent-fleet warmup
subagent-fleet status
subagent-fleet doctor
subagent-fleet clean
subagent-fleet skills list
subagent-fleet skills install
subagent-fleet plugins install

Install

Choose one of the install paths below.

CLI from PyPI

Install the CLI directly from PyPI:

python -m pip install subagent-fleet

Or install it as an isolated command with pipx:

pipx install subagent-fleet

Verify:

subagent-fleet --help

Development Checkout

Use this when contributing to the project:

git clone https://github.com/adityak74/subagent-fleet.git
cd subagent-fleet
python -m pip install -e ".[dev]"

Run tests:

python -m pytest

Claude Code Plugin First

Install the plugin first from Claude Code, then let the bundled bootstrap skill install the CLI:

/plugin marketplace add https://github.com/adityak74/subagent-fleet
/plugin install subagent-fleet

After install, ask Claude Code:

Use the subagent-fleet bootstrap skill to install the CLI and set up this repo.

The bootstrap skill will run or recommend:

python -m pip install subagent-fleet
subagent-fleet skills install

Codex Plugin First

Install this repository as a local Codex marketplace:

codex plugin marketplace add .
codex plugin add subagent-fleet@subagent-fleet

Then ask Codex:

Use the subagent-fleet bootstrap skill to install the CLI and set up this repo.

Quickstart

Create a starter config:

subagent-fleet init

Edit fleet.yaml with your Ollama node endpoints and model names, then validate it:

subagent-fleet validate

Check which nodes are reachable:

subagent-fleet discover

Generate LiteLLM, Claude agent, and environment files:

subagent-fleet generate

Start LiteLLM:

export LITELLM_MASTER_KEY="sk-local-dev"

litellm \
  --config ./litellm_config.yaml \
  --host 127.0.0.1 \
  --port 4000

Point Claude Code at the local gateway:

source .env.subagent-fleet
claude
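
The contents of .env.subagent-fleet are not shown here. As a hedged sketch, a file like it could be rendered as below, assuming it exports `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN` (environment variables that Claude Code reads) and reuses the key named by `master_key_env`. The function name `render_env` is hypothetical, and the real generator may emit different variables.

```python
import shlex


def render_env(host: str, port: int, master_key_env: str) -> str:
    """Render shell export lines pointing Claude Code at a local gateway.

    Assumption: the gateway speaks plain HTTP on host:port, as in the
    LiteLLM command above.
    """
    base_url = shlex.quote(f"http://{host}:{port}")
    lines = [
        f"export ANTHROPIC_BASE_URL={base_url}",
        # Reference the already-exported key instead of writing a secret to disk.
        f'export ANTHROPIC_AUTH_TOKEN="${{{master_key_env}}}"',
    ]
    return "\n".join(lines) + "\n"


print(render_env("127.0.0.1", 4000, "LITELLM_MASTER_KEY"), end="")
```

Referencing the variable rather than its value keeps the generated file free of real keys, in line with the Security section.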

Configuration

subagent-fleet is driven by fleet.yaml.

project:
  name: local-dev
  gateway:
    provider: litellm
    host: 127.0.0.1
    port: 4000
    master_key_env: LITELLM_MASTER_KEY

nodes:
  m5-local:
    endpoint: http://localhost:11434
    tags: [controller, local, fast]

  m4-mini-64gb:
    endpoint: http://192.168.1.50:11434
    tags: [heavy, coder, reviewer]

  m4-mini-16gb:
    endpoint: http://192.168.1.51:11434
    tags: [small, planner, summarizer]

models:
  heavy-coder:
    node: m4-mini-64gb
    ollama_model: qwen2.5-coder:32b
    litellm_alias: claude-sonnet-local
    context: 32768
    timeout: 600
    max_parallel: 1

  small-coder:
    node: m4-mini-16gb
    ollama_model: qwen2.5-coder:7b
    litellm_alias: claude-haiku-local
    context: 8192
    timeout: 300
    max_parallel: 1

agents:
  planner:
    model: small-coder
    description: Use for planning, file discovery, task decomposition, and summarization.
    tools: [Read, Grep, Glob]
    prompt: |
      You are a fast local planning agent.
      Do not edit files.
      Return a concise response with:
      - plan
      - relevant files
      - risks
      - next recommended agent

  implementer:
    model: heavy-coder
    description: Use for implementation, bug fixes, refactors, and patch creation.
    tools: [Read, Grep, Glob, Edit, MultiEdit, Bash]

  reviewer:
    model: heavy-coder
    description: Use after implementation to review diffs, tests, regressions, and maintainability.
    tools: [Read, Grep, Glob, Bash]
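
The three sections reference each other: each agent names a model, and each model names a node. A minimal sketch of the kind of reference check `subagent-fleet validate` might perform is shown below. It operates on the already-parsed config as a plain dict. The function name `check_references` and the error wording are assumptions, not the tool's real output.

```python
def check_references(config: dict) -> list:
    """Return human-readable errors for dangling references and duplicate aliases."""
    errors = []
    nodes = config.get("nodes", {})
    models = config.get("models", {})

    seen_aliases = {}
    for name, model in models.items():
        # Every model must point at a node defined under `nodes`.
        if model.get("node") not in nodes:
            errors.append(f"model {name!r}: unknown node {model.get('node')!r}")
        # Aliases become LiteLLM model names, so they must be unique.
        alias = model.get("litellm_alias")
        if alias in seen_aliases:
            errors.append(
                f"model {name!r}: alias {alias!r} already used by {seen_aliases[alias]!r}"
            )
        else:
            seen_aliases[alias] = name

    for name, agent in config.get("agents", {}).items():
        # Every agent must point at a model defined under `models`.
        if agent.get("model") not in models:
            errors.append(f"agent {name!r}: unknown model {agent.get('model')!r}")
    return errors
```

An empty list means every reference resolves. A typo such as `model: heavy-coderr` on an agent would produce one error entry.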

Generated Files

Running:

subagent-fleet generate

creates:

litellm_config.yaml
.claude/agents/planner.md
.claude/agents/implementer.md
.claude/agents/reviewer.md
.env.subagent-fleet

Example LiteLLM route:

model_list:
  - model_name: claude-sonnet-local
    litellm_params:
      model: ollama_chat/qwen2.5-coder:32b
      api_base: http://192.168.1.50:11434
      api_key: ollama
      timeout: 600
    model_info:
      max_input_tokens: 32768
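
The mapping from a fleet.yaml model entry to this route is mechanical. A sketch of it, using a hypothetical helper named `litellm_route` and assuming the field mapping is exactly what the example above implies:

```python
def litellm_route(model: dict, nodes: dict) -> dict:
    """Build one LiteLLM model_list entry from a fleet.yaml model entry."""
    return {
        "model_name": model["litellm_alias"],
        "litellm_params": {
            # The ollama_chat/ prefix selects LiteLLM's Ollama chat provider.
            "model": f"ollama_chat/{model['ollama_model']}",
            # The route talks to the node that hosts the model.
            "api_base": nodes[model["node"]]["endpoint"],
            "api_key": "ollama",
            "timeout": model["timeout"],
        },
        "model_info": {"max_input_tokens": model["context"]},
    }


nodes = {"m4-mini-64gb": {"endpoint": "http://192.168.1.50:11434"}}
heavy = {
    "node": "m4-mini-64gb",
    "ollama_model": "qwen2.5-coder:32b",
    "litellm_alias": "claude-sonnet-local",
    "context": 32768,
    "timeout": 600,
}
route = litellm_route(heavy, nodes)
```

Here `route` holds the same values as the YAML entry shown above.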

Example Claude agent:

---
name: planner
description: Use for planning, file discovery, task decomposition, and summarization.
model: claude-haiku-local
tools: Read, Grep, Glob
---

You are a fast local planning agent.
Do not edit files.
Return a concise response with:
- plan
- relevant files
- risks
- next recommended agent
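
Note that the agent file's `model` field carries the LiteLLM alias (claude-haiku-local), not the fleet.yaml model key (small-coder). A rough sketch of how such a file could be rendered, with `render_agent` as a hypothetical name and the layout inferred from the example above:

```python
def render_agent(name: str, agent: dict, models: dict) -> str:
    """Render a Claude Code-style agent file: front matter, then the prompt."""
    # Resolve the fleet.yaml model key to the alias the gateway serves.
    alias = models[agent["model"]]["litellm_alias"]
    front_matter = [
        "---",
        f"name: {name}",
        f"description: {agent['description']}",
        f"model: {alias}",
        f"tools: {', '.join(agent['tools'])}",
        "---",
    ]
    # Agents without a prompt (like implementer above) get an empty body here.
    body = agent.get("prompt", "").strip()
    return "\n".join(front_matter) + "\n\n" + body + "\n"


models = {"small-coder": {"litellm_alias": "claude-haiku-local"}}
planner = {
    "model": "small-coder",
    "description": "Use for planning, file discovery, task decomposition, and summarization.",
    "tools": ["Read", "Grep", "Glob"],
    "prompt": "You are a fast local planning agent.\nDo not edit files.\n",
}
text = render_agent("planner", planner, models)
```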

Commands

| Command | Purpose |
| --- | --- |
| `subagent-fleet init` | Create a starter `fleet.yaml`. |
| `subagent-fleet validate` | Validate schema, references, URLs, aliases, and agent names. |
| `subagent-fleet discover` | Query configured Ollama nodes for available models. |
| `subagent-fleet generate` | Generate LiteLLM config, Claude agents, and env file. |
| `subagent-fleet warmup` | Preload configured Ollama models with `keep_alive`. |
| `subagent-fleet status` | Show node health and agent routing. |
| `subagent-fleet doctor` | Show validation and local-network safety guidance. |
| `subagent-fleet clean` | List or remove generated files. |
| `subagent-fleet skills list` | List bundled assistant skills and supported targets. |
| `subagent-fleet skills install` | Install assistant-facing setup and operations skills. |
| `subagent-fleet plugins install` | Install Claude Code and Codex plugin marketplace bundles. |

JSON output is available for discovery and status:

subagent-fleet discover --json
subagent-fleet status --json

Assistant Skills

subagent-fleet ships assistant-facing skills that teach Claude Code, Codex, OpenCode, and similar tools how to set up and operate the fleet from inside a repository.

List bundled skills and supported targets:

subagent-fleet skills list

Install all bundled skills for all supported targets:

subagent-fleet skills install

This writes:

.claude/skills/subagent-fleet-setup/SKILL.md
.claude/skills/subagent-fleet-operations/SKILL.md
.codex/skills/subagent-fleet-setup/SKILL.md
.codex/skills/subagent-fleet-operations/SKILL.md
.opencode/skills/subagent-fleet-setup/SKILL.md
.opencode/skills/subagent-fleet-operations/SKILL.md

Install for a specific assistant:

subagent-fleet skills install --target codex
subagent-fleet skills install --target claude-code
subagent-fleet skills install --target opencode

Install one bundled skill:

subagent-fleet skills install --skill subagent-fleet-setup

Existing skill files are not overwritten unless you pass --force.

Plugin Marketplaces

This repository also ships plugin marketplace metadata so users can install the assistant skill first, then let that skill install and verify the Python CLI.

Included plugin artifacts:

.claude-plugin/marketplace.json
.agents/plugins/marketplace.json
plugins/subagent-fleet/.claude-plugin/plugin.json
plugins/subagent-fleet/.codex-plugin/plugin.json
plugins/subagent-fleet/skills/subagent-fleet-bootstrap/SKILL.md
plugins/subagent-fleet/skills/subagent-fleet-setup/SKILL.md
plugins/subagent-fleet/skills/subagent-fleet-operations/SKILL.md

The bootstrap skill teaches Claude Code or Codex how to install the CLI:

python -m pip install subagent-fleet

and then install repo-local assistant skills:

subagent-fleet skills install

Claude Code plugin install flow:

/plugin marketplace add https://github.com/adityak74/subagent-fleet
/plugin install subagent-fleet

Codex local marketplace flow:

codex plugin marketplace add .
codex plugin add subagent-fleet@subagent-fleet

To generate the same marketplace/plugin bundle into another directory:

subagent-fleet plugins install --out /path/to/marketplace-root

Install only one target:

subagent-fleet plugins install --target claude-code
subagent-fleet plugins install --target codex

Existing plugin marketplace files are not overwritten unless you pass --force.

Ollama Worker Setup

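On each worker machine, make Ollama listen on an address the controller can reach. The commands below are for macOS (they use launchctl). Setting OLLAMA_HOST to "0.0.0.0:11434" binds every network interface, so keep the machine behind a firewall or on a private network as described under Security: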

launchctl setenv OLLAMA_HOST "0.0.0.0:11434"
launchctl setenv OLLAMA_KEEP_ALIVE "-1"
launchctl setenv OLLAMA_NUM_PARALLEL "1"
launchctl setenv OLLAMA_MAX_LOADED_MODELS "1"

killall Ollama
open -a Ollama

From the controller:

curl http://NODE_IP:11434/api/tags
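
Once a node answers, a model can be preloaded from the controller. The sketch below builds the kind of request `subagent-fleet warmup` might send, on the assumption that it uses Ollama's `/api/generate` endpoint: a request with a model name and no prompt loads the model, and `keep_alive` controls how long it stays in memory. The helper name `warmup_request` is hypothetical.

```python
import json
import urllib.request


def warmup_request(endpoint: str, model: str, keep_alive=-1) -> urllib.request.Request:
    """Build a POST that asks Ollama to load `model` without generating text.

    keep_alive=-1 keeps the model loaded indefinitely, matching the
    OLLAMA_KEEP_ALIVE setting above; a duration string such as "10m" also works.
    """
    body = json.dumps({"model": model, "keep_alive": keep_alive}).encode("utf-8")
    return urllib.request.Request(
        endpoint.rstrip("/") + "/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = warmup_request("http://192.168.1.50:11434", "qwen2.5-coder:32b")
```

Sending it is then a matter of `urllib.request.urlopen(req, timeout=600)`. Loading a large model can take a while, so a generous timeout is a reasonable choice.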

Security

subagent-fleet assumes private local networking.

Do:

Use LAN, firewall rules, Tailscale, WireGuard, or a private subnet.
Keep LITELLM_MASTER_KEY set for LiteLLM access.
Treat generated .env.subagent-fleet files as local developer configuration.

Do not:

Expose Ollama directly to the public internet.
Expose LiteLLM without authentication.
Commit real API keys, LAN secrets, or machine-specific private .env files.

Run:

subagent-fleet doctor

for local setup and safety reminders.

Development

Install dev dependencies:

python -m pip install -e ".[dev]"

Run tests:

python -m pytest

Run a focused test:

python -m pytest tests/test_config.py

Check CLI wiring:

python -m subagent_fleet.cli --help

Project Layout

src/subagent_fleet/
  cli.py
  config.py
  discovery.py
  plugins.py
  warmup.py
  status.py
  skills.py
  generators/
  skill_templates/
  templates/

examples/
plugins/
tests/

Roadmap

MVP:

Dynamic routing by task type
Fallback model generation
Queue-aware scheduling
Agent execution trace viewer
Support for vLLM, LM Studio, llama.cpp, OpenRouter, and cloud APIs

Contributing

Issues and pull requests are welcome.

Good first areas:

More generator tests
Additional example fleets
Better status formatting
More robust Ollama error reporting
Documentation for real multi-machine setups

Before opening a PR:

python -m pytest

What This Is Not

subagent-fleet is not:

an inference engine
a replacement for Ollama
a replacement for LiteLLM
a model sharding framework
Kubernetes for local LLMs
a public model hosting platform

It is a small workflow layer for private local subagent orchestration.

License

MIT. See LICENSE.
