Filo

Filo is a high-performance AI coding assistant written in modern C++.

It runs in multiple runtime modes:

interactive terminal app (TUI)
non-interactive prompter mode for scripts/CI
MCP server over stdio
HTTP daemon exposing MCP and/or OpenAI/Anthropic-compatible API endpoints

Why Filo

Filo focuses on speed, control, and local-first workflows without giving up multi-provider flexibility.

What Is Different In Filo

1) Local-first architecture

Local providers are first-class: Ollama over localhost and embedded llama.cpp for in-process GGUF inference (FILO_ENABLE_LLAMACPP=ON).
Router guardrails can exempt providers flagged as local (enforce_on_local: false), keeping embedded local backends available when remote limits are hit.
The daemon listens on 127.0.0.1 by default.

2) Embedded smart routing

In-process router engine with policy rules and strategies: smart, fallback, latency, load_balance.
Automatic fallback chains with per-candidate retries.
Guardrails for spend and quota reserves (max_session_cost_usd, token/request/window reserve ratios).
Auto-classifier that scores prompt complexity and routes to fast/balanced/powerful tiers.

3) Embedded Python runtime

Built-in python tool executes code inside an embedded interpreter.
Interpreter state persists across calls (variables/imports/functions carry over).
Optional venv isolation via FILO_PYTHON_VENV.
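
The persistence behaviour described above can be pictured with a small sketch in plain Python. This is not Filo's implementation: `PersistentInterpreter` and `run` are hypothetical names, and the only assumption is that every tool call executes in one shared namespace.

```python
import contextlib
import io


class PersistentInterpreter:
    """Hypothetical stand-in for an embedded interpreter with shared state."""

    def __init__(self):
        # One namespace reused by every call, so definitions carry over.
        self.namespace = {"__name__": "__filo_sketch__"}

    def run(self, code: str) -> str:
        """Execute code in the shared namespace and return captured stdout."""
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, self.namespace)
        return buffer.getvalue()


interp = PersistentInterpreter()
interp.run("import math\nradius = 2.0")                    # call 1: import + variable
interp.run("def area(r):\n    return math.pi * r * r")     # call 2: function
output = interp.run("print(round(area(radius), 2))")       # call 3: uses all of the above
```

The third call succeeds only because the import, the variable, and the function from earlier calls are still present in the namespace.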

Feature Highlights

C++26 core with streaming-first provider protocols
TUI built with FTXUI
Context mentions (@file, quoted paths, and escaped paths like @My\ Folder/file.txt)
Agent Skills support with .filo/skills and on-demand activation
Ctrl+V clipboard paste support (text paste and clipboard-image insertion as @"<path>")
Session persistence and resume
Global + workspace config layering
MCP dispatcher shared across stdio and HTTP transports
OAuth and API-key credentials

Prerequisites

CMake >= 3.28
C++26 compiler (GCC 15+ or Clang 17+ recommended)
OpenSSL
Python 3 (required when FILO_ENABLE_PYTHON=ON, which is the default)

Build And Run

Linux

cmake --preset linux-debug
cmake --build --preset linux-debug
ctest --preset linux-debug --output-on-failure
./build/Linux/linux-debug/filo

macOS

cmake --preset xcode-debug
cmake --build --preset xcode-debug
ctest --preset xcode-debug --output-on-failure
./build/Darwin/xcode-debug/Debug/filo

Enable Embedded `llama.cpp`

Linux:

cmake --preset linux-debug -DFILO_ENABLE_LLAMACPP=ON
cmake --build --preset linux-debug

Minimal local provider example:

{
  "default_provider": "local",
  "providers": {
    "local": {
      "api_type": "llamacpp",
      "model": "qwen2.5-coder-7b",
      "model_path": "/absolute/path/to/model.gguf",
      "context_size": 8192,
      "gpu_layers": 35
    }
  }
}

Runtime Modes

| Mode | Command |
| --- | --- |
| Interactive TUI | `filo` |
| Prompter (single-shot) | `filo --prompt "Summarize this diff"` |
| MCP over stdio | `filo --mcp --headless` or `filo --mcp stdio --headless` |
| MCP over TCP (HTTP `/mcp` endpoint) | `filo --mcp tcp --headless --port 8080` |
| API gateway only | `filo --api --headless --port 8080` |
| MCP + API gateway | `filo --mcp tcp --headless --api --port 8080` |

Daemon transport notes:

--mcp without a value defaults to stdio.
--mcp tcp starts the HTTP daemon and exposes MCP on /mcp.
--daemon is still accepted as a deprecated alias for --mcp tcp.
Set FILO_MCP_BEARER_TOKEN to require Authorization: Bearer <token> on /mcp.
For LAN worker deployments, use --host 0.0.0.0 only with a bearer token and network access controls.
The API gateway is off by default to keep daemon startup minimal and local-first.
--api starts the same HTTP daemon and exposes OpenAI/Anthropic-compatible proxy endpoints:
- GET /v1/models
- POST /v1/chat/completions (OpenAI-style)
- POST /v1/messages (Anthropic-style)
Combine --api with --mcp tcp if you want both /mcp and /v1/* on one port.
Model routing in API gateway endpoints:
- policy/<policy_name> routes through the named Filo smart-router policy.
- <provider>/<model> routes directly to a configured provider/model.
- <provider> routes to that provider's default configured model.
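
The three forms of the model field listed above could be told apart roughly as follows. This is a sketch of the naming convention only, not the gateway's actual parser; `Route` and `parse_model_field` are hypothetical names, and treating a bare `policy` (no slash) as a provider name is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    kind: str             # "policy" or "provider"
    name: str             # policy name or provider name
    model: Optional[str]  # None means "use the provider's default model"


def parse_model_field(model: str) -> Route:
    """Classify a model string as policy/<name>, <provider>/<model>, or <provider>."""
    head, sep, rest = model.partition("/")
    if not head:
        raise ValueError("model field must not be empty")
    if sep and head == "policy":
        if not rest:
            raise ValueError("policy/<policy_name> requires a policy name")
        return Route("policy", rest, None)
    # Everything after the first slash is kept as the model name.
    return Route("provider", head, rest or None)


via_policy = parse_model_field("policy/local-first")
direct = parse_model_field("grok/grok-code-fast-1")
provider_default = parse_model_field("ollama")
```

A client would place one of these strings in the `model` field of a request to POST /v1/chat/completions or POST /v1/messages.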

Useful CLI flags:

--mcp [stdio|tcp] run as MCP server (default transport: stdio)
--daemon deprecated alias for --mcp tcp
--api enable optional OpenAI/Anthropic-compatible proxy mode
--login <provider> authenticate and exit (openai uses ChatGPT OAuth)
--list-sessions list resumable sessions
--resume [id|index] resume a saved session
--prompter force non-interactive mode
--prompt, -p prompt text
--output-format, -o one of text, json, stream-json
--input-format one of text, stream-json
--include-partial-messages include deltas in stream-json
--continue continue the latest project-scoped session in prompter mode

Prompter examples:

# Direct prompt
filo --prompt "Review this patch for regressions"

# Stdin only
git diff | filo

# Prompt + stdin context
cat README.md | filo --prompt "Summarize the key setup steps"

# JSON output for automation
filo -p "Generate release notes from these commits" -o json

# Stream JSON events
filo -p "Explain the architecture" -o stream-json --include-partial-messages

# Continue latest project-scoped session
filo --continue -p "Now apply the follow-up refactor"
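
For automation, the same invocations can be driven from a script. The sketch below only assembles flags documented above and passes stdin through; `build_filo_command` and `run_filo` are hypothetical helper names, and the shape of the JSON that `-o json` prints is not assumed here.

```python
import subprocess
from typing import List, Optional


def build_filo_command(prompt: str, output_format: str = "text",
                       continue_session: bool = False,
                       include_partial: bool = False) -> List[str]:
    """Assemble a prompter-mode argv from flags listed in this README."""
    if output_format not in ("text", "json", "stream-json"):
        raise ValueError(f"unsupported output format: {output_format}")
    cmd = ["filo"]
    if continue_session:
        cmd.append("--continue")
    cmd += ["--prompt", prompt, "--output-format", output_format]
    if include_partial:
        cmd.append("--include-partial-messages")
    return cmd


def run_filo(cmd: List[str], stdin_text: Optional[str] = None) -> str:
    """Run filo, feed optional stdin context, and return raw stdout."""
    result = subprocess.run(cmd, input=stdin_text, capture_output=True,
                            text=True, check=True)
    return result.stdout


cmd = build_filo_command("Summarize the key setup steps", output_format="json")
# With filo on PATH, something like this would run it with README.md as context:
# print(run_filo(cmd, stdin_text=open("README.md").read()))
```

Passing the command as a list avoids shell quoting problems when the prompt contains quotes or spaces.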

Provider Setup

Filo supports both API-key and OAuth-based providers.

Typical API-key setup:

export XAI_API_KEY="..."
export OPENAI_API_KEY="..."
export ANTHROPIC_API_KEY="..."
export GEMINI_API_KEY="..."
export MISTRAL_API_KEY="..."
export KIMI_API_KEY="..."
export ZAI_API_KEY="..."
export DASHSCOPE_API_KEY="..."

For local Ollama, the default endpoint is:

http://localhost:11434

Configuration

Config files are layered in this order:

~/.config/filo/config.json
~/.config/filo/auth_defaults.json
~/.config/filo/settings.json
./.filo/config.json
./.filo/settings.json
~/.config/filo/profile_defaults.json
~/.config/filo/model_defaults.json

Use config.json for providers/router/subagents. Use settings.json for managed UI/workflow preferences.
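
As a rough mental model of layering, the sketch below merges JSON layers in sequence. It assumes that later layers override earlier ones and that nested objects merge key by key; Filo's exact merge semantics are not specified here, and `deep_merge` and `load_layers` are hypothetical names.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterable


def deep_merge(base: Dict[str, Any], overlay: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where overlay wins; nested dicts merge recursively."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_layers(paths: Iterable[str]) -> Dict[str, Any]:
    """Merge every existing JSON file in the given order; missing files are skipped."""
    config: Dict[str, Any] = {}
    for path in paths:
        p = Path(path).expanduser()
        if p.is_file():
            config = deep_merge(config, json.loads(p.read_text(encoding="utf-8")))
    return config


global_cfg = {"default_provider": "openai",
              "router": {"enabled": True, "default_policy": "cloud"}}
workspace_cfg = {"router": {"default_policy": "local-first"}}
merged = deep_merge(global_cfg, workspace_cfg)
```

In this model a workspace file only needs to state the keys it changes; everything else falls through from the global layer.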

Optional context compression can be enabled in config.json:

{
  "context_compression": "light"
}

light keeps exact small tool results, but stores oversized read_file and shell outputs in the agent history as compact summaries with size, digest, signal lines, and head/tail excerpts. The default is off.
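
To make the idea of a compact summary concrete, here is an illustrative sketch. The size threshold, the signal keywords, the digest choice, and the output layout are all assumptions chosen for the example, not Filo's actual format; `summarize_output` is a hypothetical name.

```python
import hashlib

# Hypothetical keywords used to pick "signal" lines out of a long output.
SIGNAL_WORDS = ("error", "warning", "failed")


def summarize_output(text: str, max_chars: int = 2000, excerpt_lines: int = 3) -> str:
    """Keep small outputs exact; replace oversized ones with a compact summary."""
    if len(text) <= max_chars:
        return text
    lines = text.splitlines()
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
    signals = [ln for ln in lines if any(w in ln.lower() for w in SIGNAL_WORDS)]
    parts = [f"[summary] size={len(text)} chars, lines={len(lines)}, sha256={digest}"]
    parts.append("signal lines:")
    parts.extend(signals[:10])              # cap the number of signal lines kept
    parts.append("head:")
    parts.extend(lines[:excerpt_lines])
    parts.append("tail:")
    parts.extend(lines[-excerpt_lines:])
    return "\n".join(parts)


small = "ok\n"
big = "\n".join([f"line {i}" for i in range(400)]
                + ["error: boom"]
                + [f"tail {i}" for i in range(400)])
summary = summarize_output(big)
```

The digest lets a later step detect whether the same output was seen before without storing the full text.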

Use "full" for the native context-cache mode: first or changed read_file calls stay exact, repeated unchanged reads collapse to small path-based cache stubs, instruction files are always preserved, and common shell outputs such as git status, git diff, build, and test logs are retained as signal-focused summaries. Diff summaries preserve structural diff lines verbatim.

Use "ultra" when token pressure matters more than keeping broad excerpts: it applies the same native cache and command-family summaries as full, but with tighter budgets for read_file and shell output. Instruction files are still preserved exactly.

In the interactive TUI, use /compression to pick a mode from a menu, or /compression ultra / /compress ultra to switch directly.

Profiles

Profiles let you keep multiple named configuration overlays and switch between them instantly. This is useful for context switching (for example work, oss, and local) without rewriting your main config each time.

What profiles support:

Define named overlays under profiles in config.json
Inherit from one or more parent profiles with extends_from
Override normal config fields (provider/model selection, mode, approval mode, router, MCP servers, subagents, UI defaults)
Switch in TUI with /profile <name> and apply changes live in the current session
Persist the active profile in ~/.config/filo/profile_defaults.json for future launches

Example profile config:

{
  "profiles": {
    "work": {
      "description": "Company defaults",
      "default_provider": "openai",
      "default_mode": "BUILD"
    },
    "oss": {
      "extends_from": ["work"],
      "default_provider": "grok",
      "default_approval_mode": "prompt"
    }
  }
}

Quick usage:

Define profiles under profiles in ~/.config/filo/config.json or ./.filo/config.json.
In TUI, run /profile (or /profile list) to see active and available profiles.
Switch profile with /profile <name> (for example /profile work).
Remove the persisted profile with /profile clear.

Commands:

/profile
/profile list
/profile work
/profile oss
/profile clear

Precedence note: FILO_PROFILE=<name> forces a profile for that process and overrides the persisted selection until unset.
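
The interaction between extends_from and the precedence note can be sketched as below. The sketch assumes parents are applied in listed order with the child's own fields winning, and uses a shallow merge; both are assumptions, and `resolve_profile` and `active_profile_name` are hypothetical names.

```python
from typing import Any, Dict, Mapping, Optional, Tuple


def resolve_profile(profiles: Dict[str, Dict[str, Any]], name: str,
                    _seen: Tuple[str, ...] = ()) -> Dict[str, Any]:
    """Flatten a profile by applying its parents first, then its own fields."""
    if name in _seen:
        raise ValueError(f"profile inheritance cycle at {name!r}")
    if name not in profiles:
        raise KeyError(f"unknown profile {name!r}")
    profile = profiles[name]
    resolved: Dict[str, Any] = {}
    for parent in profile.get("extends_from", []):
        resolved.update(resolve_profile(profiles, parent, _seen + (name,)))
    resolved.update({k: v for k, v in profile.items() if k != "extends_from"})
    return resolved


def active_profile_name(env: Mapping[str, str], persisted: Optional[str]) -> Optional[str]:
    """FILO_PROFILE, when set, takes precedence over the persisted selection."""
    return env.get("FILO_PROFILE") or persisted


profiles = {
    "work": {"description": "Company defaults",
             "default_provider": "openai", "default_mode": "BUILD"},
    "oss": {"extends_from": ["work"],
            "default_provider": "grok", "default_approval_mode": "prompt"},
}
oss = resolve_profile(profiles, "oss")
```

With the example profiles above, oss would inherit default_mode from work while replacing default_provider.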

Agent Skills

Filo supports Agent Skills-style instruction packages. Each skill is a directory with a SKILL.md file containing YAML frontmatter with at least name and description, followed by Markdown instructions.

The official project-local location is ./.filo/skills/<name>/SKILL.md. Filo-native skill roots have precedence over compatibility roots such as .claude/skills and .agents/skills; project-local .filo/skills has the highest precedence. Use .filo/skills for skills that are specific to Filo or this repository.

Instruction skills are disclosed to the model as a compact catalog and loaded on demand through the activate_skill tool. Bundled scripts/, references/, and assets/ files are listed during activation and can be read by calling activate_skill again with resource_path.

To use the public Agent Skills collection in one project:

mkdir -p .filo
git clone https://github.com/addyosmani/agent-skills.git .filo/agent-skills
ln -s agent-skills/skills .filo/skills

To install it globally instead:

mkdir -p ~/.config/filo
git clone https://github.com/addyosmani/agent-skills.git ~/.config/filo/agent-skills
ln -s agent-skills/skills ~/.config/filo/skills

If .filo/skills or ~/.config/filo/skills already exists, copy individual skill directories into it instead of replacing the directory.

After installation, start Filo from the project. The model will see skill names such as using-agent-skills, code-review-and-quality, and frontend-ui-engineering in its system catalog and should call activate_skill before using the matching workflow. When a skill references a bundled resource, for example scripts/idea-refine.sh, the model can load it with activate_skill and the same resource_path.

Skills without entry_point are also available as slash commands in the TUI: /<skill-name> [arguments]. The body may use $ARGUMENTS as a placeholder. Filo-specific Python tool skills continue to use entry_point and the existing get_schema() / execute() Python contract.
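
The SKILL.md layout and the $ARGUMENTS placeholder can be illustrated with a small sketch. It handles only flat `key: value` frontmatter lines, which is a simplification (real frontmatter is YAML and would need a YAML parser); `split_skill` and `render_slash_command` are hypothetical names, and the sample skill is invented for the example.

```python
from typing import Dict, Tuple


def split_skill(text: str) -> Tuple[Dict[str, str], str]:
    """Split SKILL.md text into (frontmatter fields, Markdown body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with frontmatter")
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        raise ValueError("unterminated frontmatter") from None
    meta: Dict[str, str] = {}
    for line in lines[1:end]:
        if ":" in line and not line.lstrip().startswith("#"):
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip().strip("\"'")
    for required in ("name", "description"):
        if not meta.get(required):
            raise ValueError(f"frontmatter is missing {required!r}")
    return meta, "\n".join(lines[end + 1:]).strip()


def render_slash_command(body: str, arguments: str) -> str:
    """Substitute the $ARGUMENTS placeholder with the slash-command arguments."""
    return body.replace("$ARGUMENTS", arguments)


SAMPLE = """---
name: code-review
description: Review a diff for regressions
---
Review the following target: $ARGUMENTS
"""
meta, body = split_skill(SAMPLE)
rendered = render_slash_command(body, "src/main.cpp")
```

Under these assumptions, typing /code-review src/main.cpp in the TUI would correspond to the rendered body above.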

Smart router with local-first policy example

{
  "router": {
    "enabled": true,
    "default_policy": "local-first",
    "guardrails": {
      "max_session_cost_usd": 5.0,
      "min_requests_remaining_ratio": 0.20,
      "min_tokens_remaining_ratio": 0.20,
      "min_window_remaining_ratio": 0.20,
      "enforce_on_local": false
    },
    "policies": {
      "local-first": {
        "strategy": "fallback",
        "defaults": [
          { "provider": "local", "model": "qwen2.5-coder-7b", "retries": 0 },
          { "provider": "ollama", "model": "llama3", "retries": 0 },
          { "provider": "grok", "model": "grok-code-fast-1", "retries": 1 }
        ],
        "rules": [
          {
            "name": "deep-reasoning",
            "priority": 10,
            "strategy": "fallback",
            "when": {
              "min_prompt_chars": 260,
              "any_keywords": ["debug", "root cause", "architecture", "migration"]
            },
            "candidates": [
              { "provider": "claude", "model": "claude-sonnet-4-6", "retries": 1 },
              { "provider": "grok-reasoning", "model": "grok-4.20-reasoning", "retries": 1 }
            ]
          }
        ]
      }
    }
  }
}
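
One way to read the policy above is sketched below: pick the candidates of the first matching rule, otherwise the defaults, then walk the chain with per-candidate retries. The sketch assumes all conditions in `when` must hold, that `retries` means extra attempts after the first, and it checks rules in listed order while ignoring `priority`. None of this is confirmed as Filo's behaviour, and the function names and `fake_send` are hypothetical.

```python
from typing import Callable, Dict, List


def rule_matches(when: Dict, prompt: str) -> bool:
    """Assumption: every condition present in `when` must hold."""
    if len(prompt) < when.get("min_prompt_chars", 0):
        return False
    keywords = when.get("any_keywords", [])
    if keywords and not any(k.lower() in prompt.lower() for k in keywords):
        return False
    return True


def select_candidates(policy: Dict, prompt: str) -> List[Dict]:
    """Return the first matching rule's candidates, else the policy defaults."""
    for rule in policy.get("rules", []):  # listed order; `priority` is ignored here
        if rule_matches(rule.get("when", {}), prompt):
            return rule["candidates"]
    return policy["defaults"]


def call_with_fallback(candidates: List[Dict], prompt: str,
                       send: Callable[[str, str, str], str]) -> str:
    """Try each candidate in order, allowing 1 + retries attempts per candidate."""
    errors = []
    for candidate in candidates:
        for _ in range(1 + candidate.get("retries", 0)):
            try:
                return send(candidate["provider"], candidate["model"], prompt)
            except Exception as exc:
                errors.append(f"{candidate['provider']}: {exc}")
    raise RuntimeError("all candidates failed: " + "; ".join(errors))


policy = {
    "defaults": [
        {"provider": "local", "model": "qwen2.5-coder-7b", "retries": 0},
        {"provider": "ollama", "model": "llama3", "retries": 0},
        {"provider": "grok", "model": "grok-code-fast-1", "retries": 1},
    ],
    "rules": [{
        "name": "deep-reasoning",
        "when": {"min_prompt_chars": 260, "any_keywords": ["debug", "root cause"]},
        "candidates": [{"provider": "claude", "model": "claude-sonnet-4-6", "retries": 1}],
    }],
}

calls: List[str] = []


def fake_send(provider: str, model: str, prompt: str) -> str:
    """Test double: the embedded local backend fails, everything else succeeds."""
    calls.append(provider)
    if provider == "local":
        raise ConnectionError("model not loaded")
    return f"{provider}/{model} ok"


reply = call_with_fallback(select_candidates(policy, "rename this variable"),
                           "rename this variable", fake_send)
```

In this run the short prompt matches no rule, the local candidate fails once, and the request falls through to the next entry in the defaults chain.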

Architecture Snapshot

src/core/llm/ provider abstraction, protocols, routing
src/core/tools/ tool execution (shell/files/patch/search/python)
src/core/mcp/ MCP dispatcher and client/session handling
src/tui/ terminal UI components
src/exec/ stdio MCP server, daemon, and prompter entrypoints
src/core/auth/ API key and OAuth flows

License

Apache License 2.0. See LICENSE.
