Skip to content

raindragon14/phi-gateway

Repository files navigation

PhiGateway

Self-hosted AI gateway. LLM proxy, tool registry, RAG knowledge base, and agent memory behind one OpenAI-compatible endpoint.

CI Python 3.12+ License: MIT

Install

pip install phi-gateway

Quick Start

# Start the gateway
uvicorn phi_gateway.main:app

# Create an API key
curl -sX POST http://localhost:8000/v1/keys \
  -H "Content-Type: application/json" \
  -d '{"name":"my-agent","tier":"free"}'

# Chat through the gateway
curl -s http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer phi-sk-..." \
  -H "Content-Type: application/json" \
  -d '{"model":"groq/llama-3.3-70b","messages":[{"role":"user","content":"Hello"}]}'

With Docker:

git clone https://github.com/raindragon14/phi-gateway
cd phi-gateway
cp .env.example .env    # add your provider keys
docker compose up -d

What It Does

LLM Proxy — Route to OpenAI, Anthropic, Groq, or OpenRouter. Switch providers or use fallback chains without changing agent code. Streaming, cost tracking, and logging included.

Tool Registry — Register tools with JSON Schema. Agents discover and call them via REST or MCP (JSON-RPC 2.0). MCP-native.

Knowledge Base — Chunk, embed, and search documents. Cosine similarity with keyword fallback. Everything in SQLite. No external vector database.

Agent Memory — Store conversations, paginate history, auto-trim context. Returns X-Context-Truncated header when messages are trimmed.

Usage

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="phi-sk-...")
response = client.chat.completions.create(
    model="groq/llama-3.3-70b",
    messages=[{"role": "user", "content": "Hello"}],
)

Full API reference at /docs when the server is running.

Architecture

Caddy (reverse proxy, auto TLS)
  └── FastAPI (uvicorn)
        ├── /v1/chat/completions  →  LLM proxy  →  provider APIs
        ├── /v1/tools             →  tool registry
        ├── /v1/kb                →  RAG (SQLite + cosine similarity)
        ├── /v1/memory            →  agent memory
        ├── /mcp                  →  JSON-RPC 2.0 (MCP)
        └── /dashboard            →  HTMX admin UI
              └── SQLite (single file)

Idle RAM: ~250 MB. Python 3.12+. MIT license.

Development

pip install -e ".[dev]"
pytest -v
ruff check src/ tests/

Code style: Google docstrings, ruff format, pytest. See pyproject.toml.

License

MIT

About

Self-hosted AI gateway. LLM proxy, MCP tool registry, RAG knowledge base, agent memory.

Topics

Resources

License

Contributing

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages