The fastest way to add SSO, audit logs, guardrails, and routing to any AI app.
OpenAI-compatible. Every feature free. MIT, forever.
Docs · Quick Start · Drop-in Replacement · Providers · ADRs
Every AI gateway on the market takes the same bet: lock the good stuff behind an enterprise license.
| Gateway | Open source | SSO | Audit logs | Guardrails | Advanced routing | License |
|---|---|---|---|---|---|---|
| LiteLLM | MIT | β | β | basic | basic | Commercial for the rest |
| Bifrost | Apache 2.0 | β | β | β | basic | Enterprise for the rest |
| Portkey | AGPL | β | β | β | β | Source-available |
| OpenGateway | MIT | β | β | β | β | MIT forever |
LiteLLM charges for SSO and audit logs. Bifrost gates guardrails and clustering behind enterprise. OpenGateway gives you everything in the OSS build, funded by managed hosting and support, the same model Red Hat used with Backstage.
"Bifrost is Go. OpenGateway is Python/Mojo, easier to customise." (internal positioning line)
And unlike every other Python AI gateway, OpenGateway ships a second server on Mojo + flare for when you want a single static binary at the edge.
$ uv pip install -e ".[dev]"
$ cp .env.example .env && $EDITOR .env # set ROOT_KEY and OPENAI_API_KEY
$ opengateway
INFO: Uvicorn running on http://0.0.0.0:8080

$ curl -fsSL https://pixi.sh/install.sh | sh # one-time
$ pixi install -e mojo
$ pixi run -e mojo mojo run opengateway/mojo/main.mojo
opengateway (mojo): listening on 0.0.0.0:8080 with 4 workers

Both servers implement the same POST /v1/chat/completions endpoint and share the same provider adapters. Switch via deployment shape, not via code.
$ curl -s http://localhost:8080/health
{"status":"ok"}
$ curl -s -X POST http://localhost:8080/v1/chat/completions \
-H "Authorization: Bearer $ROOT_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini",
"messages": [{"role": "user", "content": "Say hi in five languages"}]
}' | jq '.choices[0].message.content'
"Hello, Hola, Bonjour, Hallo, こんにちは"

Point any OpenAI-compatible client at OpenGateway with one line. Same API, same SDK, no code changes.
# Python (openai SDK)
- client = OpenAI(api_key="sk-...")
+ client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-og-...")
# TypeScript (openai SDK)
- const client = new OpenAI({ apiKey: "sk-..." })
+ const client = new OpenAI({ baseURL: "http://localhost:8080/v1", apiKey: "sk-og-..." })
# curl
- curl https://api.openai.com/v1/chat/completions ...
+ curl http://localhost:8080/v1/chat/completions ...

Every OpenAI SDK works without modification. Anthropic, Google GenAI, LiteLLM, and LangChain SDKs work the same way through the provider router.
- OpenAI-compatible endpoint. Same request shape, same response shape, same error format. Drop-in for every OpenAI client SDK.
- Multi-provider support. OpenAI, Anthropic, AWS Bedrock, Google Vertex, Azure, and more, behind a single API.
- Automatic fallbacks. Seamless failover between providers and models with zero downtime.
- Load balancing. Intelligent request distribution across multiple API keys and providers.
- Streaming (SSE). Server-sent events work out of the box, same wire format as OpenAI.
- Virtual keys with model allow-lists. Per-key permissions restrict which models each caller can hit.
- Per-key budgets and rate limits. Token-bucket rate limiting and USD budget caps per virtual key.
- Audit logs. Structured events for every request, queryable by key, model, and time range.
- SSO / SAML. OIDC and SAML authentication for admin interfaces and key management.
- RBAC. Team, organisation, and admin role hierarchy.
- IP ACLs. Restrict access to specific source IPs or CIDR ranges.
- Custom branding. Logo, colours, and per-tenant theming on the admin UI.
- Native Prometheus metrics. Request counts, latency histograms, error rates, token usage.
- Distributed tracing. OpenTelemetry-compatible export to Jaeger, Tempo, or Honeycomb.
- Structured request logging. JSON logs with key, model, latency, tokens, and cost.
- Zero-config startup. Single command, single binary, no database required to start.
- Single static binary (Mojo path) for serverless, edge, and Lambda deployments.
- Conventional commits drive automated versioning and changelog generation via release-please.
- Standard formats everywhere. OpenAPI for the spec, JSON for the config, dotenv for the env.
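The per-key budgets and rate limits listed above can be pictured with a small sketch. This is a minimal, in-memory illustration of token-bucket limiting combined with a USD cap, not the gateway's actual code: `TokenBucket`, `KeyLimits`, and the returned status strings are hypothetical names chosen for the example.

```python
import time
from typing import Callable


class TokenBucket:
    """Hypothetical limiter: allows bursts up to `capacity`, refilled at `refill_per_sec`."""

    def __init__(self, capacity: float, refill_per_sec: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        # Refill based on elapsed time, never exceeding capacity.
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class KeyLimits:
    """Hypothetical per-virtual-key state: one bucket plus a running USD total."""

    def __init__(self, bucket: TokenBucket, budget_usd: float) -> None:
        self.bucket = bucket
        self.budget_usd = budget_usd
        self.spent_usd = 0.0

    def admit(self) -> str:
        # Budget is checked first so an exhausted key does not consume tokens.
        if self.spent_usd >= self.budget_usd:
            return "budget_exceeded"
        if not self.bucket.allow():
            return "rate_limited"
        return "ok"

    def record(self, cost_usd: float) -> None:
        # Called after the upstream response, once the real cost is known.
        self.spent_usd += cost_usd
```

The clock is injectable so the refill logic can be tested without sleeping. A Redis-backed version would keep the same shape but store `tokens` and `updated` per key in Redis.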
| Provider | Status | Routing prefixes | Key env var |
|---|---|---|---|
| OpenAI | shipped | gpt-*, openai/* | OPENAI_API_KEY |
| Anthropic | routed, adapter pending | claude-*, anthropic/* | ANTHROPIC_API_KEY |
| AWS Bedrock | routed, adapter pending | bedrock/*, amazon.* | (AWS credentials) |
| Azure OpenAI | planned | azure/* | AZURE_OPENAI_API_KEY |
| Google Vertex | planned | vertex/* | (GCP credentials) |
| vLLM / local | planned | local/* | (none) |
Adding a provider takes three steps: implement BaseProvider, add a routing rule, configure the key.
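As a rough illustration of those three steps, here is a self-contained sketch. The `BaseProvider` below is a stand-in written for this example, so the real base class in opengateway/providers/base.py may expose a different interface. `EchoProvider`, `ROUTES`, `resolve`, the `echo/*` prefix, and `ECHO_API_KEY` are all hypothetical.

```python
import os
from abc import ABC, abstractmethod
from fnmatch import fnmatchcase
from typing import List, Mapping, Optional, Tuple, Type


class BaseProvider(ABC):
    """Stand-in for the real base class; the actual interface may differ."""

    def __init__(self, api_key: Optional[str]) -> None:
        self.api_key = api_key

    @abstractmethod
    def chat(self, payload: dict) -> dict:
        """Take an OpenAI-shaped request, return an OpenAI-shaped response."""


class EchoProvider(BaseProvider):
    """Step 1: implement the adapter. This one calls no upstream, it echoes."""

    def chat(self, payload: dict) -> dict:
        last = payload["messages"][-1]["content"]
        return {"choices": [{"message": {"role": "assistant", "content": last}}]}


# Step 2: add a routing rule (glob pattern, adapter class, key env var).
ROUTES: List[Tuple[str, Type[BaseProvider], Optional[str]]] = [
    ("echo/*", EchoProvider, "ECHO_API_KEY"),
]


def resolve(model: str, env: Mapping[str, str] = os.environ) -> BaseProvider:
    """Return an adapter for the first pattern that matches the model name."""
    for pattern, provider_cls, key_var in ROUTES:
        if fnmatchcase(model, pattern):
            # Step 3: the key comes from configuration, not from code.
            return provider_cls(env.get(key_var) if key_var else None)
    raise LookupError(f"no provider routes model {model!r}")
```

Glob patterns mirror the prefix style in the table (gpt-*, openai/*), and first-match-wins keeps the rule order explicit.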
OpenGateway is a drop-in replacement for any OpenAI-compatible client. Tested with:
| SDK | Status | Notes |
|---|---|---|
| openai-python | works | Set base_url to the gateway URL |
| openai-node | works | Set baseURL to the gateway URL |
| anthropic-sdk-python | works | Via provider router |
| Google GenAI | works | Via provider router |
| LiteLLM SDK | works | Nested routing for migration |
| LangChain | works | Use OpenAI-compatible endpoint |
Everything is environment variables or .env:
| Variable | Default | Description |
|---|---|---|
| ROOT_KEY | sk-root-change-me | Admin key with full access. Replace before deploying. |
| OPENAI_API_KEY | (unset) | Upstream key for gpt-* and openai/*. |
| ANTHROPIC_API_KEY | (unset) | Upstream key for claude-* and anthropic/*. |
| DATABASE_URL | postgresql://... | Tenants, keys, audit logs. |
| REDIS_URL | redis://... | Rate limits and short-lived caches. |
| HOST | 0.0.0.0 | Bind address. |
| PORT | 8080 | Bind port. |
| WORKERS | 1 | uvicorn workers (Python only). |
| DEBUG | false | Reload on file changes. |
| LOG_LEVEL | INFO | DEBUG / INFO / WARNING / ERROR. |
| REQUIRE_AUTH | true | Reject requests without a valid Authorization header. |
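To make the table concrete, here is a sketch of how these variables could be read with their documented defaults. The project itself loads settings via pydantic-settings in config.py. This version uses only the standard library so it runs anywhere, and `Settings`, `load_settings`, and the accepted boolean spellings are assumptions for illustration. DATABASE_URL and REDIS_URL are left out because their defaults are elided above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(raw: str) -> bool:
    # Assumed spellings; the real parser may accept a different set.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    root_key: str
    openai_api_key: Optional[str]
    anthropic_api_key: Optional[str]
    host: str
    port: int
    workers: int
    debug: bool
    log_level: str
    require_auth: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read each variable from `env`, falling back to the documented default."""
    return Settings(
        root_key=env.get("ROOT_KEY", "sk-root-change-me"),
        openai_api_key=env.get("OPENAI_API_KEY"),
        anthropic_api_key=env.get("ANTHROPIC_API_KEY"),
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8080")),
        workers=int(env.get("WORKERS", "1")),
        debug=_as_bool(env.get("DEBUG", "false")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        require_auth=_as_bool(env.get("REQUIRE_AUTH", "true")),
    )
```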
API keys follow sk-og-{token}, the same prefix shape as OpenAI, branded to OpenGateway.
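A key in that shape can be produced in a couple of lines. This is a sketch only: the token length and alphabet are assumptions (ADR-001 is the authority on the real format), and `generate_key` and `looks_like_gateway_key` are hypothetical helper names.

```python
import secrets

KEY_PREFIX = "sk-og-"


def generate_key(nbytes: int = 32) -> str:
    """Return a new key: the sk-og- prefix plus a URL-safe random token."""
    return KEY_PREFIX + secrets.token_urlsafe(nbytes)


def looks_like_gateway_key(value: str) -> bool:
    """Cheap shape check before any lookup; it does not prove the key is valid."""
    return value.startswith(KEY_PREFIX) and len(value) > len(KEY_PREFIX)
```

The shape check is useful for rejecting an upstream OpenAI key pasted into the wrong field before it reaches the key store.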
OpenAI-compatible client
         │
         ▼
┌──────────────────┐
│ HTTP API surface │ ← FastAPI (Python, default)
│                  │   OR flare (Mojo, opt-in)
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ PythonObject     │ ← only in the Mojo path
│ bridge           │
└────────┬─────────┘
         │
        ═╪═ single sync function call, returns envelope
        ═╪═
         │
         ▼
┌──────────────────┐
│ Python business  │ ← auth, validation, provider dispatch
│ logic            │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Provider adapters│ ← opengateway/providers/{openai,anthropic,bedrock,...}.py
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Upstream LLM API │ ← OpenAI / Anthropic / Bedrock / ...
└──────────────────┘
FastAPI is the default. Python ecosystem, 700+ contributors, mature, boring.
Mojo on flare is for when you need a single static binary at the edge: sub-50 ms cold start, ~30 MB image, no pip install in your container.
They share the same provider adapters, the same auth, the same config. The Mojo to PythonObject boundary is one synchronous function call (handle_chat) that returns an envelope dict so the Mojo handler never catches Python exceptions.
Full layout in docs/architecture.md and the rationale in ADR-002.
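The envelope idea is easiest to see in code. The sketch below shows the pattern on the Python side only. The name `handle_chat` comes from the description above, but its signature, the `dispatch` parameter, and the envelope keys (`ok`, `status`, `body`, `error`) are assumptions made for this example.

```python
from typing import Callable


def handle_chat(body: dict, dispatch: Callable[[dict], dict]) -> dict:
    """Never raises: every outcome, including failures, becomes a plain dict."""
    try:
        if "model" not in body or "messages" not in body:
            return {
                "ok": False,
                "status": 400,
                "error": {"type": "invalid_request_error",
                          "message": "model and messages are required"},
            }
        return {"ok": True, "status": 200, "body": dispatch(body)}
    except Exception as exc:
        # Catch-all on purpose: the caller only inspects the envelope,
        # so no Python exception crosses the language boundary.
        return {
            "ok": False,
            "status": 502,
            "error": {"type": "upstream_error", "message": str(exc)},
        }
```

With this shape, the Mojo handler only needs to read `status` and serialise either `body` or `error`, which keeps the boundary to one synchronous call.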
opengateway/
├── opengateway/
│   ├── main.py                 # FastAPI server (default)
│   ├── auth.py                 # Virtual key + root key auth
│   ├── config.py               # Settings via pydantic-settings
│   ├── keys.py                 # API key generator (sk-og-{token})
│   ├── router.py               # Model-to-provider routing
│   ├── providers/              # Provider adapters
│   │   ├── base.py
│   │   └── openai.py
│   ├── mojo/                   # Mojo server on flare
│   │   ├── main.mojo
│   │   ├── router.mojo
│   │   └── bridge.mojo
│   └── mojo_bridge/            # Python side of the Mojo bridge
│       ├── auth.py
│       └── chat.py
├── tests/
│   ├── test_proxy.py           # FastAPI server tests
│   └── test_mojo_bridge.py     # Bridge tests
├── docs/                       # Documentation
│   ├── architecture.md
│   ├── release-process.md
│   └── assets/                 # Banner image and other assets
├── adr/                        # Architecture Decision Records
├── pyproject.toml              # Python package config
├── pixi.toml                   # Mojo environment config
├── Dockerfile                  # Container build
└── docker-compose.yml          # Local dev stack (Postgres + Redis + gateway)
A few principles we hold ourselves to. They're non-negotiable.
- Every feature ships in the OSS build. SSO, audit logs, guardrails, advanced routing. None of them are paywalled. The code is the product.
- OpenAI-compatible is the API contract. Not "compatible-ish". Not "subset". The same request shape, the same response shape, the same error format.
- Boring tech where it matters. FastAPI, Postgres, Redis, Pydantic. We don't get bonus points for picking weird.
- New tech where it pays off. Mojo for the binary-deploy path. Conventional commits for release automation. Standard formats everywhere else.
- Tests in CI, not in promises. 23 Python tests today, more every week. No // TODO: test this later in main.
- The README is a contract. If it doesn't run as written, the docs are wrong, not the code.
Shipped today:
- OpenAI-compatible /v1/chat/completions
- Virtual keys with model allow-lists
- Per-key budgets
- OpenAI provider adapter
- Dual server: FastAPI + Mojo on flare
- release-please to PyPI publishing
- Drop-in replacement for openai-python, openai-node, and Anthropic SDKs
Next up:
- Anthropic provider adapter plus Bedrock pass-through
- PostgreSQL-backed virtual keys (currently in-memory)
- Streaming SSE in the Mojo server
- Guardrails: PII detection, prompt injection, content moderation
- Audit log: structured events, queryable
- SSO: OIDC + SAML
- Rate limits: token-bucket per key, Redis-backed
- Adaptive routing: score-based provider selection
- Native Prometheus metrics endpoint
Long term:
- Managed SaaS, the Phase 3 from the strategy note
- Enterprise support contracts, the Phase 4 from the strategy note
| Doc | What it covers |
|---|---|
| docs/architecture.md | Runtime layout, dual-server design, Mojo to Python boundary |
| docs/release-process.md | release-please flow, conventional commits, PyPI trusted publisher |
| docs/mojo-python-ai-gateway.md | Original design sketch (historical rationale) |
| adr/001-api-key-format.md | The sk-og-{token} key format |
| adr/002-mojo-api-surface.md | Why Mojo for the API surface |
| CONTRIBUTING.md | Dev setup, commit conventions, PR process |
We accept contributions under DCO. Commits follow Conventional Commits. release-please uses your commit messages to drive the version bump and the changelog, so prefix your commits with feat:, fix:, docs:, etc.
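To show why the prefix matters, here is a small sketch of the usual Conventional Commits to SemVer mapping. It illustrates the convention only and is not release-please's implementation: the actual bump depends on the release-please configuration, particularly before 1.0, and `bump_for` is a hypothetical helper.

```python
import re
from typing import Optional

# type, optional (scope), optional "!" for breaking changes, then ": subject"
_HEADER = re.compile(r"^(?P<type>[a-z]+)(\([^)]+\))?(?P<breaking>!)?: .+")


def bump_for(message: str) -> Optional[str]:
    """Map one commit message to "major", "minor", "patch", or None."""
    first_line = message.splitlines()[0] if message else ""
    match = _HEADER.match(first_line)
    if match is None:
        return None  # not a conventional commit, so it drives no release
    if match.group("breaking") or "BREAKING CHANGE:" in message:
        return "major"
    if match.group("type") == "feat":
        return "minor"
    if match.group("type") == "fix":
        return "patch"
    return None  # docs:, chore:, test: and similar do not bump on their own
```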
# Set up
git clone https://github.com/echohello-dev/opengateway.git
cd opengateway
uv pip install -e ".[dev]"
# Run everything
make test # pytest
make lint # ruff + mypy
make format # ruff format
make mojo-test # Mojo router tests (requires pixi)
# Open a PR with a clear title and description

Alpha (0.x). The core proxy works end-to-end with the OpenAI provider. Anthropic and Bedrock adapters are routed but not yet implemented. Virtual keys and budgets are scaffolded. DB-backed persistence is the next milestone. Expect breaking changes before 1.0.
Watch releases for tagged versions, and check the open issues for the current roadmap.
MIT. The whole thing, forever. No telemetry, no callbacks, no surprise license change in 1.0.
Made with 🐍 Python and 🔥 Mojo. Hosted on coffee.