Enterprise LLM Gateway

A provider-agnostic LLM routing layer built with Python, FastAPI, Streamlit, and SQLite. The project is scoped to match the work described on the accompanying resume: unified access to 5 LLM platforms, task-aware routing across 8 model configurations, 3-tier fallback handling, and cost/latency observability.

What It Does

Exposes one SDK and one FastAPI endpoint for OpenAI-, Anthropic-, Google-, Cohere-, and Bedrock-style providers.
Classifies requests into task types such as reasoning, coding, vision, summarization, chat, and embedding.
Routes by capability, estimated cost, health, and latency policy.
Applies a 3-tier fallback chain: preferred model, same-capability alternative, then low-cost recovery model.
Records every request in a SQLite ledger for success rate, latency, tokens, cost, and fallback analysis.
Ships with a Streamlit dashboard for live gateway metrics.
Runs locally without API keys using deterministic mock providers.
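
The task classification step listed above could be approximated with simple keyword rules. The sketch below is illustrative only: the keyword table, the `classify` function, and the `has_image` flag are assumptions, not the classifier shipped in `classifier.py`.

```python
# Hypothetical keyword rules; the project's real classifier may use other signals.
TASK_KEYWORDS = {
    "coding": ("code", "function", "bug", "refactor"),
    "summarization": ("summarize", "summary", "tl;dr"),
    "reasoning": ("prove", "step by step", "analyze"),
    "embedding": ("embed", "vector"),
}


def classify(prompt: str, has_image: bool = False) -> str:
    """Return a task type for the prompt, defaulting to "chat"."""
    if has_image:
        # Any request carrying an image is treated as a vision task.
        return "vision"
    text = prompt.lower()
    # The first task whose keywords appear in the prompt wins.
    for task, keywords in TASK_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return task
    return "chat"
```

A request can also bypass classification by sending an explicit `task_type`, as the example request in the Quickstart does.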

Quickstart

python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
pytest
uvicorn llm_gateway.api.app:app --reload

In another terminal:

streamlit run dashboard/app.py

Example request:

curl -X POST http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Summarize the benefits of an LLM gateway","task_type":"summarization"}'
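
The same request can be sent from Python with only the standard library. This is a minimal sketch: `build_request` and `send` are hypothetical helper names, and it assumes the gateway answers with a JSON body, which the README does not specify.

```python
import json
import urllib.request

GATEWAY_URL = "http://127.0.0.1:8000/v1/chat/completions"


def build_request(prompt: str, task_type: str) -> urllib.request.Request:
    """Build the POST request used in the curl example, without sending it."""
    body = json.dumps({"prompt": prompt, "task_type": task_type}).encode("utf-8")
    return urllib.request.Request(
        GATEWAY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and parse the response, assuming it is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the gateway running, `send(build_request("Summarize the benefits of an LLM gateway", "summarization"))` performs the same call as the curl command.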

Architecture

Client / SDK
  -> FastAPI gateway
  -> Request classifier
  -> Routing policy
  -> Provider adapter
  -> Circuit breaker + fallback chain
  -> SQLite cost/latency ledger
  -> Streamlit dashboard
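
The circuit breaker stage in the pipeline above is not described further in this README. One common shape for it is a failure counter with a cooldown, sketched below; the class name, thresholds, and method names are assumptions chosen for illustration.

```python
import time


class CircuitBreaker:
    """Opens after repeated failures and allows a trial call after a cooldown."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after  # cooldown in seconds
        self._clock = clock             # injectable so tests can control time
        self._failures = 0
        self._opened_at = None          # None means the circuit is closed

    def allow(self) -> bool:
        """Return True if a call to the provider may be attempted."""
        if self._opened_at is None:
            return True
        # Half-open: permit a trial call once the cooldown has elapsed.
        return self._clock() - self._opened_at >= self.reset_after

    def record_success(self) -> None:
        """Close the circuit and reset the failure count."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        """Count a failure and open the circuit at the threshold."""
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
```

A router would typically keep one breaker per provider and skip any provider whose `allow()` returns False, which is one way to feed provider health into routing decisions.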

Resume Alignment

This project demonstrates an enterprise LLM abstraction layer:

5 LLM platforms: OpenAI, Anthropic, Google, Cohere, Bedrock.
8 model configurations: GPT-4o, GPT-4o Mini, Claude 3.5 Sonnet, Claude 3 Haiku, Gemini 1.5 Pro, Command R+, Bedrock Claude, Bedrock Titan.
Task-aware routing: capability, latency, health, and cost policies.
3-tier fallback: primary model, same-capability alternative (another provider or model), then a final low-cost recovery model.
Cost optimization: lower-cost models are preferred for simple tasks while premium models are reserved for reasoning, coding, and vision.
Observability: SQLite request ledger plus dashboard KPIs.
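
The 3-tier fallback can be pictured as walking an ordered chain until one provider succeeds. The sketch below uses mock callables in place of real adapters; `ProviderError`, `complete_with_fallback`, and the result fields are hypothetical names, not the API of `gateway.py`.

```python
from typing import Callable, Sequence


class ProviderError(Exception):
    """Raised by a (hypothetical) provider adapter when a call fails."""


Provider = Callable[[str], str]


def complete_with_fallback(prompt: str, chain: Sequence[tuple[str, Provider]]) -> dict:
    """Try each (name, provider) pair in order and return the first success."""
    errors = []
    for tier, (name, call) in enumerate(chain, start=1):
        try:
            text = call(prompt)
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")
            continue  # fall through to the next tier
        return {"model": name, "tier": tier, "text": text, "fallback_used": tier > 1}
    raise ProviderError("all tiers failed: " + "; ".join(errors))


# Mock providers standing in for real adapters.
def failing(prompt: str) -> str:
    raise ProviderError("simulated outage")


def echo(prompt: str) -> str:
    return f"echo: {prompt}"


chain = [("primary", failing), ("alternative", failing), ("recovery", echo)]
result = complete_with_fallback("hello", chain)
```

Here both upper tiers fail, so `result` comes from the recovery tier. Recording the tier on every response is what makes fallback analysis possible later.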

Configuration

Copy .env.example to .env if you want to connect real providers. Without keys, the gateway uses mock adapters so demos and tests remain reproducible.

cp .env.example .env
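
One way the mock-versus-real decision could work is to check whether each provider's credential is present in the environment. The variable names below are assumptions based on common conventions; check .env.example for the names the project actually reads. The sketch also assumes the .env file has already been loaded into the process environment by some other means.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable names, not taken from this project's .env.example.
PROVIDER_KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
    "cohere": "COHERE_API_KEY",
    "bedrock": "AWS_ACCESS_KEY_ID",
}


def adapter_modes(env: Optional[Mapping[str, str]] = None) -> dict:
    """Map each provider to "real" if its key is set, otherwise "mock"."""
    env = os.environ if env is None else env
    return {
        provider: "real" if env.get(var, "").strip() else "mock"
        for provider, var in PROVIDER_KEY_VARS.items()
    }
```

Calling `adapter_modes()` with no keys set reports every provider as "mock", which matches the keyless local mode described above.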

Project Layout

src/llm_gateway/
  adapters/       Provider adapter contracts and implementations
  api/            FastAPI application
  classifier.py   Task classifier
  config.py       Model catalog and routing policy
  gateway.py      Orchestration, fallback, and ledger recording
  ledger.py       SQLite telemetry store
  models.py       Shared Pydantic models
  router.py       Capability/cost/latency router
dashboard/app.py  Streamlit dashboard
tests/            Unit and API tests
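
To give a feel for what a SQLite telemetry store like ledger.py might hold, here is a minimal sketch using the standard `sqlite3` module. The table name, column names, and helper functions are assumptions made for illustration; the project's actual schema may differ.

```python
import sqlite3

# Hypothetical schema covering the metrics the README lists.
SCHEMA = """
CREATE TABLE IF NOT EXISTS requests (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    model TEXT NOT NULL,
    task_type TEXT NOT NULL,
    success INTEGER NOT NULL,
    latency_ms REAL NOT NULL,
    total_tokens INTEGER NOT NULL,
    cost_usd REAL NOT NULL,
    fallback_tier INTEGER NOT NULL
)
"""


def open_ledger(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the ledger database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record(conn, *, model, task_type, success, latency_ms,
           total_tokens, cost_usd, fallback_tier) -> None:
    """Append one request to the ledger."""
    conn.execute(
        "INSERT INTO requests (model, task_type, success, latency_ms,"
        " total_tokens, cost_usd, fallback_tier) VALUES (?, ?, ?, ?, ?, ?, ?)",
        (model, task_type, int(success), latency_ms,
         total_tokens, cost_usd, fallback_tier),
    )
    conn.commit()


def summary(conn) -> dict:
    """Aggregate the KPIs a dashboard might show; averages are None when empty."""
    total, success_rate, avg_latency, total_cost, fallback_rate = conn.execute(
        "SELECT COUNT(*), AVG(success), AVG(latency_ms), SUM(cost_usd),"
        " AVG(fallback_tier > 1) FROM requests"
    ).fetchone()
    return {
        "requests": total,
        "success_rate": success_rate,
        "avg_latency_ms": avg_latency,
        "total_cost_usd": total_cost,
        "fallback_rate": fallback_rate,
    }
```

Keeping one row per request, rather than pre-aggregated counters, lets the dashboard compute any breakdown (per model, per task type, per tier) with a plain SQL query.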

Four Implementation Phases

1. Foundation and provider abstraction.
2. Routing, fallback, and reliability controls.
3. API, dashboard, and telemetry.
4. Packaging, tests, docs, and resume polish.
