Token Guardian is a CLI-first guardrail for checking prompt size, context pressure, and estimated cost before you call an LLM.
It helps developers answer three questions quickly:
- how many tokens this prompt will probably use
- how much this request may cost
- whether this prompt is risky for the selected context window
The current CLI is optimized for interactive terminal use, and the guided experience is presented in Brazilian Portuguese (pt-BR).
- catch oversized prompts before they hit the model
- estimate cost before expensive runs
- compare supported models using the same prompt
- clean duplicated or bloated prompt text
- keep simple local observability with SQLite metrics
- start from an interactive terminal menu instead of memorizing commands
pip install token-guardian
token-guardian

Running token-guardian without arguments opens an interactive menu when your terminal supports it.
- analyze one prompt for one provider/model pair
- compare one prompt across the default supported models
- optimize prompt text by removing duplicates and excess whitespace
- list supported models
- sync the local model catalog snapshot
- inspect local usage metrics
- guide the full flow through an interactive menu
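The optimize step listed above (removing duplicates and excess whitespace) can be illustrated with a minimal sketch. This is not Token Guardian's actual implementation: the function name `optimize_prompt` and the exact rules (case-insensitive line matching, collapsing blank-line runs) are assumptions chosen for illustration.

```python
import re


def optimize_prompt(text: str) -> str:
    """Hypothetical helper: collapse whitespace and drop repeated lines.

    The first occurrence of each line is kept; later repeats are removed.
    """
    seen = set()
    kept = []
    for raw_line in text.splitlines():
        # Collapse runs of spaces/tabs into one space and trim both ends.
        line = re.sub(r"[ \t]+", " ", raw_line).strip()
        if not line:
            # Keep at most one blank line in a row.
            if kept and kept[-1] != "":
                kept.append("")
            continue
        key = line.casefold()  # assumed: duplicates match case-insensitively
        if key in seen:
            continue
        seen.add(key)
        kept.append(line)
    return "\n".join(kept).strip()


if __name__ == "__main__":
    print(optimize_prompt("Goal: summarize\nGoal:   summarize\nReturn bullets only."))
```

Keeping the first occurrence preserves the original ordering of instructions, which usually matters more in a prompt than which copy survives.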
token-guardian

Running without arguments shows the available flow and the most useful commands to start with.
In interactive terminals, Token Guardian opens a guided menu with:
- provider selection
- model selection
- prompt entry with Enter to send
- sync selection by provider
- quick access to models and metrics
token-guardian analyze \
  --provider anthropic \
  --model claude-sonnet-4 \
  --prompt "Review this architecture proposal and identify risks."

token-guardian compare \
  --prompt "Summarize this technical RFC and list migration risks."

token-guardian optimize \
  --prompt "Goal: summarize
Goal: summarize
Return bullets only."

token-guardian models

token-guardian sync-models
token-guardian sync-models --provider openai

token-guardian metrics

Typical analyze output is rendered as a terminal report with token estimate, cost, context usage, risk, and prompt guidance.
Current built-in catalog:
- OpenAI: gpt-4.1
- Anthropic: claude-sonnet-4, claude-opus-4
- Google: gemini-2.5-pro, gemini-2.5-flash
- OpenRouter: openai/gpt-4.1
Each model stores:
- context limit
- input price per 1K tokens
- output price per 1K tokens
- speed estimate
- source URL
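As a rough illustration of how the per-model fields above could feed a cost estimate, here is a minimal sketch. The class name `ModelEntry`, its field names, and the `estimate_cost` helper are hypothetical, and the numbers in the example are placeholders rather than real pricing or real catalog data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    """Hypothetical catalog record mirroring the fields listed above."""

    provider: str
    model: str
    context_limit: int           # maximum tokens in the context window
    input_price_per_1k: float    # price per 1K input tokens
    output_price_per_1k: float   # price per 1K output tokens
    speed: str                   # coarse speed estimate
    source_url: str


def estimate_cost(entry: ModelEntry, input_tokens: int, output_tokens: int) -> float:
    """Scale token counts to thousands and apply the per-1K prices."""
    input_cost = (input_tokens / 1000) * entry.input_price_per_1k
    output_cost = (output_tokens / 1000) * entry.output_price_per_1k
    return input_cost + output_cost


# Placeholder values for illustration only.
example = ModelEntry(
    provider="example-provider",
    model="example-model",
    context_limit=100_000,
    input_price_per_1k=0.002,
    output_price_per_1k=0.008,
    speed="medium",
    source_url="https://example.com/pricing",
)

if __name__ == "__main__":
    print(round(estimate_cost(example, input_tokens=1500, output_tokens=500), 6))
```

Input and output prices are kept separate because providers commonly charge different rates for each direction.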
The CLI also shows catalog metadata such as:
- Catalogo atualizado em 2026-06-13 ("Catalog updated on 2026-06-13")
- the current JSON snapshot path
Based on estimated context usage:
- low
- medium
- high
- critical
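One way to map estimated context usage onto the four levels above is a simple threshold ladder. The sketch below is an assumption, not the tool's documented behavior: the cut-offs (40%, 70%, 90%) and the function names are invented for illustration.

```python
def context_usage(estimated_tokens: int, context_limit: int) -> float:
    """Return the fraction of the context window the prompt would occupy."""
    if context_limit <= 0:
        raise ValueError("context_limit must be positive")
    return estimated_tokens / context_limit


def risk_level(usage: float) -> str:
    """Map a usage fraction to a risk label.

    The thresholds are assumed values for illustration only.
    """
    if usage >= 0.90:
        return "critical"
    if usage >= 0.70:
        return "high"
    if usage >= 0.40:
        return "medium"
    return "low"


if __name__ == "__main__":
    usage = context_usage(estimated_tokens=50_000, context_limit=100_000)
    print(f"{usage:.0%} -> {risk_level(usage)}")
```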
Range: 0 to 100
Factors include:
- prompt size
- repeated lines
- repeated vocabulary
- redundant sections
- $: very low
- $$: low
- $$$: medium
- $$$$: high

- Simple
- Medium
- Complex
- Very Complex
Token Guardian stores local metrics in SQLite.
Database file:
token_guardian.db
Tracked fields include:
- total requests
- total tokens
- estimated cumulative cost
- top models
- top providers
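The tracked fields above can be derived from a single request log table. The sketch below uses Python's standard `sqlite3` module; the table name `requests`, its columns, and the helper functions are hypothetical and may not match the schema inside token_guardian.db.

```python
import sqlite3


def open_metrics(path: str = ":memory:") -> sqlite3.Connection:
    """Open a metrics database and create the (assumed) table if missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS requests (
            id INTEGER PRIMARY KEY,
            provider TEXT NOT NULL,
            model TEXT NOT NULL,
            tokens INTEGER NOT NULL,
            estimated_cost REAL NOT NULL
        )
        """
    )
    return conn


def record(conn: sqlite3.Connection, provider: str, model: str,
           tokens: int, cost: float) -> None:
    """Append one analyzed request to the log."""
    conn.execute(
        "INSERT INTO requests (provider, model, tokens, estimated_cost) "
        "VALUES (?, ?, ?, ?)",
        (provider, model, tokens, cost),
    )
    conn.commit()


def summary(conn: sqlite3.Connection, limit: int = 3) -> dict:
    """Aggregate totals plus the most frequently used models and providers."""
    total_requests, total_tokens, total_cost = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(tokens), 0), "
        "COALESCE(SUM(estimated_cost), 0.0) FROM requests"
    ).fetchone()
    top_models = conn.execute(
        "SELECT model, COUNT(*) AS n FROM requests "
        "GROUP BY model ORDER BY n DESC, model LIMIT ?",
        (limit,),
    ).fetchall()
    top_providers = conn.execute(
        "SELECT provider, COUNT(*) AS n FROM requests "
        "GROUP BY provider ORDER BY n DESC, provider LIMIT ?",
        (limit,),
    ).fetchall()
    return {
        "total_requests": total_requests,
        "total_tokens": total_tokens,
        "estimated_cost": total_cost,
        "top_models": top_models,
        "top_providers": top_providers,
    }
```

Using `":memory:"` as the default path keeps the sketch side-effect free; passing a file path would persist the log between runs.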
token-guardian/
|-- app/
| |-- cli.py
| |-- models/
| |-- providers/
| |-- services/
| `-- utils/
|-- docs/
|-- tests/
|-- LICENSE
|-- pyproject.toml
`-- README.md
Run tests:
pytest

Run quality checks:
ruff check .
black --check .
mypy app

- add richer interactive CLI flows
- expand supported model catalog
- improve prompt optimization heuristics
- add exportable reports
- add model catalog sync support
MIT