A FastAPI backend with a browser-based UI for evaluating RAG (Retrieval-Augmented Generation) outputs using RAGAS metrics. All LLM and embedding inference runs locally via Ollama.
- Single evaluation: score one RAG sample interactively through the browser UI
- Batch evaluation: upload a `.json` or `.csv` file to score many samples at once
- 10 built-in metrics: Faithfulness, Context Recall, Context Precision, Response Relevancy, Factual Correctness, Noise Sensitivity, Semantic Similarity, BLEU, ROUGE, and more
- 100% local: no external API keys required; everything runs through Ollama
- Zero-build frontend: a single self-contained `static/index.html` with no framework or build step
- Python 3.11 or 3.12
- Ollama running locally with at least one LLM and one embedding model pulled
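
Before starting the server, it can help to confirm that both models are actually pulled. The sketch below is not part of this repository: it queries Ollama's `/api/tags` endpoint and compares the result against the model names you plan to use. The helper names `missing_models` and `fetch_tags` are illustrative, and the default URL assumes Ollama is listening on its usual local port.

```python
import json
import urllib.request


def missing_models(tags_payload: dict, required: list[str]) -> list[str]:
    """Return the required model names that Ollama does not list as pulled."""
    # Ollama reports names with a tag suffix, e.g. "llama3.2:latest".
    pulled = {m["name"] for m in tags_payload.get("models", [])}
    # Also accept a match on the part before the colon, e.g. "llama3.2".
    pulled_base = {name.split(":", 1)[0] for name in pulled}
    return [r for r in required if r not in pulled and r not in pulled_base]


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Ask a running Ollama instance which models are available locally."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)
```

With Ollama running, `missing_models(fetch_tags(), ["llama3.2", "nomic-embed-text"])` returns an empty list when both models are available.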
```bash
git clone <repo-url>
cd Sado
python install_dependency.py
```

Create a `.env` file in the project root:
```
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_LLM_MODEL=llama3.2
OLLAMA_EMBED_MODEL=nomic-embed-text
```

| Variable | Description |
|---|---|
| `OLLAMA_BASE_URL` | Base URL of your Ollama instance |
| `OLLAMA_LLM_MODEL` | Model name for LLM-based metrics (e.g. `llama3.2`, `qwen3`) |
| `OLLAMA_EMBED_MODEL` | Model name for embedding-based metrics (e.g. `nomic-embed-text`) |
| `OLLAMA_NUM_CTX` | (optional) Context window size; default 8192 |
| `OLLAMA_MAX_TOKENS` | (optional) Max tokens for LLM responses; derived from `OLLAMA_NUM_CTX` |
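
As a rough illustration of how these variables fit together, here is a minimal sketch of a settings loader. It is not the code in `ragas_runner.py`. In particular, this README does not say how `OLLAMA_MAX_TOKENS` is derived from `OLLAMA_NUM_CTX`, so the halving rule below is a placeholder assumption, as are the function name and the choice to treat the two model names as mandatory.

```python
from collections.abc import Mapping


def load_ollama_settings(env: Mapping[str, str]) -> dict:
    """Build a settings dict from environment-style key/value pairs."""
    # ASSUMPTION: both model names are treated as mandatory in this sketch.
    missing = [k for k in ("OLLAMA_LLM_MODEL", "OLLAMA_EMBED_MODEL") if not env.get(k)]
    if missing:
        raise ValueError(f"missing required settings: {', '.join(missing)}")

    # 8192 is the documented default context window size.
    num_ctx = int(env.get("OLLAMA_NUM_CTX", "8192"))

    # ASSUMPTION: the real derivation rule is not documented here; half the
    # context window is used purely as a stand-in.
    max_tokens = int(env.get("OLLAMA_MAX_TOKENS", str(num_ctx // 2)))

    return {
        # ASSUMPTION: fall back to the URL shown in the example .env file.
        "base_url": env.get("OLLAMA_BASE_URL", "http://localhost:11434"),
        "llm_model": env["OLLAMA_LLM_MODEL"],
        "embed_model": env["OLLAMA_EMBED_MODEL"],
        "num_ctx": num_ctx,
        "max_tokens": max_tokens,
    }
```

Passing `os.environ` as `env` reads the values from the current process environment.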
```bash
uvicorn server:app --reload --port 8040
```

Open http://localhost:8040 in your browser.
Fill in the fields in the Single tab:
| Field | Description |
|---|---|
| `user_input` | The original question or query |
| `response` | The answer generated by your RAG system |
| `retrieved_contexts` | One context chunk per line |
| `reference` | Ground-truth answer (required by some metrics) |
Select one or more metrics and click Evaluate.
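
The form fields above map onto one sample object. The sketch below shows one plausible way to turn the multi-line `retrieved_contexts` text into a list, one chunk per line. `build_sample` is a hypothetical helper, not the UI's actual code, and dropping blank lines is an assumption.

```python
def build_sample(user_input: str, response: str, contexts_text: str, reference: str = "") -> dict:
    """Assemble a single-evaluation sample from raw form text."""
    # One context chunk per line; blank lines are skipped (assumed behaviour).
    contexts = [line.strip() for line in contexts_text.splitlines() if line.strip()]
    sample = {
        "user_input": user_input.strip(),
        "response": response.strip(),
        "retrieved_contexts": contexts,
    }
    # The reference is only needed by some metrics, so include it when given.
    if reference.strip():
        sample["reference"] = reference.strip()
    return sample
```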
Upload a file in the Batch tab. Supported formats:
JSON (array of objects):

```json
[
  {
    "user_input": "What is the capital of France?",
    "response": "Paris",
    "retrieved_contexts": ["France is a country in Europe. Its capital is Paris."],
    "reference": "Paris"
  }
]
```

CSV (columns match the field names; `retrieved_contexts` must be a JSON array string):
```csv
user_input,response,retrieved_contexts,reference
"What is the capital of France?","Paris","[""France is a country. Its capital is Paris.""]","Paris"
```

| Metric | Required Fields | Needs LLM | Needs Embeddings |
|---|---|---|---|
| Faithfulness | `user_input`, `response`, `retrieved_contexts` | Yes | No |
| LLM Context Recall | `user_input`, `retrieved_contexts`, `reference` | Yes | No |
| LLM Context Precision | `user_input`, `retrieved_contexts`, `reference` | Yes | No |
| Context Precision (No Reference) | `user_input`, `response`, `retrieved_contexts` | Yes | No |
| Response Relevancy | `user_input`, `response` | Yes | Yes |
| Factual Correctness | `response`, `reference` | Yes | No |
| Noise Sensitivity | `user_input`, `retrieved_contexts`, `response`, `reference` | Yes | No |
| Semantic Similarity | `response`, `reference` | No | Yes |
| BLEU Score | `response`, `reference` | No | No |
| ROUGE Score | `response`, `reference` | No | No |
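
To make the CSV convention and the required-fields table concrete, here is a small sketch that parses a batch CSV (decoding `retrieved_contexts` from its JSON array string) and reports which metrics a sample has enough fields for. It is an illustration only: the function names are hypothetical, the dictionary is keyed by the display names from the table rather than by the server's internal metric identifiers, and the server's own parsing and validation may differ.

```python
import csv
import io
import json

# Required fields per metric, copied from the table above.
REQUIRED_FIELDS = {
    "Faithfulness": {"user_input", "response", "retrieved_contexts"},
    "LLM Context Recall": {"user_input", "retrieved_contexts", "reference"},
    "LLM Context Precision": {"user_input", "retrieved_contexts", "reference"},
    "Context Precision (No Reference)": {"user_input", "response", "retrieved_contexts"},
    "Response Relevancy": {"user_input", "response"},
    "Factual Correctness": {"response", "reference"},
    "Noise Sensitivity": {"user_input", "retrieved_contexts", "response", "reference"},
    "Semantic Similarity": {"response", "reference"},
    "BLEU Score": {"response", "reference"},
    "ROUGE Score": {"response", "reference"},
}


def parse_batch_csv(text: str) -> list[dict]:
    """Read CSV rows and decode retrieved_contexts from a JSON array string."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        sample = dict(row)
        sample["retrieved_contexts"] = json.loads(sample.get("retrieved_contexts") or "[]")
        rows.append(sample)
    return rows


def scorable_metrics(sample: dict) -> list[str]:
    """List the metrics whose required fields are all present and non-empty."""
    present = {key for key, value in sample.items() if value}
    return [name for name, needed in REQUIRED_FIELDS.items() if needed <= present]
```

A sample with no `reference` can still be scored with Faithfulness, Context Precision (No Reference), and Response Relevancy, since those are the only metrics in the table that do not require it.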
| Endpoint | Method | Description |
|---|---|---|
| `/api/metrics` | GET | List all available metrics and their metadata |
| `/api/evaluate/single` | POST | Evaluate a single RAG sample (JSON body) |
| `/api/evaluate/batch` | POST | Evaluate a file of samples (multipart form) |
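
For scripted use, the single-evaluation endpoint can be called from Python with only the standard library. The sketch below assumes the server is reachable at the base URL you pass in and uses the request and response shapes documented in this section. The function names and the 300-second timeout are illustrative choices, not part of the project.

```python
import json
import urllib.request


def build_single_request(base_url: str, sample: dict, metrics: list[str]) -> urllib.request.Request:
    """Prepare a POST request for /api/evaluate/single without sending it."""
    body = json.dumps({**sample, "metrics": metrics}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/evaluate/single",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def evaluate_single(base_url: str, sample: dict, metrics: list[str]) -> dict:
    """Send the request and return the "scores" object from the response."""
    request = build_single_request(base_url, sample, metrics)
    # LLM-based metrics can be slow on local hardware, hence the long timeout.
    with urllib.request.urlopen(request, timeout=300) as resp:
        return json.load(resp)["scores"]
```

Splitting request construction from sending keeps the payload easy to inspect or log before any model inference is triggered.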
Request body for `POST /api/evaluate/single`:

```json
{
  "user_input": "...",
  "response": "...",
  "retrieved_contexts": ["..."],
  "reference": "...",
  "metrics": ["faithfulness", "bleu_score"]
}
```

Returns:
```json
{
  "scores": {
    "faithfulness": 0.85,
    "bleu_score": 0.42
  }
}
```

```bash
docker build -t ragas-server .
docker run -p 8040:8040 --env-file .env ragas-server
```

Note: The container needs network access to your Ollama instance. If Ollama runs on the host, use `host.docker.internal` as the hostname on Mac/Windows, or `--network host` on Linux.
```
server.py              FastAPI app, REST endpoints, static file serving
ragas_runner.py        Ollama LLM/embedding setup, metric registry, evaluate()
static/index.html      Single-page browser UI
install_dependency.py  Installs all Python dependencies
Dockerfile             Container image definition
.env                   Runtime config (gitignored)
```
- Import the metric class from `ragas.metrics.collections` in `ragas_runner.py`.
- Add an entry to `METRIC_REGISTRY` with `required_fields`, `needs_llm`, `needs_embedding`, and `cls`.
- No changes are needed in `server.py` or `index.html`; both pick it up automatically.
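
The steps above might look roughly like the sketch below. The actual shape of `METRIC_REGISTRY` in `ragas_runner.py` is not shown in this README, so the dictionary layout, the key `my_custom_metric`, and the stand-in class `MyCustomMetric` are all assumptions for illustration. In the real file, `cls` would be a class imported from `ragas.metrics.collections`.

```python
class MyCustomMetric:
    """Hypothetical stand-in for a metric class imported from ragas.metrics.collections."""


# Stand-in for the registry in ragas_runner.py (layout assumed, not confirmed).
METRIC_REGISTRY: dict[str, dict] = {}

# The four keys named in the steps above.
METRIC_REGISTRY["my_custom_metric"] = {
    "required_fields": ["user_input", "response"],
    "needs_llm": True,
    "needs_embedding": False,
    "cls": MyCustomMetric,
}

REQUIRED_KEYS = {"required_fields", "needs_llm", "needs_embedding", "cls"}


def missing_registry_keys(entry: dict) -> set[str]:
    """Return the documented keys that a registry entry is missing."""
    return REQUIRED_KEYS - set(entry)
```

A quick check such as `missing_registry_keys(METRIC_REGISTRY["my_custom_metric"])` returning an empty set can catch a forgotten key before the server starts.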