This repository was archived by the owner on May 21, 2026. It is now read-only.

Panepo/Sado

Sado: RAGAS Evaluation Server

A FastAPI backend with a browser-based UI for evaluating RAG (Retrieval-Augmented Generation) outputs using RAGAS metrics. All LLM and embedding inference runs locally via Ollama.

Features

  • Single evaluation: score one RAG sample interactively through the browser UI
  • Batch evaluation: upload a .json or .csv file to score many samples at once
  • 10 built-in metrics: Faithfulness, LLM Context Recall, LLM Context Precision, Context Precision (No Reference), Response Relevancy, Factual Correctness, Noise Sensitivity, Semantic Similarity, BLEU, and ROUGE
  • 100% local: no external API keys required; all inference runs through Ollama
  • Zero-build frontend: a single self-contained static/index.html with no framework or build step

Prerequisites

  • Python 3.11 or 3.12
  • Ollama running locally with at least one LLM and one embedding model pulled
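Before starting the server, it can help to confirm that Ollama is reachable and that both models are pulled. The sketch below is one way to do that with Ollama's GET /api/tags endpoint, which lists local models. The helper names (pulled_models, missing_models, fetch_tags) are hypothetical and not part of this repository, and the model names are the ones used in the .env example in the Setup section.

```python
import json
import urllib.request


def pulled_models(tags_payload: dict) -> set[str]:
    """Return model names from an /api/tags response, with and without the tag suffix."""
    names = set()
    for model in tags_payload.get("models", []):
        name = model.get("name", "")
        names.add(name)                 # e.g. "llama3.2:latest"
        names.add(name.split(":")[0])   # e.g. "llama3.2"
    names.discard("")
    return names


def missing_models(tags_payload: dict, required: list[str]) -> list[str]:
    """List the required models that Ollama has not pulled yet."""
    available = pulled_models(tags_payload)
    return [name for name in required if name not in available]


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Query a running Ollama instance for its local models."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)


# Usage against a live instance (requires Ollama to be running):
#   missing_models(fetch_tags(), ["llama3.2", "nomic-embed-text"])
```

An empty list means both models are available; any name that comes back can be fetched with ollama pull.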

Setup

1. Clone and install dependencies

git clone <repo-url>
cd Sado
python install_dependency.py

2. Configure environment

Create a .env file in the project root:

OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_LLM_MODEL=llama3.2
OLLAMA_EMBED_MODEL=nomic-embed-text

Variable             Description
OLLAMA_BASE_URL      Base URL of your Ollama instance
OLLAMA_LLM_MODEL     Model name for LLM-based metrics (e.g. llama3.2, qwen3)
OLLAMA_EMBED_MODEL   Model name for embedding-based metrics (e.g. nomic-embed-text)
OLLAMA_NUM_CTX       (optional) Context window size; default 8192
OLLAMA_MAX_TOKENS    (optional) Max tokens for LLM responses; derived from OLLAMA_NUM_CTX
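
To make the required and optional variables concrete, here is a minimal sketch of how such settings could be read and validated. It is not the code in ragas_runner.py: the OllamaSettings class and load_settings function are hypothetical, and treating the three unmarked variables as required is an assumption based on the "(optional)" labels above. Because the source does not say how OLLAMA_MAX_TOKENS is derived from OLLAMA_NUM_CTX, the sketch leaves it as None when unset rather than guessing a formula.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class OllamaSettings:
    base_url: str
    llm_model: str
    embed_model: str
    num_ctx: int
    max_tokens: Optional[int]  # None means "unset; the server derives it"


def load_settings(env: Mapping[str, str] = os.environ) -> OllamaSettings:
    """Read the documented variables; raise if an (assumed) required one is missing."""
    required = ["OLLAMA_BASE_URL", "OLLAMA_LLM_MODEL", "OLLAMA_EMBED_MODEL"]
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise ValueError(f"Missing required settings: {', '.join(missing)}")
    max_tokens = env.get("OLLAMA_MAX_TOKENS")
    return OllamaSettings(
        base_url=env["OLLAMA_BASE_URL"].rstrip("/"),
        llm_model=env["OLLAMA_LLM_MODEL"],
        embed_model=env["OLLAMA_EMBED_MODEL"],
        num_ctx=int(env.get("OLLAMA_NUM_CTX", "8192")),  # documented default
        max_tokens=int(max_tokens) if max_tokens else None,
    )
```

Passing a plain dict instead of os.environ makes the function easy to exercise without touching the real environment.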

3. Start the server

uvicorn server:app --reload --port 8040

Open http://localhost:8040 in your browser.

Usage

Single Evaluation

Fill in the fields in the Single tab:

Field                Description
user_input           The original question or query
response             The answer generated by your RAG system
retrieved_contexts   One context chunk per line
reference            Ground-truth answer (required by some metrics)

Select one or more metrics and click Evaluate.

Batch Evaluation

Upload a file in the Batch tab. Supported formats:

JSON — array of objects:

[
  {
    "user_input": "What is the capital of France?",
    "response": "Paris",
    "retrieved_contexts": ["France is a country in Europe. Its capital is Paris."],
    "reference": "Paris"
  }
]

CSV — columns matching field names; retrieved_contexts must be a JSON array string:

user_input,response,retrieved_contexts,reference
"What is the capital of France?","Paris","[""France is a country. Its capital is Paris.""]","Paris"
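
Hand-writing the doubled quotes in the retrieved_contexts column is error-prone, so it may be easier to generate the CSV. The sketch below uses only the standard library: json.dumps produces the JSON array string and the csv module handles the quoting. The function names samples_to_csv and csv_to_samples are hypothetical helpers, not part of this repository.

```python
import csv
import io
import json

FIELDS = ["user_input", "response", "retrieved_contexts", "reference"]


def samples_to_csv(samples: list[dict]) -> str:
    """Serialize samples to CSV, encoding retrieved_contexts as a JSON array string."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for sample in samples:
        row = {field: sample.get(field, "") for field in FIELDS}
        row["retrieved_contexts"] = json.dumps(sample.get("retrieved_contexts", []))
        writer.writerow(row)
    return buffer.getvalue()


def csv_to_samples(text: str) -> list[dict]:
    """Parse the CSV back into samples, decoding the JSON array column."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["retrieved_contexts"] = json.loads(row["retrieved_contexts"])
    return rows
```

Round-tripping a file through csv_to_samples before uploading is a cheap check that every retrieved_contexts cell is valid JSON.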

Available Metrics

Metric                             Required Fields                                       Needs LLM   Needs Embeddings
Faithfulness                       user_input, response, retrieved_contexts              Yes         No
LLM Context Recall                 user_input, retrieved_contexts, reference             Yes         No
LLM Context Precision              user_input, retrieved_contexts, reference             Yes         No
Context Precision (No Reference)   user_input, response, retrieved_contexts              Yes         No
Response Relevancy                 user_input, response                                  Yes         Yes
Factual Correctness                response, reference                                   Yes         No
Noise Sensitivity                  user_input, retrieved_contexts, response, reference   Yes         No
Semantic Similarity                response, reference                                   No          Yes
BLEU Score                         response, reference                                   No          No
ROUGE Score                        response, reference                                   No          No
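
Since each metric needs a different subset of fields, a client can check a sample before submitting it. The following is a minimal sketch of that check. The REQUIRED_FIELDS mapping is copied by hand from the table above for the two metric keys that appear in the API example (faithfulness and bleu_score); the keys for the other metrics are not shown in this README, so they are left out rather than guessed. The missing_fields helper is hypothetical.

```python
# Hand-copied from the metrics table; only keys confirmed by the API example.
REQUIRED_FIELDS = {
    "faithfulness": ["user_input", "response", "retrieved_contexts"],
    "bleu_score": ["response", "reference"],
}


def missing_fields(sample: dict, metrics: list[str]) -> dict[str, list[str]]:
    """Map each selected metric to the required fields that are empty or absent."""
    problems = {}
    for metric in metrics:
        absent = [f for f in REQUIRED_FIELDS[metric] if not sample.get(f)]
        if absent:
            problems[metric] = absent
    return problems
```

For a sample without a reference, this reports that bleu_score cannot be computed while faithfulness can.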

API Reference

Endpoint               Method   Description
/api/metrics           GET      List all available metrics and their metadata
/api/evaluate/single   POST     Evaluate a single RAG sample (JSON body)
/api/evaluate/batch    POST     Evaluate a file of samples (multipart form)

POST /api/evaluate/single

{
  "user_input": "...",
  "response": "...",
  "retrieved_contexts": ["..."],
  "reference": "...",
  "metrics": ["faithfulness", "bleu_score"]
}

Returns:

{
  "scores": {
    "faithfulness": 0.85,
    "bleu_score": 0.42
  }
}
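
The same call can be made from a script. This is a minimal client sketch using only the standard library; it assumes the server is running on port 8040 as in the uvicorn command above and that the response has the "scores" shape shown. The function names build_single_request and evaluate_single are hypothetical, and the 300-second timeout is an arbitrary allowance for slow local inference.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8040"  # port from the uvicorn command in Setup


def build_single_request(
    sample: dict, metrics: list[str], base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build the POST request for /api/evaluate/single without sending it."""
    body = json.dumps({**sample, "metrics": metrics}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/evaluate/single",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def evaluate_single(sample: dict, metrics: list[str], base_url: str = BASE_URL) -> dict:
    """Send the request and return the "scores" object from the response."""
    request = build_single_request(sample, metrics, base_url)
    with urllib.request.urlopen(request, timeout=300) as resp:
        return json.load(resp)["scores"]
```

Separating request construction from sending keeps the payload easy to inspect or log before any model inference is triggered.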

Docker

docker build -t ragas-server .
docker run -p 8040:8040 --env-file .env ragas-server

Note: The container needs network access to your Ollama instance. If Ollama runs on the host, set OLLAMA_BASE_URL to http://host.docker.internal:11434 on Mac/Windows, or run the container with --network host on Linux.

Project Structure

server.py               FastAPI app, REST endpoints, static file serving
ragas_runner.py         Ollama LLM/embedding setup, metric registry, evaluate()
static/index.html       Single-page browser UI
install_dependency.py   Installs all Python dependencies
Dockerfile              Container image definition
.env                    Runtime config (gitignored)

Adding a New Metric

  1. Import the metric class from ragas.metrics.collections in ragas_runner.py.
  2. Add an entry to METRIC_REGISTRY with required_fields, needs_llm, needs_embedding, and cls.
  3. No changes needed in server.py or index.html — both pick it up automatically.
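
The steps above might look roughly like the sketch below. The four entry keys come from step 2, but everything else is an assumption: the real layout of METRIC_REGISTRY in ragas_runner.py is not shown in this README, ExactMatch is a hypothetical stand-in for a class imported from ragas.metrics.collections, and metric_metadata only illustrates how a registry-driven server could expose new entries without further changes.

```python
class ExactMatch:
    """Hypothetical stand-in for a metric class from ragas.metrics.collections."""


# Assumed shape: metric key -> the four fields named in step 2.
METRIC_REGISTRY = {
    "exact_match": {
        "required_fields": ["response", "reference"],
        "needs_llm": False,
        "needs_embedding": False,
        "cls": ExactMatch,
    },
}


def metric_metadata(registry: dict) -> list[dict]:
    """Expose everything except the class, as a metadata endpoint might."""
    return [
        {"name": name, **{k: v for k, v in entry.items() if k != "cls"}}
        for name, entry in registry.items()
    ]
```

Because the listing is generated from the registry, adding an entry is enough for it to show up wherever that listing is consumed.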
