RAG Projects

A collection of production-quality RAG systems built with Weaviate — exploring what it actually takes to go beyond "vector DB plus a prompt."

Architecture & Philosophy

Robust RAG is a layered decision system in which each stage has a clear responsibility:

Stage	Responsibility
Ingest	Normalize, parse structure, attach metadata — every source is versioned for traceability
Index	Embeddings + BM25; parent-child storage preserves context without sacrificing retrieval precision
Retrieve	Hybrid search, MMR diversity filtering, relevance thresholding; low-relevance evidence is dropped before it reaches the model
Generate	Bounded prompt; the model is instructed to answer only from retrieved context, not from parametric memory
Evaluate	Instrument faithfulness and relevance at every stage; silent failures are the hardest to catch in production
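
As a rough illustration of the Retrieve row, the sketch below applies a relevance threshold and then greedy MMR (maximal marginal relevance) selection over candidate embeddings. It is not the code used in these projects: the function name `filter_and_diversify` and the parameters `min_relevance`, `lambda_`, and `k` are hypothetical, and cosine similarity is assumed as the relevance measure.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filter_and_diversify(
    query_vec: Sequence[float],
    candidates: List[Tuple[str, Sequence[float]]],
    min_relevance: float = 0.5,  # assumed cut-off, tune per corpus
    lambda_: float = 0.7,        # 1.0 = pure relevance, 0.0 = pure diversity
    k: int = 3,
) -> List[str]:
    # Relevance thresholding: drop weak evidence before any selection happens.
    scored = [(cid, vec, cosine(query_vec, vec)) for cid, vec in candidates]
    pool = [c for c in scored if c[2] >= min_relevance]

    selected: List[Tuple[str, Sequence[float], float]] = []
    while pool and len(selected) < k:
        def mmr(c: Tuple[str, Sequence[float], float]) -> float:
            # Penalize similarity to anything already selected.
            redundancy = max((cosine(c[1], s[1]) for s in selected), default=0.0)
            return lambda_ * c[2] - (1.0 - lambda_) * redundancy

        best = max(pool, key=mmr)
        selected.append(best)
        pool.remove(best)
    return [cid for cid, _, _ in selected]
```

With a low `lambda_`, a near-duplicate of an already selected chunk loses to a less similar but still relevant one; candidates below `min_relevance` are never returned.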

This layering directly mitigates the three root causes of most production RAG failures: bad evidence, weak retrieval, and poor uncertainty handling.
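
One way to handle uncertainty at the Generate stage is a bounded prompt with an explicit abstain path. The sketch below is illustrative only: `build_bounded_prompt`, the `ABSTAIN` string, and the prompt wording are assumptions, not the prompts used in these projects.

```python
from typing import List, Optional

# Hypothetical fixed refusal string, so abstentions are easy to detect downstream.
ABSTAIN = "I don't know based on the provided documents."


def build_bounded_prompt(question: str, chunks: List[str]) -> Optional[str]:
    """Return a context-bounded prompt, or None when there is no evidence."""
    if not chunks:
        # No evidence survived retrieval: skip the model call entirely.
        return None
    context = "\n\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the numbered context below.\n"
        "Cite the context numbers you used, like [1].\n"
        f"If the context does not contain the answer, reply exactly: {ABSTAIN}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

A caller can return `ABSTAIN` directly when the function yields `None`. An instruction like this reduces, but cannot guarantee against, answers drawn from parametric memory, which is why the Evaluate stage still measures faithfulness.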

Framework Philosophy

The guiding principle in this repo is simple: use frameworks only where they remove undifferentiated plumbing, and avoid them where they obscure the critical path.

LlamaIndex fits this principle at the ingestion boundary. It provides fast, flexible document parsing and chunking, eliminating routine wiring that adds no real value.

When a use case pushes beyond generic framework behaviour — tighter latency, customisation needs, or stability constraints — the strategy shifts to a minimal custom orchestration layer, where retrieval logic, prompt construction, and observability remain explicit and fully transparent.
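
A minimal custom orchestration layer can be very small. The sketch below assumes three injected callables (`retrieve`, `build_prompt`, `generate`) and a hypothetical `Trace` record; none of these names come from the projects. It only shows how the critical path and per-stage timings can stay explicit.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Trace:
    """Per-request observability record (hypothetical shape)."""
    timings_ms: Dict[str, float] = field(default_factory=dict)
    retrieved: int = 0


def _timed(trace: Trace, stage: str, fn: Callable, *args):
    # Run one stage and record its wall-clock duration in milliseconds.
    start = time.perf_counter()
    result = fn(*args)
    trace.timings_ms[stage] = (time.perf_counter() - start) * 1000.0
    return result


def answer(
    question: str,
    retrieve: Callable[[str], List[str]],
    build_prompt: Callable[[str, List[str]], str],
    generate: Callable[[str], str],
) -> Tuple[str, Trace]:
    trace = Trace()
    chunks = _timed(trace, "retrieve", retrieve, question)
    trace.retrieved = len(chunks)
    prompt = _timed(trace, "build_prompt", build_prompt, question, chunks)
    text = _timed(trace, "generate", generate, prompt)
    return text, trace
```

Because each stage is a plain function, any one of them can be swapped or tested in isolation without touching the others.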

Projects

pdf-rag-ts — TypeScript

PDF Q&A with hybrid BM25 + semantic search, four chunking strategies, query expansion, MMR diversity filtering, relevance thresholding, and HyDE (hypothetical document embeddings). Uses Gemini for embeddings and generation, and LlamaParse for structured PDF parsing.

Stack: TypeScript · Weaviate · Gemini · LlamaParse
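
To make "hybrid BM25 + semantic search" concrete, the sketch below fuses keyword and vector scores with a single weight `alpha`. Weaviate performs hybrid fusion server-side; this is not its implementation, only a client-side illustration of the idea. The function names and the min-max normalization are assumptions.

```python
from typing import Dict, List, Tuple


def _min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so BM25 and vector scores are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def fuse(
    bm25: Dict[str, float],
    vector: Dict[str, float],
    alpha: float = 0.5,  # 1.0 = vector only, 0.0 = keyword only
) -> List[Tuple[str, float]]:
    kw, vec = _min_max(bm25), _min_max(vector)
    ids = set(kw) | set(vec)
    # A document missing from one result list contributes 0 for that side.
    fused = {
        doc_id: alpha * vec.get(doc_id, 0.0) + (1.0 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in ids
    }
    # Highest score first; ties broken by id for deterministic output.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))
```

Moving `alpha` toward 0 favors exact term matches (useful for identifiers and rare terms); moving it toward 1 favors semantic similarity.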

pdf-rag-python — Python

The Python sibling: the same hybrid search pipeline, plus a Redis-backed semantic cache that serves repeated or semantically similar questions without calling the LLM again.

Stack: Python · Weaviate · Ollama · LlamaParse · Redis
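
The semantic cache idea can be sketched without Redis. The class below is an in-memory stand-in, not the project's implementation: `SemanticCache`, its `threshold` value, and the linear scan are assumptions. A Redis-backed version would store the same (embedding, answer) pairs and replace the scan with a vector similarity query.

```python
import math
from typing import List, Optional, Sequence, Tuple


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Returns a stored answer when a new query embedding is close enough."""

    def __init__(self, threshold: float = 0.9) -> None:
        self.threshold = threshold  # assumed similarity cut-off
        self._entries: List[Tuple[List[float], str]] = []

    def put(self, query_vec: Sequence[float], answer: str) -> None:
        self._entries.append((list(query_vec), answer))

    def get(self, query_vec: Sequence[float]) -> Optional[str]:
        best_score, best_answer = 0.0, None
        for vec, answer in self._entries:
            score = _cosine(query_vec, vec)
            if score > best_score:
                best_score, best_answer = score, answer
        # Only a sufficiently similar question counts as a cache hit.
        return best_answer if best_score >= self.threshold else None
```

On a miss (`None`), the caller runs the full pipeline and then calls `put` with the query embedding and the generated answer. A threshold set too low risks serving an answer to a different question, so it should be validated against real query pairs.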

rag-tutorial — Tutorial

Builds a RAG pipeline from scratch over a 7k-book dataset. Covers collection setup, ingestion, semantic search, and generative search — a clean starting point for understanding how the pieces fit together.

Stack: TypeScript · Weaviate · Ollama

See each project's README for full setup instructions.

