A private, Dockerized Retrieval-Augmented Generation (RAG) backend with document upload and contextual chat capabilities.
This project provides a powerful FastAPI-based backend for securely chatting with an AI assistant that understands your custom documents.
- 📄 PDF Document Upload
  Upload and embed PDF files for contextual AI interactions.
- 💬 Chat Interface
  Engage in natural language conversation, with responses grounded in your uploaded documents.
- 🧠 RAG Architecture
  Combines large language models with document context retrieval using Ollama and ChromaDB.
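To make the retrieval step concrete, here is a minimal, self-contained sketch of the RAG idea. It is not this project's code: a bag-of-words counter stands in for the Ollama embedding model, a plain list stands in for ChromaDB, and the function names (`embed`, `retrieve`, `build_prompt`) are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Place the retrieved chunks ahead of the question for the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


# Illustrative data only; a real deployment would query the vector store.
chunks = [
    "Invoices are due within 30 days of receipt.",
    "The office is closed on public holidays.",
    "Late invoices incur a 2% monthly fee.",
]
top = retrieve("When are invoices due?", chunks, k=1)
prompt = build_prompt("When are invoices due?", top)
```

In the real stack, the similarity search would presumably be delegated to the vector store and the prompt sent to the model served by Ollama; the sketch only shows the order of operations.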
docker-compose up --build -d
docker-compose exec ollama sh -c 'ollama_ai_rag pull $OLLAMA_MODEL && ollama_ai_rag pull $OLLAMA_EMBED_MODEL'
Ensure your .env file includes the model names:
OLLAMA_MODEL=qwen2.5:1.5b
OLLAMA_EMBED_MODEL=nomic-embed-text
If you want to switch to a different model or clean the database:
docker compose down
# WARNING: The -v flag removes all volume data
docker compose down -v
Restart your terminal session and run:
docker-compose up -d --force-recreate
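docker-compose exec ollama sh -c 'ollama_ai_rag pull $OLLAMA_MODEL && ollama_ai_rag pull $OLLAMA_EMBED_MODEL'
Make sure the new embedding model is compatible with your stored data! Vectors produced by different embedding models are not interchangeable, so any collection you keep must be re-embedded after a switch.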
The API is fully documented using FastAPI’s interactive Swagger UI.
🔗 Visit http://localhost:8000/docs after starting the containers to explore and test the endpoints directly in your browser.
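If you prefer to inspect the API from a script, the sketch below lists the operations declared in the OpenAPI schema. It assumes the app keeps FastAPI's default schema location (`/openapi.json`); the helper names are hypothetical and the sample schema is a hand-written fragment, not output from this project.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def list_operations(schema: dict) -> list[tuple[str, str]]:
    """Return sorted (METHOD, path) pairs from an OpenAPI schema dict."""
    ops = []
    for path, item in schema.get("paths", {}).items():
        for method in item:
            if method in HTTP_METHODS:  # skip non-method keys such as "parameters"
                ops.append((method.upper(), path))
    return sorted(ops)


def fetch_schema(base_url: str = "http://localhost:8000") -> dict:
    """Download the schema from a running instance (containers must be up)."""
    with urlopen(f"{base_url}/openapi.json", timeout=10) as resp:
        return json.load(resp)


# Offline demonstration with an illustrative schema fragment.
sample = {"paths": {"/chat": {"post": {}}, "/embed-pdf": {"post": {}}}}
operations = list_operations(sample)
```

Against a live instance, `list_operations(fetch_schema())` should give the same view as the Swagger page, provided the default schema URL has not been changed.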
- Accepts: collection_name, chat_id?, words
- Returns: Assistant response + updated chat history
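As a rough client-side illustration of these parameters, the sketch below builds a request for the `/chat` endpoint without sending it. It assumes the endpoint takes a JSON body and that leaving out `chat_id` starts a new conversation; neither is confirmed here, so check the Swagger UI for the actual schema. `build_chat_request` is a hypothetical helper.

```python
import json
from typing import Optional
from urllib.request import Request


def build_chat_request(
    words: str,
    collection_name: str,
    chat_id: Optional[str] = None,
    base_url: str = "http://localhost:8000",
) -> Request:
    """Build (but do not send) a POST request for the /chat endpoint."""
    payload = {"collection_name": collection_name, "words": words}
    if chat_id is not None:  # optional field; assumed to continue an existing chat
        payload["chat_id"] = chat_id
    return Request(
        f"{base_url}/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumption: JSON body
        method="POST",
    )


req = build_chat_request("What does the contract say about renewal?", "contracts")
body = json.loads(req.data)
```

Passing `req` to `urllib.request.urlopen` would send it to a running instance. If the endpoint expects form fields instead of JSON, only the encoding of `data` and the content type need to change.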
- Accepts: file, collection_name, optional metadata
- Splits and embeds PDF content into your vector DB.
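To illustrate what "splitting" means before embedding, here is a simplified sketch of fixed-size chunking with overlap. It is a plain-Python stand-in, not LangChain's splitter and not this project's code; the function name and the default sizes are assumptions chosen for the example.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap slightly."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


pieces = chunk_text("abcdefghij", chunk_size=4, overlap=1)
# pieces == ["abcd", "defg", "ghij"]
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of storing some text twice.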
- Accepts: text, collection_name, optional id, optional metadata
- Directly inserts single document entries into the vector store.
- FastAPI – Web framework for API routes
- ChromaDB – Vector store for document embeddings
- Ollama – Lightweight LLM runtime and embedding generator
- LangChain – PDF parsing and chunking
- Docker Compose – Container orchestration for easy deployment
- 🧾 Upload a PDF to /embed-pdf
- 🧠 Ask a question at /chat referencing the uploaded collection
- 💬 Get contextual responses based on your document content
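The first step of this workflow could look roughly like the sketch below, which builds a file upload for `/embed-pdf` using only the standard library. It assumes the endpoint accepts `multipart/form-data` with the form fields `file` and `collection_name`; `build_pdf_upload` is a hypothetical helper, and the byte string is a placeholder rather than a real PDF.

```python
import uuid
from urllib.request import Request


def build_pdf_upload(
    pdf_bytes: bytes,
    filename: str,
    collection_name: str,
    base_url: str = "http://localhost:8000",
) -> Request:
    """Build (but do not send) a multipart/form-data POST for /embed-pdf."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="collection_name"\r\n\r\n'
        f"{collection_name}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return Request(
        f"{base_url}/embed-pdf",
        data=head + pdf_bytes + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


# Placeholder bytes; in practice read them from a file opened in "rb" mode.
req = build_pdf_upload(b"%PDF-1.4 placeholder", "handbook.pdf", "handbook")
```

Sending `req` with `urllib.request.urlopen` against a running instance performs the upload. Questions asked afterwards should use the same `collection_name` so that the answers draw on the uploaded document.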
- Ensure documents are properly formatted and readable before upload.
- Metadata allows for future filtering and contextual refinement.
Feel free to fork and extend this repo. PRs and suggestions are welcome!