Erutalia is a high-performance, asynchronous Retrieval-Augmented Generation (RAG) platform designed for scalable document intelligence. Built on a decoupled microservices architecture, the system orchestrates multi-format document ingestion, vector embeddings via Qdrant, stateful user session management via PostgreSQL, and LLM orchestration utilizing localized Ollama clusters or cloud-native AWS/GCP LLM runtimes.
The platform is fully containerized, set up for automated Infrastructure as Code (IaC) deployment via Ansible, and integrated into a Jenkins CI/CD pipeline.
- Decoupled Multi-Format Ingestion Engine: Features an asynchronous standalone extractor optimized for processing high-throughput document types (.pdf, .docx, .pptx, .png, .jpg). Utilizes on-the-fly streaming embedding generation to minimize memory footprints during large-scale ingestion.
- Production-Grade Vector Search: Implements Qdrant Vector DB for sub-millisecond similarity searches, configured with optimized collection schemas for high-dimensional payload querying.
- Secure API Gateway: Powered by FastAPI with integrated asynchronous JWT-based Authentication Services for robust, stateless user session management and rate limiting.
- Infrastructure-as-Code & Automated Deployment: Fully provisioned via Ansible playbooks for automated Nginx reverse-proxy configuration, SSL termination, and seamless cloud deployments across AWS EC2 and GCP environments.
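To make the "stateless" part of the JWT-based authentication above concrete, here is a minimal sketch of HS256 token issuing and verification using only the Python standard library. It is not the platform's actual auth service: the function names, the `SECRET` value, and the claim names are assumptions for illustration, and a real deployment would more likely rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_encode(data: bytes) -> str:
    # JWTs use URL-safe base64 with the trailing "=" padding removed.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that was stripped during encoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature for the given claims (hypothetical helper)."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        _b64url_encode(json.dumps(obj, separators=(",", ":")).encode("utf-8"))
        for obj in (header, claims)
    ]
    signing_input = ".".join(parts)
    return signing_input + "." + _b64url_encode(_sign(signing_input, secret))


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Return the claims if the signature and expiry check out, else raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    expected = _sign(header_b64 + "." + payload_b64, secret)
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    header = json.loads(_b64url_decode(header_b64))
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if "exp" in claims and current >= claims["exp"]:
        raise ValueError("token expired")
    return claims


SECRET = b"demo-secret"  # assumption: a real secret would come from configuration
token = issue_token({"sub": "user-1", "exp": 2000}, SECRET)
claims = verify_token(token, SECRET, now=1000)
```

Because everything needed to validate the request is carried in the token, the server does not need a session lookup to authenticate a call; only the shared secret is required.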
The platform utilizes a microservices topology to isolate concerns between ingestion workloads, stateful data storage, vector indices, and LLM inference pipelines.
graph LR
A[User, Q/A] --> C
C[API Server, FastAPI] --> A
C --> D[Qdrant, VectorDB]
D --> E[Ollama LLM API, AWS]
C --> F[PostgreSQL, User Data]
G[Standalone Ingestor] --> D
F --> C
E --> C
C --> B
B[AUTH Service, JWT ] --> C
- Build all services from the root directory and start them detached and containerized:
docker compose up --build -d
- Run in interactive (attached) mode, without detaching:
docker compose up --build
- Stop and remove all containers and volumes (drops the DB!):
docker compose down -v
- For PostgreSQL, go to the directory /services/server. To build:
docker compose up --build postgres
- To run:
docker compose up -d postgres
- To build the server, go to services/server:
docker compose up --build server
Note: for future use, use SQLAlchemy 2.x in the requirements file for ingestion.
python3.12 -m app.main process
python -m app.main process --input-dir /path/to/your/documents
python -m app.main process-file /path/to/specific/document.pdf
python -m app.main show-errors
python -m app.main health
python -c "import nltk; nltk.download('punkt'); nltk.download('stopwords')"
Found 298 supported files
Supported formats: ['.pptx', '.pdf', '.png', '.docx', '.jpg', '.txt', '.jpeg']
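The "Found N supported files" summary suggests a directory scan filtered by file extension. The following is a minimal sketch of such a scan, not the ingestor's actual code: `find_supported_files` and `SUPPORTED_SUFFIXES` are hypothetical names, and the case-insensitive suffix match and recursive walk are assumptions.

```python
import tempfile
from pathlib import Path

# Same extensions as the "Supported formats" list in the ingestor output.
SUPPORTED_SUFFIXES = {".pptx", ".pdf", ".png", ".docx", ".jpg", ".txt", ".jpeg"}


def find_supported_files(input_dir):
    """Recursively collect files whose suffix is in SUPPORTED_SUFFIXES."""
    root = Path(input_dir)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_SUFFIXES
    )


# Small self-contained demo on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.pdf", "b.DOCX", "notes.md", "sub/c.jpeg"):
        path = Path(tmp) / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    found = [p.relative_to(tmp).as_posix() for p in find_supported_files(tmp)]

print(f"Found {len(found)} supported files")
```

In this demo `notes.md` is skipped because its suffix is not in the supported set.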
Downgrading to CPU only:
pip3 uninstall -y sentence-transformers torch torchvision torchaudio
If your embeddings are large and you don’t need to keep points in memory at all, you can build and upload them on the fly instead of collecting all points first.
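One way to build and upload points on the fly is to produce them with a generator and send them in fixed-size batches, so at most one batch is held in memory. The sketch below is library-agnostic and makes several assumptions: `iter_points`, `upload_streaming`, and `toy_embed` are hypothetical names, the point dictionaries are an illustrative shape, and `upload_batch` stands in for whatever call actually writes to Qdrant (for example a wrapper around the Qdrant client's upsert, after converting each dict to the client's point type).

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List

Point = Dict[str, object]  # hypothetical shape: {"id": ..., "vector": [...], "payload": {...}}


def iter_points(chunks: Iterable[str], embed: Callable[[str], List[float]]) -> Iterator[Point]:
    """Lazily yield one point per text chunk; nothing is accumulated here."""
    for idx, text in enumerate(chunks):
        yield {"id": idx, "vector": embed(text), "payload": {"text": text}}


def upload_streaming(
    points: Iterable[Point],
    upload_batch: Callable[[List[Point]], None],
    batch_size: int = 64,
) -> int:
    """Send points in batches of batch_size and return how many were sent."""
    iterator = iter(points)
    total = 0
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        upload_batch(batch)  # assumption: this performs the actual write
        total += len(batch)
    return total


def toy_embed(text: str) -> List[float]:
    # Stand-in for a real embedding model; only here to keep the demo runnable.
    return [float(len(text)), float(text.count(" "))]


sent_batches: List[List[Point]] = []
count = upload_streaming(
    iter_points(["a b", "cd", "e f g"], toy_embed),
    sent_batches.append,
    batch_size=2,
)
```

Here `sent_batches.append` plays the role of the uploader, which makes it easy to see that three chunks with `batch_size=2` are sent as one batch of two followed by one batch of one.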
curl -X GET "http://localhost:6333/collections" | jq
curl -X GET "http://localhost:6333/collections/university_documents" | jq
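The same collection check can be scripted. This is a small sketch using only the standard library; `fetch_json` and `collection_names` are hypothetical helpers, and the sample payload is hand-written to illustrate the response shape rather than captured from a running server.

```python
import json
from urllib.request import urlopen

QDRANT_URL = "http://localhost:6333"  # same host and port as the curl commands


def fetch_json(path, base_url=QDRANT_URL, timeout=5.0):
    """GET base_url + path and decode the JSON body (requires a running Qdrant)."""
    with urlopen(base_url + path, timeout=timeout) as resp:
        return json.load(resp)


def collection_names(listing):
    """Extract collection names from a GET /collections response body."""
    return [c["name"] for c in listing.get("result", {}).get("collections", [])]


# Illustrative payload only; against a live server you would call
# collection_names(fetch_json("/collections")) instead.
sample = {"result": {"collections": [{"name": "university_documents"}]}, "status": "ok"}
names = collection_names(sample)
```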
- Edit /infra/ansible/inventory.ini
- Change the IP address and user:
SERVER_IP/localhost ansible_user=release/local
- Check connectivity (ping/pong):
ansible web -i inventory.ini -m ping
- Run for prod:
ansible-playbook -i inventory.ini nginx.yml --ask-become-pass
- Run for local:
ansible-playbook -i inventory.local.ini nginx.yml --ask-become-pass
Additional commands
ansible web -i inventory.ini -m shell -a "uptime"
ansible web -i inventory.ini -m shell -a "whoami"
See what would change, without changing anything:
ansible-playbook -i inventory.ini site.yml --check
- Change the IP address and user
- Check that the nginx file is written correctly:
ansible-playbook -i inventory.local.ini nginx.yml --check --diff -K
Start the Jenkins server. If it is not installed, use this command:
docker run -d -p 8080:8080 -p 50000:50000 -v jenkins_home:/var/jenkins_home --name local-jenkins jenkins/jenkins:lts
Then set up the necessary SSH access and the pipeline repo.