DocuMind is a multi-user Retrieval-Augmented Generation (RAG) platform. It protects user documents with Google OAuth authentication, local or cloud PostgreSQL storage, and per-user isolation of ChromaDB vector indexes through metadata filtering.
- Google OAuth 2.0 Integration: Users sign in securely with Google Login, and user accounts are provisioned automatically.
- Session Security (HttpOnly Cookies): Tokens are signed on the backend (JWT) and stored securely in HttpOnly, SameSite cookies to mitigate XSS risks.
- Strict Document Isolation: PostgreSQL collections utilize metadata filters matching the authenticated database user_id. Users can never query, list, or delete another user's documents.
- Relational Data Mapping: SQLAlchemy models track Users, Documents, and Vectors in PostgreSQL (Neon serverless setup with pgvector).
- Real-time Status Polling: The frontend vault tracks document parsing and embedding status dynamically.
- Animated Dark UI: Designed with TailwindCSS v4 and Framer Motion for a modern, glassmorphic dark-theme console experience.
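
The session-security bullet above can be illustrated with a minimal sketch that signs a JWT and wraps it in an HttpOnly, SameSite cookie. It uses only the Python standard library so it runs anywhere; the backend itself may rely on a JWT library instead. The cookie name `session`, the `Lax` policy, and the helper names are assumptions made for illustration, not taken from the project.

```python
import base64
import hashlib
import hmac
import json
import time
from http.cookies import SimpleCookie


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, the encoding JWT segments use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _signature(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode("utf-8"), signing_input.encode("utf-8"), hashlib.sha256)
    return _b64url(digest.digest())


def sign_jwt(payload: dict, secret: str) -> str:
    """Build an HS256-signed token: header.payload.signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_signature(signing_input, secret)}"


def verify_jwt(token: str, secret: str):
    """Return the payload if the signature matches and `exp` is in the future, else None."""
    try:
        signing_input, signature = token.rsplit(".", 1)
        payload_b64 = signing_input.split(".")[1]
    except (ValueError, IndexError):
        return None
    expected = _signature(signing_input, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(padded))
    # Tokens without an `exp` claim are treated as expired in this sketch.
    if payload.get("exp", 0) < time.time():
        return None
    return payload


def session_cookie_header(token: str, max_age: int = 3600) -> str:
    """Build a Set-Cookie value; "session" is a hypothetical cookie name."""
    cookie = SimpleCookie()
    cookie["session"] = token
    cookie["session"]["httponly"] = True   # not readable from page JavaScript
    cookie["session"]["samesite"] = "Lax"  # assumed policy
    cookie["session"]["path"] = "/"
    cookie["session"]["max-age"] = max_age
    # A deployment served over HTTPS would normally also set the "secure" attribute.
    return cookie["session"].OutputString()
```

Because the cookie is HttpOnly, script injected into the page cannot read the token directly; the browser simply attaches it to requests, and the backend verifies the signature each time.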
flowchart TD
User([User]) --> |Uploads PDF| API_Upload[FastAPI /api/upload]
subgraph Ingestion Pipeline
API_Upload --> Extractor[Extract Text]
Extractor --> Chunker[Chunking]
Chunker --> Embedder[Embedding Model]
Extractor --> Summarizer[LLM Summary Generation]
end
Embedder --> |Insert Chunks & Embeddings| DB[(Neon PostgreSQL\npgvector)]
Summarizer --> |Insert Summary & Topics| DB
User --> |Chat Query| API_Chat[FastAPI /api/chat]
API_Chat --> Router{Intent Router}
Router -->|GREETING / SMALL_TALK| LLM_Greet[LLM Greeting Prompt]
Router -->|DOC_SUMMARY / DOC_OVERVIEW| DB_Sum[Fetch Stored Summaries\nfrom PostgreSQL]
Router -->|DOC_QUERY| DB_Vec[pgvector Similarity Search\n+ BM25 Reranking]
DB_Sum --> LLM_RAG[LLM RAG Prompt]
DB_Vec --> LLM_RAG
LLM_Greet --> Response[Streaming Response]
LLM_RAG --> Response
Response --> User
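
As a rough illustration of the chat path in the flowchart, the sketch below routes a message by intent and, for DOC_QUERY, ranks only the caller's chunks by cosine similarity. Everything here is a stand-in: the keyword-based `classify_intent` is hypothetical (the real router may well use an LLM), the in-memory `Chunk` list replaces a pgvector query filtered by user_id, BM25 reranking is omitted, and all function names are assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    user_id: int
    text: str
    embedding: list


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_chunks(chunks, user_id, query_embedding, top_k=3):
    """Rank the caller's own chunks only; other users' chunks never enter the ranking."""
    owned = [c for c in chunks if c.user_id == user_id]
    owned.sort(key=lambda c: cosine(c.embedding, query_embedding), reverse=True)
    return owned[:top_k]


def classify_intent(message: str) -> str:
    """Hypothetical keyword classifier, used here only to keep the sketch runnable."""
    text = message.lower().strip()
    if text in {"hi", "hello", "hey", "thanks"}:
        return "GREETING"
    if "summary" in text or "summarize" in text or "overview" in text:
        return "DOC_SUMMARY"
    return "DOC_QUERY"


def route(message, user_id, chunks, summaries, embed):
    """Return (intent, context) where context is the text handed to the LLM prompt."""
    intent = classify_intent(message)
    if intent == "GREETING":
        return intent, []  # greeting prompt needs no document context
    if intent == "DOC_SUMMARY":
        return intent, summaries.get(user_id, [])  # stored summaries for this user
    hits = search_chunks(chunks, user_id, embed(message))
    return intent, [c.text for c in hits]
```

Filtering by `user_id` before ranking, rather than after, is the important design choice: a chunk belonging to another user can never appear in the context, no matter how similar it is to the query.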
DocuMind/
├── backend/
│ ├── alembic/ # Database migrations
│ ├── chroma_db/ # Persistent Chroma database storage
│ ├── src/
│ │ ├── auth/ # Auth package (Google verify, JWT, dependencies)
│ │ ├── database/ # DB package (SQLAlchemy connections and models)
│ │ ├── data_loader.py # Document parser (PDF, TXT, CSV, DOCX, JSON, Excel)
│ │ ├── embedding.py # Text splitter & Embedding pipe
│ │ ├── search.py # RAG retrieval & Groq LLM logic
│ │ └── vectorstore.py # Chroma Store client with user isolation
│ ├── Dockerfile
│ ├── main.py # FastAPI application router
│ └── requirements.txt # Backend python dependencies
├── frontend/
│ ├── src/
│ │ ├── app/
│ │ │ ├── dashboard/ # Protected dashboard workspaces
│ │ │ ├── login/ # Google sign-in landing card
│ │ │ ├── globals.css # Styling sheets with TailwindCSS import
│ │ │ ├── layout.tsx # Root html layout shell
│ │ │ └── page.tsx # Auto-routing landing page
│ │ └── middleware.ts # Server-side auth route guard middleware
│ ├── package.json
│ └── tsconfig.json
├── docker-compose.yml # FastAPI + local PostgreSQL config
└── README.md
- Node.js
- Python (3.10+)
- Google Cloud Console client credentials (configured origin )
- Groq API Key (for LLM RAG inference)
Create .env in the backend/ directory:
GROQ_API_KEY="your-groq-api-key"
DATABASE_URL="postgresql://neondb_owner:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.tech/neondb?sslmode=require"
GOOGLE_CLIENT_ID="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.apps.googleusercontent.com"
JWT_SECRET="rag-for-docs-super-secret-key-change-this-in-production"
Create .env in the frontend/ directory:
NEXT_PUBLIC_BACKEND_URL="http://localhost:8000"
- Open a terminal in the backend/ folder:
    cd backend
    # Ensure virtual environment is ready
    uv venv
    # Activate it (Windows)
    .venv\Scripts\activate
    # Install dependencies
    uv pip install -r requirements.txt
- Apply database schemas to PostgreSQL:
    python -m alembic upgrade head
- Run the FastAPI development server:
    uvicorn main:app --reload
  The backend API will run on http://localhost:8000.
- Open another terminal in the frontend/ folder:
    cd frontend
    npm install
    npm run dev
  The client application will run on http://localhost:3000.
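
A missing or sample-valued backend variable is an easy mistake during setup. The following is a minimal sketch of a pre-flight check for the four backend variables listed in the .env step. It assumes the values are already loaded into the process environment (it does not parse the .env file itself), and the function name and messages are illustrative rather than part of the project.

```python
import os

REQUIRED_BACKEND_VARS = ("GROQ_API_KEY", "DATABASE_URL", "GOOGLE_CLIENT_ID", "JWT_SECRET")
# The sample secret shown in the .env step; it should not reach production.
PLACEHOLDER_SECRET = "rag-for-docs-super-secret-key-change-this-in-production"


def check_backend_env(env=None):
    """Return a list of human-readable problems; an empty list means the config looks usable."""
    env = os.environ if env is None else env
    problems = [f"{name} is not set" for name in REQUIRED_BACKEND_VARS if not env.get(name)]
    if env.get("JWT_SECRET") == PLACEHOLDER_SECRET:
        problems.append("JWT_SECRET still uses the sample value")
    db_url = env.get("DATABASE_URL", "")
    # Accept plain and driver-qualified SQLAlchemy-style PostgreSQL URLs.
    if db_url and not db_url.startswith(("postgresql://", "postgresql+")):
        problems.append("DATABASE_URL does not look like a PostgreSQL URL")
    return problems
```

Calling `check_backend_env()` before starting uvicorn and printing whatever it returns gives a quick signal without contacting Google, Groq, or the database.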
To deploy the stack locally with a local PostgreSQL server, run the following from the root workspace directory:
    docker-compose up --build
This launches a PostgreSQL container mapped to port 5432 and builds and starts the FastAPI backend container listening on port 8000.
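
To confirm that both containers are reachable after the compose command, a small TCP probe is often enough. This is a sketch under the assumption that the services are published on localhost at the ports named above; `wait_for_port` and `report` are hypothetical helpers, and a successful connection only shows that the port is open, not that the service is healthy.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Retry a TCP connection until it succeeds or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


def report(host: str = "localhost", timeout: float = 30.0) -> dict:
    """Probe the two ports mentioned above and return {service name: reachable}."""
    services = {"PostgreSQL": 5432, "FastAPI backend": 8000}
    return {name: wait_for_port(host, port, timeout) for name, port in services.items()}
```

Running `report()` from a separate terminal once the containers have started returns a dictionary such as one entry per service, with `True` for each port that accepted a connection.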