AI-powered research workspace for intelligent PDF analysis, semantic search, document understanding, and conversational insight extraction.
DocuBOT is an intelligent document assistant designed to help users interact with PDFs through natural language conversations.
Users can upload research papers, reports, notes, documentation, or study material and instantly extract insights through conversational queries.
The application combines document processing, semantic retrieval, embeddings, and Retrieval Augmented Generation (RAG) workflows to provide context-aware responses.
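As a rough illustration of what "context-aware" means in a RAG workflow, the sketch below shows one way retrieved passages could be placed in front of the user's question before the prompt is sent to a language model. The function name `build_prompt` and the prompt wording are assumptions made for this example and are not taken from the DocuBOT codebase.

```python
def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine retrieved chunks and the user's question into one prompt string.

    Hypothetical helper: DocuBOT's actual prompt template may differ.
    """
    # Number each chunk so the model (and the reader) can tell passages apart.
    numbered = "\n\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(context_chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question.strip()}\nAnswer:"
    )
```

The returned string would then be passed to whichever LLM client the backend uses. Restricting the model to the supplied context is a common way to keep answers tied to the uploaded document.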
✅ Upload PDF documents
✅ Intelligent document parsing
✅ Conversational document interaction
✅ Semantic search using embeddings
✅ Context-aware responses
✅ Research workspace interface
✅ Retrieval Augmented Generation (RAG)
✅ Real-time question answering
✅ Insight extraction from PDFs
✅ Clean modern UI
PDF Upload
│
▼
Document Processing
│
▼
Chunk Creation
│
▼
Embedding Generation
│
▼
Vector Storage
│
▼
Semantic Retrieval
│
▼
Context Extraction
│
▼
LLM Response Generation
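The first stages of the flow above (chunk creation through semantic retrieval) can be sketched in a few lines of standard-library Python. This is a toy under stated assumptions: `embed` uses word counts as a stand-in for a real embedding model, `VectorStore` is an in-memory list standing in for a vector database, and all names here are hypothetical rather than taken from the DocuBOT source.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Chunk Creation: split text into overlapping windows of words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Embedding Generation (toy): bag-of-words counts, not a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """Vector Storage (toy): keeps (chunk, vector) pairs in a Python list."""

    def __init__(self) -> None:
        self.items: list[tuple[str, Counter]] = []

    def add(self, chunks: list[str]) -> None:
        for chunk in chunks:
            self.items.append((chunk, embed(chunk)))

    def search(self, query: str, k: int = 2) -> list[str]:
        """Semantic Retrieval: return the k chunks most similar to the query."""
        query_vec = embed(query)
        ranked = sorted(
            self.items, key=lambda item: cosine(query_vec, item[1]), reverse=True
        )
        return [chunk for chunk, _ in ranked[:k]]
```

In a real deployment the word-count vectors would be replaced by embeddings from a model, and the list scan by an index in a vector database. The overlap between chunks is there so that a sentence split across a chunk boundary still appears whole in at least one chunk.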
- React.js
- Tailwind CSS
- JavaScript
- Python
- FastAPI
- REST APIs
- LangChain
- OpenAI API
- Embeddings
- Vector Database
- RAG Pipeline
- PDF Parsing
- Text Chunking
- Similarity Search
- Semantic Retrieval
DocuBOT/
│── frontend/
│── backend/
│── docs/
│── embeddings/
│── docubot-preview.png
│── requirements.txt
│── package.json
│── README.md

Clone the repository:
git clone https://github.com/Krish02789/DocuBot.git
Move into folder:
cd DocuBot
Install dependencies:
pip install -r requirements.txt
Create environment variables:
OPENAI_API_KEY=your_api_key
Run backend:
python app.py
OR
uvicorn main:app --reload
Install dependencies:
npm install
Start application:
npm start

📚 Research Assistant
📄 Resume Analysis
🧾 PDF Question Answering
📑 Documentation Assistant
🎓 Study Companion
🏢 Internal Knowledge Base
📋 Report Analysis
📈 Insight Extraction
- Multi-document support
- Memory-based chat
- Citation generation
- OCR integration
- Image understanding
- Voice interaction
- Local LLM support
- Document summarization
- Exportable responses
- Multi-user workspace
- Conversational document interaction
- Retrieval Augmented Generation
- Embedding-based search
- Semantic retrieval system
- Context-aware AI responses
- Production-ready architecture
- Modern research workspace UI
Krish Batra
📍 Rohtak, Haryana, India
🔗 LinkedIn
https://www.linkedin.com/in/krishbatra/
🐙 GitHub
https://github.com/Krish02789
📧 Email
batra.krishh@gmail.com
🌐 Portfolio
https://krish02789.github.io/Portfolio/
If you found this project useful:
⭐ Star this repository
🍴 Fork the project
🚀 Contribute improvements
MIT License
