Krish02789/DocuBot

🤖 DocuBOT

AI-powered research workspace for intelligent PDF analysis, semantic search, document understanding, and conversational insights extraction.

Python React FastAPI LangChain RAG OpenAI


📌 About The Project

DocuBOT is an intelligent document assistant designed to help users interact with PDFs through natural language conversations.

Users can upload research papers, reports, notes, documentation, or study material and instantly extract insights through conversational queries.

The application combines document processing, embeddings, semantic retrieval, and Retrieval-Augmented Generation (RAG) to answer questions using context drawn from the uploaded document.


🚀 Features

✅ Upload PDF documents

✅ Intelligent document parsing

✅ Conversational document interaction

✅ Semantic search using embeddings

✅ Context-aware responses

✅ Research workspace interface

✅ Retrieval Augmented Generation (RAG)

✅ Real-time question answering

✅ Insight extraction from PDFs

✅ Clean modern UI


🧠 Workflow

PDF Upload
     │
     ▼
Document Processing
     │
     ▼
Chunk Creation
     │
     ▼
Embedding Generation
     │
     ▼
Vector Storage
     │
     ▼
Semantic Retrieval
     │
     ▼
Context Extraction
     │
     ▼
LLM Response Generation
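
The stages above can be sketched end to end in plain Python. This is only an illustration under stated assumptions, not the project's actual code: the function names (split_into_chunks, embed, build_store, retrieve, build_prompt) are hypothetical, the "embedding" is a toy word-count vector standing in for a real embedding model, the "vector storage" is an in-memory list, and the final LLM call is left out so the example stops at the prompt that would be sent.

```python
import math
import re
from collections import Counter


def split_into_chunks(text: str) -> list[str]:
    """Chunk creation: one chunk per sentence (a simplification)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_store(text: str) -> list[tuple[str, Counter]]:
    """Vector storage: keep each chunk next to its embedding (in memory)."""
    return [(chunk, embed(chunk)) for chunk in split_into_chunks(text)]


def retrieve(store: list[tuple[str, Counter]], question: str, k: int = 2) -> list[str]:
    """Semantic retrieval: return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, context: list[str]) -> str:
    """Context extraction: pack retrieved chunks into the prompt for the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


# Stand-in for text extracted from an uploaded PDF.
document = (
    "Transformers use self-attention to model distant dependencies. "
    "The dataset contains 10,000 labelled PDF pages. "
    "Training took three days on a single GPU."
)
question = "How long did training take?"

store = build_store(document)
top_chunks = retrieve(store, question, k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)  # this prompt would then be sent to the LLM
```

Swapping embed for a real embedding model and the list for a vector database changes the quality of retrieval but not the shape of the flow.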

🛠 Tech Stack

Frontend

  • React.js
  • Tailwind CSS
  • JavaScript

Backend

  • Python
  • FastAPI
  • REST APIs

AI Stack

  • LangChain
  • OpenAI API
  • Embeddings
  • Vector Database
  • RAG Pipeline

Utilities

  • PDF Parsing
  • Text Chunking
  • Similarity Search
  • Semantic Retrieval
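
Text chunking is commonly done with a fixed window and some overlap, so that a sentence cut at a chunk boundary still appears whole in a neighbouring chunk. The sketch below assumes character-based windows; the function name chunk_text and the default sizes are hypothetical and not taken from this repository.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


example_chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2)
print(example_chunks)
```

Larger overlap preserves more context across boundaries at the cost of storing and embedding more chunks.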

📂 Project Structure

DocuBot/
├── frontend/
├── backend/
├── docs/
├── embeddings/
├── docubot-preview.png
├── requirements.txt
├── package.json
└── README.md

⚙ Installation

Clone Repository

git clone https://github.com/Krish02789/DocuBot.git

Move into the project folder:

cd DocuBot

Backend Setup

Install dependencies:

pip install -r requirements.txt

Set the following environment variable:

OPENAI_API_KEY=your_api_key
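
On the Python side, a backend would typically read this variable at startup and fail early if it is missing. The helper below is a minimal sketch of that check, not code from this repository: the name get_openai_api_key and its optional env parameter (used here only to make the function easy to test) are assumptions.

```python
import os


def get_openai_api_key(env=None) -> str:
    """Return OPENAI_API_KEY from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set. Export it in your shell before starting the backend."
        )
    return key


# Example with an explicit mapping instead of the real environment.
demo_key = get_openai_api_key({"OPENAI_API_KEY": " sk-example "})
print(len(demo_key))
```

Failing at startup gives a clearer message than an authentication error on the first question.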

Run the backend:

python app.py

Or serve it with Uvicorn and auto-reload (this expects an app object defined in main.py):

uvicorn main:app --reload

Frontend Setup

Install dependencies:

npm install

Start application:

npm start

📸 Project Preview

[Screenshot: docubot-preview.png]


🎯 Use Cases

📚 Research Assistant

📄 Resume Analysis

🧾 PDF Question Answering

📑 Documentation Assistant

🎓 Study Companion

🏢 Internal Knowledge Base

📋 Report Analysis

📈 Insight Extraction


🌟 Future Improvements

  • Multi-document support
  • Memory-based chat
  • Citation generation
  • OCR integration
  • Image understanding
  • Voice interaction
  • Local LLM support
  • Document summarization
  • Exportable responses
  • Multi-user workspace

📈 Highlights

  • Conversational document interaction
  • Retrieval Augmented Generation
  • Embedding-based search
  • Semantic retrieval system
  • Context-aware AI responses
  • Production-ready architecture
  • Modern research workspace UI

👨‍💻 Author

Krish Batra

📍 Rohtak, Haryana, India

🔗 LinkedIn
https://www.linkedin.com/in/krishbatra/

🐙 GitHub
https://github.com/Krish02789

📧 Email
batra.krishh@gmail.com

🌐 Portfolio
https://krish02789.github.io/Portfolio/


⭐ Support

If you found this project useful:

⭐ Star this repository

🍴 Fork the project

🚀 Contribute improvements


📜 License

MIT License
