An advanced chatbot for preliminary medical diagnosis, built with the open-source Mistral LLM, LangChain, and Pinecone. It supports:
- Symptom-based queries
- Analysis of user-provided test results
- Probability-based disease predictions

The backend is built with Flask, which keeps the deployment lightweight and responsive.

- 🔍 Retrieval-Augmented Generation (RAG) powered by the open-source Mistral LLM
- ⚕️ Ingests and embeds medical knowledge from the Gale Encyclopedia of Medicine using Hugging Face models
- 📦 Stores vector embeddings in Pinecone for fast similarity-based retrieval
- 🧠 Context-aware responses generated by combining the user query with relevant medical context
- 🌐 Simple and clean frontend built with HTML/CSS
- 🐳 Fully containerized using Docker
- 🔁 CI/CD pipeline for automated testing and deployment

| Layer | Tools/Frameworks Used |
|---|---|
| Frontend | HTML, CSS |
| Backend | Python, Flask |
| RAG Engine | LangChain, Pinecone, Mistral LLM (via Hugging Face) |
| Vector DB | Pinecone (production alternative to FAISS) |
| Container | Docker |
| Deployment | CI/CD pipeline |

📘 Embedding Medical Data
- Medical content is embedded into vector representations using Hugging Face transformers.
- These embeddings are stored in Pinecone for fast retrieval.
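
The two steps above can be sketched in plain Python. This is a minimal illustration, not the project's code: `chunk_text`, `toy_embed`, and `build_records` are hypothetical names, the hashing embedder is a stand-in for a Hugging Face embedding model, and the chunk size and overlap are assumed values. The records use the `id` / `values` / `metadata` layout that Pinecone upserts accept, but nothing here talks to Pinecone.

```python
import hashlib
import math


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character windows that overlap by `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop early enough that the last window is not made of overlap only.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


def toy_embed(text: str, dim: int = 64) -> list[float]:
    """Stand-in embedder (assumption): hash tokens into buckets, then L2-normalize.

    A real pipeline would call a Hugging Face embedding model here instead.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def build_records(text: str, source: str) -> list[dict]:
    """Turn raw text into id/values/metadata records ready for a vector store."""
    return [
        {
            "id": f"{source}-{n}",
            "values": toy_embed(chunk),
            "metadata": {"text": chunk, "source": source},
        }
        for n, chunk in enumerate(chunk_text(text))
    ]


# Placeholder text; the real input would be the encyclopedia content.
records = build_records("Fever and cough may indicate influenza. " * 40, "demo")
print(len(records), "records,", len(records[0]["values"]), "dimensions each")
```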

🔎 Query Processing
- A user submits a query about symptoms or medical test results.
- LangChain retrieves the most relevant context from Pinecone by vector similarity.
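
To make "vector similarity" concrete, here is a small in-memory sketch of top-k retrieval by cosine similarity. In the actual stack Pinecone performs this search server-side; `cosine_similarity`, `top_k`, the two-dimensional vectors, and the passage texts below are illustrative assumptions only.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], records: list[dict], k: int = 3) -> list[tuple[float, dict]]:
    """Return the k records most similar to the query as (score, record) pairs."""
    scored = [(cosine_similarity(query_vec, r["values"]), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical records; real vectors would come from the embedding model.
records = [
    {"id": "a", "values": [1.0, 0.0], "metadata": {"text": "passage about fever"}},
    {"id": "b", "values": [0.0, 1.0], "metadata": {"text": "passage about fractures"}},
    {"id": "c", "values": [0.7, 0.7], "metadata": {"text": "passage about fever after injury"}},
]

for score, record in top_k([1.0, 0.0], records, k=2):
    print(f"{record['id']}: {score:.3f} {record['metadata']['text']}")
```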

🧾 LLM-Based Answering
- Mistral LLM uses retrieved documents and user input to generate medically coherent answers.
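
One way to combine retrieved documents with the user's question is a simple prompt template. The sketch below is an assumption about how such a prompt could look, not the project's actual template: `build_prompt`, the instruction wording, and the `max_chars` budget are all hypothetical. The resulting string would then be passed to the Mistral model.

```python
def build_prompt(question: str, passages: list[str], max_chars: int = 2000) -> str:
    """Assemble a RAG prompt from retrieved passages and the user's question."""
    context_parts = []
    used = 0
    for n, passage in enumerate(passages, start=1):
        entry = f"[{n}] {passage.strip()}"
        # Keep the context within a rough character budget (assumed limit).
        if used + len(entry) > max_chars:
            break
        context_parts.append(entry)
        used += len(entry)
    context = "\n".join(context_parts) if context_parts else "(no context retrieved)"
    return (
        "Use only the context below to answer the question. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


# Placeholder passages standing in for documents returned by the retriever.
prompt = build_prompt(
    "What could cause a persistent fever?",
    ["passage one from the knowledge base", "passage two from the knowledge base"],
)
print(prompt)
```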

🔁 Session Handling
- Conversation state is kept across turns using LangChain's memory components.
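
The idea behind session memory can be illustrated with a small windowed buffer that keeps the last few exchanges per session. LangChain ships its own memory classes; the hand-rolled `SessionMemory` class below, its method names, and the `max_turns` default are hypothetical and only show the concept.

```python
from collections import defaultdict, deque


class SessionMemory:
    """Keep the most recent `max_turns` exchanges for each session id."""

    def __init__(self, max_turns: int = 5):
        # Each session gets a bounded deque, so old turns are dropped automatically.
        self._turns = defaultdict(lambda: deque(maxlen=max_turns))

    def add_turn(self, session_id: str, user_msg: str, bot_msg: str) -> None:
        self._turns[session_id].append((user_msg, bot_msg))

    def history(self, session_id: str) -> str:
        """Render the stored turns as text that can be prepended to a prompt."""
        # .get avoids creating an empty entry for an unknown session.
        turns = self._turns.get(session_id, ())
        return "\n".join(f"User: {u}\nAssistant: {b}" for u, b in turns)


memory = SessionMemory(max_turns=2)
memory.add_turn("s1", "I have a headache.", "How long has it lasted?")
memory.add_turn("s1", "Two days.", "Any fever or nausea?")
memory.add_turn("s1", "A mild fever.", "Noted. Any recent test results?")
print(memory.history("s1"))
```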

This project is licensed under the MIT License.

Kreesh Modi | IIT Kharagpur, Mechanical Engineering

Email: kreeshmodi2018@gmail.com