In real-world meetings, important discussions are frequently unrecorded, action items get missed, and manual note-taking proves inefficient. Crucially, visually impaired users often cannot access meeting outputs easily.
IntelliMeet is a real-time AI pipeline system designed to bridge these gaps. It captures live speech, converts it into text, analyzes the content using large language models (LLMs) to extract summaries and actionable items, and provides accessible outputs in both Voice (Text-to-Speech) and Braille formats.
It seamlessly processes:
Speech → Text → Intelligence → Structured Output → Accessibility
- Automatically capture live speech and convert it into text.
- Analyze meeting content using advanced AI and Natural Language Processing.
- Extract structured summaries and assignable action items.
- Provide accessible output formats through Voice and Braille translation.
- Store meeting history persistently for future retrieval and tracking.
The pipeline flows seamlessly from input to storage:
Microphone
↓
Speech Recognition
↓
Transcript
↓
LLM Processing
↓
Summary + Action Items
↓
Accessibility Layer (Voice + Braille)
↓
Storage (JSON)
The system is divided into five main modules:
- Speech-to-Text Module: Captures live audio and converts speech into a meeting transcript.
- Meeting Analyzer Module: Processes the transcript via an LLM to generate a concise summary and extract action items.
- Voice Output Module: Converts the textual output into speech, reading it aloud for users.
- Braille Converter Module: Translates the text into Braille Unicode for tactile accessibility display.
- Storage Module: Saves the finalized meeting results (transcript, summary, action items) as a JSON file.
- Capture live speech.
- Convert speech to text.
- Send the text into the configured LLM API.
- Extract meeting summary and action items.
- Convert the final output to voice (TTS).
- Convert the output to Braille characters.
- Save the results to local storage.
- Language: Python
- Key Libraries:
SpeechRecognitionOpenAI/GroqAPIpyttsx3Streamlit(for the web interface)
- Tools: VS Code, Git/GitHub, Docker (for deployment logistics)
- Minimum: Laptop (8GB RAM recommended), Built-in Microphone, Internet connection (for the LLM API).
- Optional (Recommended): External microphone for improved speech recognition accuracy.
Sample Input:
"We will deploy the model using Docker. Rahul will prepare the dataset."
Sample Output - Generated Details:
- Summary: Discussion about deploying the ML model using Docker.
- Action Items:
- Rahul → Prepare dataset
- Satish → Deploy Docker container
Accessibility Output:
- Voice Output: The system reads aloud: "Meeting summary ready."
- Braille Output:
⠍⠑⠑⠞⠊⠝⠛ ⠎⠥⠍⠍⠁⠗⠽
- ⏱️ Saves Time & Manual Effort: Automates the traditionally tedious task of note-taking.
- 📈 Improves Productivity: Keeps track of critical decisions and action items seamlessly.
- 🤝 Supports Accessibility: Empowers visually impaired individuals through tailored outputs.
- 🔍 Structured Insights: Converts raw, messy conversations into organized deliverables.
- Highly dependent on speech clarity and input audio quality.
- Requires an active internet connection for LLM processing.
- Accuracy may be limited in noisy or crowded environments.
- Speaker identification (Diarization).
- Real-time streaming optimizations.
- Cloud deployment and multi-device synchronization.
- Advanced semantic meeting search system.
- Multi-language support.
- Corporate boardrooms and team meetings.
- Educational classrooms and lectures.
- Remote/Online meetings (Zoom/Teams).
- Accessibility tools for assistive technology platforms.
- Enterprise productivity software.
IntelliMeet demonstrates how AI can be utilized to automate meeting understanding and ensure accessibility for all users by combining speech processing, NLP, and assistive technology.