X-McKay/nemo

NeMo Live Speaker Diarization & Transcription

Real-time meeting transcription with fast speaker identification using NVIDIA NeMo.

🚀 Features

  • Fast Speaker Identification: 1-2 second latency (2-3x faster than traditional approaches)
  • Hybrid Diarization: Combines fast TitaNet embeddings with accurate MSDD refinement
  • Self-Correcting Labels: Automatically improves speaker assignments over time
  • Live Transcription: Real-time speech-to-text with Canary ASR
  • GPU-Accelerated: Optimized for NVIDIA GPUs

🎯 Quick Start

Installation

# Install dependencies
pip install -r requirements.txt

# Or using uv (recommended)
uv pip install -r requirements.txt

Run Live Transcription (GPU Required)

# Use default settings
python main.py

# Or specify GPU device
python main.py --device cuda:0

Example Output:

[001.0–003.0] (Speaker_0): hello everyone
[003.0–005.0] (Speaker_0): welcome to the meeting
[005.5–007.5] (Speaker_1): thanks for having me
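
If you want to post-process the transcript, the line format shown above is simple to parse. The following is a minimal sketch, not part of the repository: `TranscriptLine`, `parse_line`, and `talk_time` are hypothetical names, and it assumes the bracketed numbers are start and end times in seconds and that every line follows the format in the example exactly.

```python
import re
from typing import Dict, Iterable, NamedTuple, Optional


class TranscriptLine(NamedTuple):
    start: float   # assumed to be seconds from the start of the session
    end: float
    speaker: str
    text: str


# Accepts the en dash used in the example output, plus a plain hyphen as a fallback.
LINE_RE = re.compile(
    r"^\[(?P<start>\d+(?:\.\d+)?)[–-](?P<end>\d+(?:\.\d+)?)\]\s+"
    r"\((?P<speaker>[^)]+)\):\s*(?P<text>.*)$"
)


def parse_line(line: str) -> Optional[TranscriptLine]:
    """Parse one transcript line; return None if it does not match the format."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return TranscriptLine(
        float(match["start"]), float(match["end"]), match["speaker"], match["text"]
    )


def talk_time(lines: Iterable[str]) -> Dict[str, float]:
    """Sum the duration of all segments per speaker, skipping unparseable lines."""
    totals: Dict[str, float] = {}
    for raw in lines:
        parsed = parse_line(raw)
        if parsed is None:
            continue
        totals[parsed.speaker] = totals.get(parsed.speaker, 0.0) + (parsed.end - parsed.start)
    return totals
```

Because labels may be corrected after they are first printed, totals computed from a live stream should be treated as provisional.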

⚙️ Configuration

Key settings in the Config class:

# Fast diarization
embed_window_seconds = 1.5       # Embedding window
embed_hop_seconds = 0.75         # Embedding frequency
speaker_similarity_threshold = 0.60    # New speaker threshold

See inline comments in main.py for full details.
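
To see how the window and hop settings interact, the sketch below lists the time spans that overlapping embedding windows would cover. It is an illustration only: `embedding_windows` is a hypothetical helper, and it assumes windows start at time zero and that only full windows are embedded. How main.py actually buffers audio may differ.

```python
from typing import List, Tuple


def embedding_windows(
    duration_s: float, window_s: float = 1.5, hop_s: float = 0.75
) -> List[Tuple[float, float]]:
    """Return (start, end) times of every full window that fits in duration_s."""
    if window_s <= 0 or hop_s <= 0:
        raise ValueError("window_s and hop_s must be positive")
    windows = []
    k = 0
    # Small tolerance so a window ending exactly at duration_s is kept
    # despite floating-point rounding.
    while k * hop_s + window_s <= duration_s + 1e-9:
        start = k * hop_s
        windows.append((start, start + window_s))
        k += 1
    return windows


if __name__ == "__main__":
    for start, end in embedding_windows(3.0):
        print(f"{start:.2f}s to {end:.2f}s")
```

With the defaults above, consecutive windows overlap by 50%, the first full window is not available until 1.5 seconds of audio have arrived, and a new window completes every 0.75 seconds after that.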

📊 Performance

| Metric                       | Value          |
|------------------------------|----------------|
| Time to first speaker label  | 1-2 seconds ⚡ |
| Time to new speaker label    | 1-2 seconds ⚡ |
| Initial accuracy (fast path) | 80-85%         |
| Final accuracy (after MSDD)  | 90-95%         |
| GPU memory required          | 8-10 GB        |

📁 Project Structure

nemo/
├── main.py                          # Enhanced live diarization
├── cli.py                           # CLI version with file output
├── requirements.txt                 # Python dependencies
├── README.md                        # This file
└── summary.md                       # Optimization recommendations

🐛 Troubleshooting

  • CUDA out of memory: Reduce batch size to 32
  • Too many speakers: Increase threshold to 0.65
  • Speakers merged: Lower threshold to 0.55

✅ Quick Test

python test_fast_diarization.py  # Test components
python main.py                    # Go live!
