Supports two transcription backends:
- Local Whisper (default) — fast, private, no API key needed. Uses faster-whisper (CTranslate2 backend) for 2-4x faster inference than stock Whisper, with GPU acceleration.
- LLM via OpenRouter — uses any audio-capable model on OpenRouter (defaults to Gemini 2.0 Flash).

How it works:

- `hotmic_toggle.sh` starts/stops dictation via a single keybinding
- `sox` records continuously from your microphone (no audio gaps)
- Each chunk is transcribed by the configured backend (local Whisper or OpenRouter LLM)
- The result is typed into the focused window via `xdotool`
- A pulsing red "REC" overlay badge shows while recording

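
A minimal sketch of the typing step, assuming the transcript is handed to `xdotool type`. The helper names `build_type_command` and `type_text` and the 12 ms keystroke delay are illustrative assumptions, not taken from the hotmic scripts, which may call `xdotool` with different options.

```python
import shutil
import subprocess


def build_type_command(text: str, delay_ms: int = 12) -> list[str]:
    """Build an xdotool command that types `text` into the focused window.

    --clearmodifiers releases held modifier keys (e.g. the hotkey) while typing;
    "--" stops option parsing so text starting with "-" is typed literally.
    """
    return ["xdotool", "type", "--clearmodifiers",
            "--delay", str(delay_ms), "--", text]


def type_text(text: str) -> bool:
    """Type `text` if there is something to type and xdotool is available."""
    if not text.strip() or shutil.which("xdotool") is None:
        return False
    subprocess.run(build_type_command(text), check=True)
    return True
```

Passing the text as a single list element avoids shell quoting problems with punctuation in the transcript.
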
Audio is streamed continuously via a pipe to the worker process — no gaps between chunks. The worker splits on speech pauses or every 20s, and transcribes in a background thread while audio keeps flowing. The Whisper model stays loaded to avoid reload latency.
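
The chunking described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `hotmic_whisper_worker.py`: it assumes 16 kHz, 16-bit mono PCM delivered in 0.1 s frames, uses a simple RMS test for silence, and all names (`is_silent`, `split_chunks`, `transcribe_in_background`) are hypothetical.

```python
import math
import queue
import struct
import threading

FRAME_SEC = 0.1        # assumed frame length (1600 samples at 16 kHz)
SILENCE_THRESH = 0.03  # mirrors the 3% threshold
SILENCE_DUR = 0.8      # seconds of silence that end a chunk
MAX_CHUNK_SEC = 20     # hard cap per chunk


def is_silent(frame: bytes, thresh: float = SILENCE_THRESH) -> bool:
    """True if the frame's RMS level is below `thresh` of full scale."""
    count = len(frame) // 2
    if count == 0:
        return True
    samples = struct.unpack(f"<{count}h", frame[: count * 2])
    rms = math.sqrt(sum(s * s for s in samples) / count)
    return rms / 32768.0 < thresh


def split_chunks(frames, frame_sec: float = FRAME_SEC):
    """Yield byte chunks, each ended by a pause or by the hard cap."""
    silence_frames = round(SILENCE_DUR / frame_sec)
    max_frames = round(MAX_CHUNK_SEC / frame_sec)
    buf, silent_run, has_speech = [], 0, False
    for frame in frames:
        silent = is_silent(frame)
        if silent and not has_speech:
            continue  # drop silence before any speech has started
        buf.append(frame)
        if silent:
            silent_run += 1
        else:
            has_speech, silent_run = True, 0
        if silent_run >= silence_frames or len(buf) >= max_frames:
            yield b"".join(buf)
            buf, silent_run, has_speech = [], 0, False
    if buf:
        yield b"".join(buf)  # flush whatever is left when the stream ends


def transcribe_in_background(chunks, transcribe):
    """Run `transcribe(chunk)` on a worker thread while chunks keep arriving."""
    pending = queue.Queue()
    results = []

    def worker():
        while (chunk := pending.get()) is not None:
            results.append(transcribe(chunk))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    for chunk in chunks:
        pending.put(chunk)
    pending.put(None)  # sentinel: no more audio
    thread.join()
    return results
```

Counting frames rather than summing seconds avoids floating-point drift when comparing against the silence duration. In a real worker, `transcribe` would call the already-loaded model so that no chunk pays the model load cost.
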
- Linux with X11
- `sox` (audio recording and silence detection)
- `xdotool`
- `python3` with PyGObject and cairo (for the recording indicator)

For Whisper backend (default):
- `python3` with `faster-whisper` installed (`pip install faster-whisper nvidia-cublas-cu12 nvidia-cudnn-cu12`)
- CUDA GPU recommended (falls back to CPU)
- For GPU: NVIDIA CUDA libraries (`pip install nvidia-cublas-cu12 nvidia-cudnn-cu12`)

For LLM backend:
- `curl`
- `jq`
- An OpenRouter API key


Debian/Ubuntu:

    sudo apt install sox xdotool python3-gi python3-gi-cairo gir1.2-gtk-3.0
    # For whisper backend:
    pip install faster-whisper nvidia-cublas-cu12 nvidia-cudnn-cu12
    # For LLM backend:
    sudo apt install curl jq

Arch Linux:

    sudo pacman -S sox xdotool python-gobject python-cairo
    # For whisper backend:
    pip install faster-whisper nvidia-cublas-cu12 nvidia-cudnn-cu12
    # For LLM backend:
    sudo pacman -S curl jq

Clone the repository and make the scripts executable:

    git clone https://github.com/bmilleare/hotmic.git
    cd hotmic
    chmod +x hotmic_toggle.sh hotmic_start.sh hotmic_stop.sh hotmic_indicator.py hotmic_whisper_worker.py

Whisper (default): no configuration needed. Just ensure `faster-whisper` is installed.

LLM: set `HOTMIC_BACKEND=llm` and provide your OpenRouter API key:

    echo 'OPENROUTER_API_KEY="sk-or-v1-your-key-here"' > /path/to/hotmic/.env

Alternatively, create `~/.config/hotmic/env` with the same content, or export the variable in your shell profile. The script checks these locations in order:

1. Environment variable (already set)
2. `.env` file next to the script
3. `~/.config/hotmic/env`

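
The lookup order above can be illustrated with a short sketch. The actual logic lives in the shell scripts; the Python below only mirrors the documented order, and `parse_env_file` and `find_api_key` are hypothetical names. It assumes the env files contain simple `NAME="value"` lines.

```python
import os
from pathlib import Path

KEY_NAME = "OPENROUTER_API_KEY"


def parse_env_file(path: Path, name: str = KEY_NAME):
    """Return the value of a NAME=value line in `path`, or None if absent."""
    if not path.is_file():
        return None
    for line in path.read_text().splitlines():
        line = line.strip()
        if line.startswith("export "):
            line = line[len("export "):]
        key, sep, value = line.partition("=")
        if sep and key.strip() == name:
            return value.strip().strip("'\"")  # drop surrounding quotes
    return None


def find_api_key(script_dir: Path, environ=None, home=None):
    """Check the environment, then <script_dir>/.env, then ~/.config/hotmic/env."""
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)
    if environ.get(KEY_NAME):
        return environ[KEY_NAME]
    for candidate in (script_dir / ".env", home / ".config" / "hotmic" / "env"):
        value = parse_env_file(candidate)
        if value:
            return value
    return None
```

The first location that yields a non-empty value wins, so an exported variable always overrides either file.
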
Bind `hotmic_toggle.sh` to a keyboard shortcut in your desktop environment's settings. For example, in GNOME:

    Settings > Keyboard > Custom Shortcuts > Add:
      Name: Dictation
      Command: /path/to/hotmic/hotmic_toggle.sh
      Shortcut: (your choice, e.g. Insert)

Edit the variables at the top of hotmic_start.sh, or override them via environment variables:

| Variable | Default | Description |
|---|---|---|
| `HOTMIC_BACKEND` | `whisper` | `whisper` (local) or `llm` (OpenRouter) |
| `WHISPER_MODEL` | `medium.en` | Whisper model: `tiny`, `base`, `small`, `medium.en`, `large-v3-turbo`, etc. |
| `WHISPER_DEVICE` | `cuda` | `cuda` for GPU, `cpu` for CPU-only |
| `OPENROUTER_MODEL` | `google/gemini-2.0-flash-001` | Any audio-capable model on OpenRouter (LLM backend only) |
| `SILENCE_START_THRESH` | `3%` | Threshold to detect speech start (must be above ambient noise) |
| `SILENCE_STOP_THRESH` | `3%` | Threshold to detect pauses (must be above ambient noise) |
| `SILENCE_DUR` | `0.8` | Seconds of silence before a chunk ends |
| `MAX_CHUNK_SEC` | `20` | Hard cap per chunk (silence split handles most cases) |

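
As an illustration of how the variables and defaults above combine, here is a small sketch that resolves each setting from the environment and falls back to the documented default. The project reads these in `hotmic_start.sh`; the `Settings` class and `load_settings` function are hypothetical names used only for this example.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    backend: str
    whisper_model: str
    whisper_device: str
    openrouter_model: str
    silence_start_thresh: str
    silence_stop_thresh: str
    silence_dur: float
    max_chunk_sec: int


def load_settings(environ=None) -> Settings:
    """Resolve each variable from `environ`, using the documented defaults."""
    env = os.environ if environ is None else environ
    backend = env.get("HOTMIC_BACKEND", "whisper")
    if backend not in ("whisper", "llm"):
        raise ValueError(f"HOTMIC_BACKEND must be 'whisper' or 'llm', got {backend!r}")
    return Settings(
        backend=backend,
        whisper_model=env.get("WHISPER_MODEL", "medium.en"),
        whisper_device=env.get("WHISPER_DEVICE", "cuda"),
        openrouter_model=env.get("OPENROUTER_MODEL", "google/gemini-2.0-flash-001"),
        silence_start_thresh=env.get("SILENCE_START_THRESH", "3%"),
        silence_stop_thresh=env.get("SILENCE_STOP_THRESH", "3%"),
        silence_dur=float(env.get("SILENCE_DUR", "0.8")),
        max_chunk_sec=int(env.get("MAX_CHUNK_SEC", "20")),
    )
```

For example, `load_settings({"WHISPER_DEVICE": "cpu"})` keeps every default except the device, which matches launching the script with only `WHISPER_DEVICE=cpu` set.
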

| File | Purpose |
|---|---|
| `hotmic_toggle.sh` | Start/stop dictation (bind to a hotkey) |
| `hotmic_start.sh` | Main recording + transcription loop |
| `hotmic_stop.sh` | Stops recording and cleans up |
| `hotmic_indicator.py` | Pulsing "REC" overlay badge |
| `hotmic_whisper_worker.py` | Persistent Whisper process (loads model once) |

Logs are written to /tmp/hotmic/hotmic.log for debugging.
MIT
