Press a hotkey. Speak. Your words appear at the cursor. No cloud. No subscription. No telemetry. Powered by OpenAI Whisper.
Quick Start · Features · vs. Wispr Flow / Dragon · Voice Commands · Contributing
Want a real screen-recording demo here? See docs/demo-recording.md, then drop a docs/demo.gif in and uncomment the line below.
Whisper Local is a free, open-source, fully offline alternative to Wispr Flow, Dragon, and Otter for power users who want AI dictation without sending audio to the cloud. Built on faster-whisper (CTranslate2), it delivers push-to-talk speech-to-text in any application: chat apps, code editors, browsers, terminals, design tools, anywhere a cursor blinks. Self-hosted, hackable, MIT-licensed.
Looking for: Wispr Flow alternative, offline voice typing, local Whisper dictation, free Dragon NaturallySpeaking alternative, privacy-first speech-to-text, Windows voice dictation without cloud, macOS push-to-talk transcription. You found it.
Most AI dictation tools are great, until you check the privacy policy. Your audio goes to a server, gets processed, and (sometimes) stored. You pay a monthly fee or get cut off.
Whisper Local exists because you shouldn't have to choose between accuracy and privacy.
- Your voice never leaves your machine, not even metadata
- Free forever: no account, no API key, no subscription
- Works offline, air-gapped, after the internet is gone
- Fork it, hack it, ship your own version (MIT licensed)
- Same Whisper model quality as cloud services, running on your own GPU
This is a community tool, not a product. There's no support SLA, no roadmap committee, no marketing. If it's useful to you, great. If something's broken, PRs are welcome.
A note from the maintainer: I built this for myself, then realised it might help others. So I'm releasing it for anyone who wants it, no strings attached. Use it. Fork it. Rebrand it. Ship your own version. The only thing I ask is that you keep the LICENSE attribution intact (to Pin Wang, the original upstream author, and to me as the fork maintainer). If you build something cool on top of it, I'd love to hear about it via a Discussion, but you don't owe anyone anything.
-- Rohit Burani
| Feature | Whisper Local | Wispr Flow | Dragon / Dragon Anywhere | Otter.ai | Windows Speech Recognition |
|---|---|---|---|---|---|
| Runs 100% offline | β | β | β (Anywhere) | β | β |
| Audio never leaves your machine | β | β | β | β | partial |
| Free / open source | β | β | β ($$$/yr) | β ($$/mo) | β |
| Modern AI accuracy (Whisper) | β | β | partial | β | β |
| Works in any app via hotkey | β | β | partial | β | partial |
| Customisable voice commands | β | partial | β | β | β |
| Push-to-talk + auto-paste + auto-send | β | β | partial | β | β |
| GPU acceleration (NVIDIA & AMD) | β | n/a | n/a | n/a | β |
| AI rephrase / transforms (Ollama) | β | β | β | β | β |
| Hackable / MIT licensed | β | β | β | β | β |
| No account required | β | β | β | β | β |
- Global push-to-talk hotkey: start recording from any app with Ctrl+Win (Windows) or Fn+Ctrl (macOS)
- Pre-roll buffer + warmup: captures the 500 ms before you press the key and pre-loads Whisper at boot, so the first word is never clipped and the first recording feels instant
- Floating level overlay: a small pill at the screen edge shows you're being heard, with the transcript appearing next to the level bar (Wispr Flow-style). Optional real-time streaming preview shows words as you speak.
- Inline voice formatting: say "comma", "period", "question mark", "new paragraph", "open quote", etc. mid-sentence
- AI rephrase: dedicated Ctrl+Shift+Win hotkey. Select text, hold, speak your instruction, release, and local Ollama rewrites it in place
- Translation mode: speak any language, get English; tray → Profile → Translate
- Continuous dictation mode: for long-form notes, the app auto-restarts recording after each delivery
- Fallback window: if no text field is focused, the transcript appears in a small window (pre-selected, copy button, already on clipboard)
- Pause-all hotkey: Ctrl+Alt+Win disables every Whisper Local hotkey until you press it again
- Auto-paste at cursor: transcript lands wherever you're typing, optionally followed by Enter (auto-send)
- 100% local & private: no network calls during use; Whisper models cached on disk
- GPU acceleration: NVIDIA CUDA and AMD ROCm supported, CPU works out of the box
- Voice commands: say a trigger phrase to send a hotkey, type pre-written text, or run a shell command
- Hot-reload: edit commands.yaml and your change applies on the next transcription, no restart
- Built-in diagnostics: whisper-local --doctor checks audio devices, model cache, hotkeys, and recent errors
- Profiles: switch between Dictation / Chat / Code / Notes presets from the tray
- Per-app rules: different behaviour per foreground app (auto-send in Slack, copy-only in VS Code, suppress in 1Password)
- Optional LLM cleanup: pipe transcripts through a local Ollama model for punctuation / capitalisation polish (off by default, fully local)
- Recent transcriptions: last 10 results in the tray menu, click to copy back
- Settings backup/restore: --export-settings / --import-settings for portability
- Settings UI: whisper-local --settings opens a GUI settings window (no YAML editing required)
- Transcript history: whisper-local --history opens a searchable log of everything you've dictated
- Opt-in update notifications: daily GitHub release check, fully offline by default (update_check.enabled: true to opt in)
- Noise suppression: spectral gating via noisereduce, off by default (pip install 'whisper-local[noise]')
- --selftest: one-command sanity check (mic, model, transcription, clipboard), perfect for first launch
- Hotkey cheat sheet: whisper-local --cheat-sheet or the tray menu shows your current configured hotkeys at a glance
- --bundle-logs: zip up redacted logs + diagnostics for bug reports with one command
- Local OpenAI-compatible API: whisper-local --serve exposes POST /v1/audio/transcriptions on localhost:7777 for Cursor, Open WebUI, anything that speaks the OpenAI Whisper API
- Auto-recovery: silently reconnects when a USB mic is unplugged mid-recording
- Crash reports: uncaught errors write a self-contained dump to disk
- System tray UI: model selection, mic selection, profile switch, diagnostics
- Cross-platform: Windows 10+, macOS
    git clone https://github.com/drajb/whisper-local.git
    cd whisper-local
    pip install -e .

| Launch method | How |
|---|---|
| Terminal | whisper-local (or wl for short) |
| Double-click | whisper-local.cmd (Windows) |
| Start on login | Tick Start on login in the tray menu (or the first-run welcome), or run whisper-local --enable-autostart. Disable anytime the same way. |
First launch downloads the default base Whisper model (~141 MB) into your HuggingFace cache. After that, everything runs offline. (Prefer a smaller/faster download? Set whisper.model: tiny, which is ~75 MB.)
| Action | Windows | macOS |
|---|---|---|
| Hold to record | Ctrl+Win | Fn+Ctrl |
| Stop & paste | release key (push-to-talk) or Ctrl | release or Fn |
| Stop & auto-send (Enter) | Alt | Option |
| Cancel | Esc | Shift |
| Voice command mode | Alt+Win | Fn+Command |
    whisper-local --doctor

Runs through Python version, dependencies, config validation, audio devices, model cache, hotkey backend, and recent log errors. Exit 0 = clean.
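Because a clean run exits with status 0, the diagnostics are easy to script. The following is a minimal sketch, assuming whisper-local is on your PATH; run_check is a hypothetical helper name, not part of Whisper Local.

```python
import subprocess


def run_check(command: list[str], timeout: float = 60.0) -> bool:
    """Run a diagnostic command and report whether it exited with status 0."""
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        # Command not found, not executable, or hung: treat as a failed check.
        return False
    return result.returncode == 0
```

Calling run_check(["whisper-local", "--doctor"]) would then return True only when the doctor reports a clean system.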
Speak a trigger to run keyboard shortcuts, type snippets, or launch programs. Defined in:
- Windows: %APPDATA%\whisperkey\commands.yaml
- macOS: ~/.whisperkey/commands.yaml

    commands:
      # Send a keyboard shortcut
      - trigger: "undo"
        hotkey: "ctrl+z"
      # Deliver pre-written text
      - trigger: "my email"
        type: "user@example.com"
      # Run a shell command
      - trigger: "open notepad"
        run: 'notepad.exe'

Edits hot-reload, so no app restart is required. See docs/voice-commands.md for the full guide.
Warning: voice commands with run: execute through your system shell with your user privileges. Only add commands you trust.
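To illustrate how a spoken phrase could map onto one of the three action types, here is a rough sketch. It assumes exact matching after lowercasing and stripping punctuation; the real matcher may well be fuzzier, and normalize and match_command are hypothetical names. The COMMANDS list mirrors the commands.yaml example above, already parsed into Python data.

```python
from __future__ import annotations

import string

# Mirrors the commands.yaml example, already parsed into Python data.
COMMANDS = [
    {"trigger": "undo", "hotkey": "ctrl+z"},
    {"trigger": "my email", "type": "user@example.com"},
    {"trigger": "open notepad", "run": "notepad.exe"},
]

ACTION_KEYS = ("hotkey", "type", "run")


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(cleaned.split())


def match_command(transcript: str, commands: list[dict]) -> tuple[str, str] | None:
    """Return (action, value) for the first trigger equal to the transcript."""
    spoken = normalize(transcript)
    for command in commands:
        if normalize(command["trigger"]) == spoken:
            for key in ACTION_KEYS:
                if key in command:
                    return key, command[key]
    return None  # no trigger matched; the caller would treat it as dictation
```

Normalizing both sides means a transcript such as "Undo." still resolves to the undo trigger, since Whisper often adds capitalisation and a trailing period.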
On first launch, Whisper Local detects your GPU and offers one-press install of the required runtime libraries. Supports NVIDIA CUDA and AMD ROCm.
For manual setup or AMD RDNA 1, see docs/gpu-setup.md.
Whisper Local doubles as a drop-in local replacement for the OpenAI Whisper API, fully offline. Point any tool that speaks POST /v1/audio/transcriptions at it (Cursor, VS Code Continue, Open WebUI, n8n, custom scripts, anything else).
    whisper-local --serve                     # listens on http://127.0.0.1:7777
    whisper-local --serve --serve-port 8080

    # Drop-in compatible with the OpenAI SDK:
    curl -X POST http://127.0.0.1:7777/v1/audio/transcriptions \
      -F file=@audio.wav -F model=whisper-1 -F response_format=text

Same Whisper model you use for dictation. Same GPU. No API key. No rate limit. No outgoing traffic.
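For scripts that would rather not shell out to curl, the same request can be sketched with the Python standard library alone. This assumes the server started by whisper-local --serve is listening on the default 127.0.0.1:7777 and accepts the same form fields as the curl example; build_multipart and transcribe are illustrative helper names, not part of Whisper Local.

```python
from __future__ import annotations

import urllib.request
import uuid
from pathlib import Path


def build_multipart(fields: dict[str, str], filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode form fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts: list[bytes] = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n".encode()
    )
    parts.append(data)
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path: str, base_url: str = "http://127.0.0.1:7777") -> str:
    """POST an audio file to the local endpoint and return the plain-text result."""
    audio = Path(path)
    body, content_type = build_multipart(
        {"model": "whisper-1", "response_format": "text"},
        audio.name,
        audio.read_bytes(),
    )
    request = urllib.request.Request(
        f"{base_url}/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return response.read().decode("utf-8").strip()
```

With the server running, transcribe("audio.wav") should return the transcript as a string. No API key header is sent, matching the curl example.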
Switch between presets from the tray icon → Profile:
| Profile | Behaviour |
|---|---|
| Dictation | General-purpose voice typing, auto-paste on |
| Chat | Push-to-talk, auto-paste + auto-send via Alt |
| Code | Copy-only mode for editors, never auto-sends |
| Notes | Quiet copy-to-clipboard, voice commands disabled |
Edit or add new profiles in %APPDATA%\whisperkey\profiles.yaml.
Different apps want different behaviour. Whisper Local detects the
foreground window before delivering each transcription and matches it
against rules in %APPDATA%\whisperkey\app_rules.yaml:
    rules:
      # Chat apps: send the message immediately
      - match: ["slack.exe", "discord.exe"]
        auto_send: true
      # Code editors: never auto-send, copy only
      - match: ["code.exe", "cursor.exe"]
        auto_paste: false
      # Password managers: skip delivery entirely
      - match: ["1password.exe", "bitwarden.exe"]
        suppress: true

Hot-reloads: edit the file and the next transcription picks it up.
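As a rough sketch of how such rules could be resolved against the foreground process name: the code below assumes case-insensitive matching, first-match-wins ordering, and a baseline of auto_paste on with auto_send and suppress off. None of these details are confirmed by the project, and resolve_behaviour is a hypothetical name.

```python
from __future__ import annotations

# Mirrors the app_rules.yaml example, already parsed into Python data.
RULES = [
    {"match": ["slack.exe", "discord.exe"], "auto_send": True},
    {"match": ["code.exe", "cursor.exe"], "auto_paste": False},
    {"match": ["1password.exe", "bitwarden.exe"], "suppress": True},
]

# Assumed baseline behaviour when no rule matches.
DEFAULTS = {"auto_paste": True, "auto_send": False, "suppress": False}


def resolve_behaviour(process_name: str, rules: list[dict], defaults: dict = DEFAULTS) -> dict:
    """Overlay the first matching rule's settings on top of the defaults."""
    name = process_name.lower()
    behaviour = dict(defaults)
    for rule in rules:
        if name in (entry.lower() for entry in rule["match"]):
            behaviour.update({k: v for k, v in rule.items() if k != "match"})
            break  # assumption: the first matching rule wins
    return behaviour
```

For example, resolve_behaviour("Slack.exe", RULES) turns auto_send on while leaving the other defaults alone, and an unlisted process simply gets DEFAULTS.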
If you have Ollama running locally, Whisper Local can
pipe each transcript through a small local model for punctuation and
capitalisation polish. Off by default and fully local; set
postprocess.ollama.enabled: true in user_settings.yaml to enable.
    postprocess:
      capitalize_first: true       # works without Ollama
      ensure_punctuation: true     # works without Ollama
      strip_filler_words: true     # works without Ollama
      ollama:
        enabled: false             # set true to opt in
        endpoint: http://localhost:11434
        model: llama3.2
        timeout: 5

Local settings live at:
- Windows: %APPDATA%\whisperkey\user_settings.yaml
- macOS: ~/.whisperkey/user_settings.yaml
Delete the file and restart to reset to defaults. Highlights:
| Option | Default | Notes |
|---|---|---|
| whisper.model | base | Any model from whisper.models. tiny = smallest/fastest, larger = more accurate/slower |
| whisper.device | cpu | cpu or cuda (NVIDIA/AMD) |
| whisper.compute_type | int8 | int8/float16/float32 |
| whisper.language | auto | Auto-detect or specific language code |
| whisper.hotwords | [] | Words the model should favour, such as names and jargon |
| hotkey.recording_hotkey | ctrl+win | Configurable |
| hotkey.recording_mode | push_to_talk | push_to_talk (hold to talk) or toggle |
| vad.vad_realtime_enabled | true | Auto-stop on silence |
| clipboard.auto_paste | true | false = copy only |
| clipboard.delivery_method | paste | paste (Ctrl+V) or type (direct injection) |
| voice_commands.enabled | true | Enable command mode |
| audio.host | null | WASAPI recommended on Windows for low latency |
Full reference: config.defaults.yaml.
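If you want to locate the settings file from a backup or provisioning script, the two documented locations can be resolved as sketched below. This only encodes the Windows and macOS paths listed in this README; settings_path is a hypothetical helper, and the platform and env parameters exist purely to make it easy to test.

```python
from __future__ import annotations

import os
import sys
from collections.abc import Mapping
from pathlib import Path


def settings_path(platform: str = sys.platform, env: Mapping[str, str] | None = None) -> Path:
    """Return the documented user_settings.yaml location for a platform."""
    env = os.environ if env is None else env
    if platform.startswith("win"):
        # Windows: %APPDATA%\whisperkey\user_settings.yaml
        return Path(env["APPDATA"]) / "whisperkey" / "user_settings.yaml"
    # macOS: ~/.whisperkey/user_settings.yaml
    return Path.home() / ".whisperkey" / "user_settings.yaml"
```

Deleting the file this returns (and restarting) is the documented way to reset to defaults, so a script may want to copy it somewhere safe first.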
    whisper-local                            # Run the app (or use `wl`)
    whisper-local --setup                    # Interactive setup wizard (model, mode, mic)
    whisper-local --doctor                   # Run diagnostics
    whisper-local --stats                    # Transcription history & time saved
    whisper-local --version                  # Print version
    whisper-local --quit                     # Stop the running instance
    whisper-local --export-settings DIR      # Back up user_settings + commands
    whisper-local --import-settings DIR      # Restore from a backup
    whisper-local --export-transcripts FILE  # Dump history (.txt/.md/.csv)
    whisper-local --import-vocab FOLDER      # Mine a folder for hotwords
    whisper-local --settings                 # Open the settings GUI (no YAML editing required)
    whisper-local --history                  # Browse and search transcript history
    whisper-local --cheat-sheet              # Show your currently configured hotkeys
    whisper-local --selftest                 # Run an automated self-test (mic, model, transcription)
    whisper-local --bundle-logs              # Create a redacted diagnostic zip for bug reports
    whisper-local --serve                    # Run a local OpenAI-compatible Whisper API on :7777
    whisper-local --enable-autostart         # Launch automatically at login (--disable-autostart to undo)
    whisper-local --test                     # Run a separate test instance (own mutex)

Launching while an instance is already running takes over: the old one is replaced cleanly, no manual quit needed.
    ┌────────────────────┐    ┌──────────────────┐    ┌────────────────────┐
    │ global-hotkeys /   │    │ sounddevice +    │    │ faster-whisper /   │
    │ NSEvent (macOS)    │───▶│ 500ms ring buf   │───▶│ ctranslate2 (GPU)  │
    └────────────────────┘    │ + TEN VAD        │    └─────────┬──────────┘
                              └──────────────────┘              │
                                                                ▼
                              ┌──────────────────┐    ┌────────────────────┐
                              │ Voice command    │◀───│ Transcribed text   │
                              │ matcher          │    │                    │
                              └──────────────────┘    └─────────┬──────────┘
                                                                ▼
                                                      ┌────────────────────┐
                                                      │ ctypes SendInput / │
                                                      │ Quartz CGEvent     │
                                                      │ → cursor           │
                                                      └────────────────────┘
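The 500 ms ring buffer in the diagram is what keeps the first word from being clipped: audio is captured continuously, and the most recent half second is prepended when the hotkey goes down. A minimal sketch of that idea follows, assuming 16 kHz audio (the rate Whisper models expect) delivered in fixed-size blocks. The class name, block size, and method names are all hypothetical, and the project's actual buffer may be organised differently.

```python
from collections import deque

SAMPLE_RATE = 16_000  # Hz; Whisper models expect 16 kHz audio
PRE_ROLL_MS = 500     # how much audio to keep from before the key press
BLOCK_SIZE = 160      # samples per block (10 ms); hypothetical value


class PreRollBuffer:
    """Keep the last PRE_ROLL_MS of audio so a recording can start in the past."""

    def __init__(self, sample_rate=SAMPLE_RATE, pre_roll_ms=PRE_ROLL_MS, block_size=BLOCK_SIZE):
        blocks = (sample_rate * pre_roll_ms) // (1000 * block_size)
        self._ring = deque(maxlen=blocks)  # oldest block is dropped automatically
        self._recording = None

    def feed(self, block):
        """Call for every audio block, whether or not the hotkey is held."""
        if self._recording is not None:
            self._recording.append(block)
        else:
            self._ring.append(block)

    def start(self):
        """Hotkey pressed: seed the recording with the buffered pre-roll."""
        self._recording = list(self._ring)
        self._ring.clear()

    def stop(self):
        """Hotkey released: return all captured blocks, pre-roll included."""
        blocks, self._recording = self._recording or [], None
        return blocks
```

In a real capture loop, feed would be called from the audio callback (for example a sounddevice input stream) with each incoming block, and the list returned by stop would be concatenated before transcription.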
Whisper Local makes the following network calls and no others:
- First launch only: downloads the Whisper model from huggingface.co into your local cache.
- GPU onboarding (opt-in): if you accept the GPU setup prompt, pip install pulls CUDA / ROCm runtime packages from PyPI / repo.radeon.com.

After setup, zero network traffic. Confirm by running whisper-local --doctor and inspecting the source: every network entry point lives in onboarding.py and is gated behind explicit user prompts.
faster-whisper · ctranslate2 · sounddevice · ten-vad · pyperclip · pystray · ruamel.yaml · playsound3
Windows-only: global-hotkeys · pywin32 · ctypes SendInput
macOS-only: pyobjc-framework-Quartz · pyobjc-framework-ApplicationServices
- docs/troubleshooting.md: symptom → cause → fix table for the most common issues
- docs/faq.md: privacy, comparisons (Whisper.cpp / WSR / Wispr Flow / Dragon), model picks, GPU notes
- docs/distribution.md: how releases work (standalone .exe, PyPI, winget, Homebrew) and how to ship one
- docs/voice-commands.md: the full voice command DSL
- docs/gpu-setup.md: manual GPU setup for NVIDIA / AMD
- CHANGELOG.md: release notes

Hit a wall? Run whisper-local --doctor or whisper-local --selftest first; they catch 90% of issues.
Contributions of all kinds are welcome: bug fixes, new features, docs improvements, or just opening an issue with a clear reproduction. This project is maintained on a best-effort basis with no SLA; please be patient with response times.
    git clone https://github.com/drajb/whisper-local.git
    cd whisper-local
    pip install -e .
    python -m unittest tests.test_smoke   # smoke suite, should report OK

See CONTRIBUTING.md for the full guide and CODE_OF_CONDUCT.md for community standards. By contributing you agree your code will be MIT licensed. Found a security issue? See SECURITY.md, and please don't open a public issue.
Good first issues are tagged here. The full credit list is in AUTHORS.md.
Whisper Local is free and always will be. If it saves you time or a monthly subscription, consider starring the repo and sharing it with people who'd find it useful; it helps the project grow. No pressure.
Forked from whisper-key-local by Pin Wang. Huge thanks to the original work that made this fork possible. The full list of credits, including every open-source library Whisper Local builds on, is in AUTHORS.md.
MIT licensed; original copyright preserved in LICENSE.
If you find this useful, please star the repo; it helps others discover it.
Maintained by Rohit Burani (@drajb)
Website · GitHub · Discussions · Report a bug · Request a feature
Tags: whisper · dictation · speech-to-text · voice-typing · transcription · ai-dictation · local-ai · offline · push-to-talk · voice-recognition · accessibility · faster-whisper · privacy · self-hosted · wispr-flow-alternative · dragon-naturallyspeaking-alternative · otter-alternative · ollama · voice-commands · windows · macos · python