Whisper Local

Free, Open-Source, 100% Offline AI Dictation for Windows & macOS

Press a hotkey. Speak. Your words appear at the cursor. No cloud. No subscription. No telemetry. Powered by OpenAI Whisper.

Quick Start · Features · vs. Wispr Flow / Dragon · Voice Commands · Contributing

_{Want a real screen-recording demo here? See docs/demo-recording.md — drop a docs/demo.gif in and uncomment the line below.}

Whisper Local is a free, open-source, fully offline alternative to Wispr Flow, Dragon, and Otter for power users who want AI dictation without sending audio to the cloud. Built on faster-whisper (CTranslate2), it delivers push-to-talk speech-to-text in any application — chat apps, code editors, browsers, terminals, design tools, anywhere a cursor blinks. Self-hosted, hackable, MIT-licensed.

Looking for: Wispr Flow alternative, offline voice typing, local Whisper dictation, free Dragon NaturallySpeaking alternative, privacy-first speech-to-text, Windows voice dictation without cloud, macOS push-to-talk transcription. You found it.

🌟 Why this exists

Most AI dictation tools are great — until you check the privacy policy. Your audio goes to a server, gets processed, and (sometimes) stored. You pay a monthly fee or get cut off.

Whisper Local exists because you shouldn't have to choose between accuracy and privacy.

🔒 Your voice never leaves your machine — not even metadata
🆓 Free forever — no account, no API key, no subscription
🔌 Works offline, air-gapped, after the internet is gone
🛠️ Fork it, hack it, ship your own version — MIT licensed
💡 Same Whisper model quality as cloud services, running on your own GPU

This is a community tool, not a product. There's no support SLA, no roadmap committee, no marketing. If it's useful to you, great. If something's broken, PRs are welcome.

A note from the maintainer: I built this for myself, then realised it might help others. So I'm releasing it for anyone who wants it — no strings attached. Use it. Fork it. Rebrand it. Ship your own version. The only thing I ask is that you keep the LICENSE attribution intact (to Pin Wang, the original upstream author, and to me as the fork maintainer). If you build something cool on top of it, I'd love to hear about it via a Discussion — but you don't owe anyone anything.

— Rohit Burani

✨ Why Whisper Local?

Feature	Whisper Local	Wispr Flow	Dragon / Dragon Anywhere	Otter.ai	Windows Speech Recognition
Runs 100% offline	✅	❌	❌ (Anywhere)	❌	✅
Audio never leaves your machine	✅	❌	❌	❌	partial
Free / open source	✅	❌	❌ ($$$/yr)	❌ ($$/mo)	✅
Modern AI accuracy (Whisper)	✅	✅	partial	✅	❌
Works in any app via hotkey	✅	✅	partial	❌	partial
Customisable voice commands	✅	partial	✅	❌	❌
Push-to-talk + auto-paste + auto-send	✅	✅	partial	❌	❌
GPU acceleration (NVIDIA & AMD)	✅	n/a	n/a	n/a	❌
AI rephrase / transforms (Ollama)	✅	✅	❌	❌	❌
Hackable / MIT licensed	✅	❌	❌	❌	❌
No account required	✅	❌	❌	❌	✅

🎯 Features

🎙️ Global push-to-talk hotkey — start recording from any app with Ctrl+Win (Windows) or Fn+Ctrl (macOS)
⚡ Pre-roll buffer + warmup — captures the 500 ms before you press the key and pre-loads Whisper at boot, so the first word is never clipped and the first recording feels instant
🔵 Floating level overlay — a small pill at the screen edge shows you're being heard, with the transcript appearing next to the level bar (Wispr Flow–style). Optional real-time streaming preview shows words as you speak.
📝 Inline voice formatting — say "comma", "period", "question mark", "new paragraph", "open quote", etc. mid-sentence
🤖 AI rephrase — dedicated Ctrl+Shift+Win hotkey: select text, hold, speak your instruction, release — local Ollama rewrites it in place
🌐 Translation mode — speak any language, get English; tray → Profile → Translate
🔁 Continuous dictation mode — for long-form notes, the app auto-restarts recording after each delivery
📋 Fallback window — if no text field is focused, the transcript appears in a small window (pre-selected, copy button, already on clipboard)
⏸ Pause-all hotkey — Ctrl+Alt+Win disables every Whisper Local hotkey until you press it again
📋 Auto-paste at cursor — transcript lands wherever you're typing, optionally followed by Enter (auto-send)
🔒 100 % local & private — no network calls during use; Whisper models cached on disk
🚀 GPU acceleration — NVIDIA CUDA and AMD ROCm supported, CPU works out of the box
🗣️ Voice commands — say a trigger phrase to send a hotkey, type pre-written text, or run a shell command
🔁 Hot-reload — edit commands.yaml and your change applies on the next transcription, no restart
🩺 Built-in diagnostics — whisper-local --doctor checks audio devices, model cache, hotkeys, and recent errors
🎛️ Profiles — switch between Dictation / Chat / Code / Notes presets from the tray
🪟 Per-app rules — different behaviour per foreground app (auto-send in Slack, copy-only in VS Code, suppress in 1Password)
🧹 Optional LLM cleanup — pipe transcripts through a local Ollama model for punctuation / capitalisation polish (off by default, fully local)
📜 Recent transcriptions — last 10 results in the tray menu, click to copy back
🔧 Settings backup/restore — --export-settings / --import-settings for portability
🖥️ Settings UI — whisper-local --settings opens a GUI settings window (no YAML editing required)
📜 Transcript history — whisper-local --history opens a searchable log of everything you've dictated
🔔 Opt-in update notifications — daily GitHub release check, fully offline by default (update_check.enabled: true to opt in)
🎚️ Noise suppression — spectral gating via noisereduce, off by default (pip install 'whisper-local[noise]')
🩺 --selftest — one-command sanity check (mic, model, transcription, clipboard) — perfect for first-launch
🎯 Hotkey cheat sheet — whisper-local --cheat-sheet or tray menu — shows your current configured hotkeys at a glance
📦 --bundle-logs — zip up redacted logs + diagnostics for bug reports with one command
🌐 Local OpenAI-compatible API — whisper-local --serve exposes POST /v1/audio/transcriptions on localhost:7777 for Cursor, Open WebUI, anything that speaks OpenAI Whisper API
🛡️ Auto-recovery — silently reconnects when a USB mic is unplugged mid-recording
🛡️ Crash reports — uncaught errors write a self-contained dump to disk
🪟 System tray UI — model selection, mic selection, profile switch, diagnostics
🍎 Cross-platform — Windows 10+, macOS

🚀 Quick Start

Install (Python 3.11–3.13)

git clone https://github.com/drajb/whisper-local.git
cd whisper-local
pip install -e .

Launch


Terminal	`whisper-local` (or `wl` for short)
Double-click	`whisper-local.cmd` (Windows)
Start on login	Tick Start on login in the tray menu (or the first-run welcome), or run `whisper-local --enable-autostart`. Disable anytime the same way.

First launch downloads the default base Whisper model (~141 MB) into your HuggingFace cache. After that, everything runs offline. (Prefer a smaller/faster download? Set whisper.model: tiny — ~75 MB.)

Use it

Action	Windows	macOS
Hold to record	`Ctrl+Win`	`Fn+Ctrl`
Stop & paste	release key (push-to-talk) or `Ctrl`	release or `Fn`
Stop & auto-send (Enter)	`Alt`	`Option`
Cancel	`Esc`	`Shift`
Voice command mode	`Alt+Win`	`Fn+Command`

Verify everything works

whisper-local --doctor

Runs through Python version, dependencies, config validation, audio devices, model cache, hotkey backend, and recent log errors. Exit 0 = clean.

🗣️ Voice Commands

Speak a trigger to run keyboard shortcuts, type snippets, or launch programs. Defined in:

Windows: %APPDATA%\whisperkey\commands.yaml
macOS: ~/.whisperkey/commands.yaml

commands:
  # Send a keyboard shortcut
  - trigger: "undo"
    hotkey: "ctrl+z"

  # Deliver pre-written text
  - trigger: "my email"
    type: "user@example.com"

  # Run a shell command
  - trigger: "open notepad"
    run: 'notepad.exe'

Edits hot-reload — no app restart required. See docs/voice-commands.md for the full guide.

⚠️ Voice commands with run: execute through your system shell with your user privileges. Only add commands you trust.

⚡ GPU Acceleration

On first launch, Whisper Local detects your GPU and offers one-press install of the required runtime libraries. Supports NVIDIA CUDA and AMD ROCm.

For manual setup or AMD RDNA 1, see docs/gpu-setup.md.

🌐 Local OpenAI-Compatible API

Whisper Local doubles as a drop-in local replacement for the OpenAI Whisper API — fully offline. Point any tool that speaks POST /v1/audio/transcriptions at it (Cursor, VS Code Continue, Open WebUI, n8n, custom scripts, anything else).

whisper-local --serve            # listens on http://127.0.0.1:7777
whisper-local --serve --serve-port 8080

# Drop-in compatible with the OpenAI SDK:
curl -X POST http://127.0.0.1:7777/v1/audio/transcriptions \
  -F file=@audio.wav -F model=whisper-1 -F response_format=text

Same Whisper model you use for dictation. Same GPU. No API key. No rate limit. No outgoing traffic.

🎛️ Profiles

Switch between presets from the tray icon → Profile:

Profile	Behaviour
Dictation	General-purpose voice typing, auto-paste on
Chat	Push-to-talk, auto-paste + auto-send via `Alt`
Code	Copy-only mode for editors, never auto-sends
Notes	Quiet copy-to-clipboard, voice commands disabled

Edit or add new profiles in %APPDATA%\whisperkey\profiles.yaml.

🪟 Per-app rules

Different apps want different behaviour. Whisper Local detects the foreground window before delivering each transcription and matches it against rules in %APPDATA%\whisperkey\app_rules.yaml:

rules:
  # Chat apps: send the message immediately
  - match: ["slack.exe", "discord.exe"]
    auto_send: true

  # Code editors: never auto-send, copy only
  - match: ["code.exe", "cursor.exe"]
    auto_paste: false

  # Password managers: skip delivery entirely
  - match: ["1password.exe", "bitwarden.exe"]
    suppress: true

Hot-reloads — edit and the next transcription picks it up.

🧹 Optional LLM cleanup

If you have Ollama running locally, Whisper Local can pipe each transcript through a small local model for punctuation and capitalisation polish. Off by default and fully local — set postprocess.ollama.enabled: true in user_settings.yaml to enable.

postprocess:
  capitalize_first: true        # works without Ollama
  ensure_punctuation: true      # works without Ollama
  strip_filler_words: true      # works without Ollama
  ollama:
    enabled: false              # set true to opt in
    endpoint: http://localhost:11434
    model: llama3.2
    timeout: 5

⚙️ Configuration

Local settings live at:

Windows: %APPDATA%\whisperkey\user_settings.yaml
macOS: ~/.whisperkey/user_settings.yaml

Delete the file and restart to reset to defaults. Highlights:

Option	Default	Notes
`whisper.model`	`base`	Any model from `whisper.models`. `tiny` = smallest/fastest, larger = more accurate/slower
`whisper.device`	`cpu`	`cpu` or `cuda` (NVIDIA/AMD)
`whisper.compute_type`	`int8`	`int8`/`float16`/`float32`
`whisper.language`	`auto`	Auto-detect or specific language code
`whisper.hotwords`	`[]`	Words the model should favour — names, jargon
`hotkey.recording_hotkey`	`ctrl+win`	Configurable
`hotkey.recording_mode`	`push_to_talk`	`push_to_talk` (hold to talk) or `toggle`
`vad.vad_realtime_enabled`	`true`	Auto-stop on silence
`clipboard.auto_paste`	`true`	`false` = copy only
`clipboard.delivery_method`	`paste`	`paste` (Ctrl+V) or `type` (direct injection)
`voice_commands.enabled`	`true`	Enable command mode
`audio.host`	`null`	`WASAPI` recommended on Windows for low latency

Full reference: config.defaults.yaml.

🛠️ CLI Reference

whisper-local                      # Run the app (or use `wl`)
whisper-local --setup              # Interactive setup wizard (model, mode, mic)
whisper-local --doctor             # Run diagnostics
whisper-local --stats              # Transcription history & time saved
whisper-local --version            # Print version
whisper-local --quit               # Stop the running instance
whisper-local --export-settings DIR        # Back up user_settings + commands
whisper-local --import-settings DIR        # Restore from a backup
whisper-local --export-transcripts FILE    # Dump history (.txt/.md/.csv)
whisper-local --import-vocab FOLDER        # Mine a folder for hotwords
whisper-local --settings           # Open the settings GUI (no YAML editing required)
whisper-local --history            # Browse and search transcript history
whisper-local --cheat-sheet        # Show your currently configured hotkeys
whisper-local --selftest           # Run an automated self-test (mic, model, transcription)
whisper-local --bundle-logs        # Create a redacted diagnostic zip for bug reports
whisper-local --serve              # Run a local OpenAI-compatible Whisper API on :7777
whisper-local --enable-autostart   # Launch automatically at login (--disable-autostart to undo)
whisper-local --test               # Run a separate test instance (own mutex)

Launching while an instance is already running takes over — the old one is replaced cleanly, no manual quit needed.

🏗️ How it works

┌─────────────────────┐  ┌──────────────────┐  ┌─────────────────────┐
│  global-hotkeys /   │  │   sounddevice +  │  │  faster-whisper /   │
│  NSEvent (macOS)    │─▶│  500ms ring buf  │─▶│  ctranslate2 (GPU)  │
└─────────────────────┘  │  + TEN VAD       │  └──────────┬──────────┘
                         └──────────────────┘             │
                                                          ▼
                         ┌──────────────────┐  ┌─────────────────────┐
                         │  Voice command   │◀─│  Transcribed text   │
                         │  matcher         │  │                     │
                         └──────────────────┘  └──────────┬──────────┘
                                                          ▼
                                                ┌─────────────────────┐
                                                │  ctypes SendInput / │
                                                │  Quartz CGEvent     │
                                                │  → cursor           │
                                                └─────────────────────┘

🔒 Privacy pledge

Whisper Local makes the following network calls and no others:

First launch only: downloads the Whisper model from huggingface.co into your local cache.
GPU onboarding (opt-in): if you accept the GPU setup prompt, pip install pulls CUDA / ROCm runtime packages from PyPI / repo.radeon.com.

After setup, zero network traffic. Confirm by running whisper-local --doctor and inspecting the source — every network entry point lives in onboarding.py and is gated behind explicit user prompts.

📦 Tech stack

faster-whisper · ctranslate2 · sounddevice · ten-vad · pyperclip · pystray · ruamel.yaml · playsound3 Windows-only: global-hotkeys · pywin32 · ctypes SendInput macOS-only: pyobjc-framework-Quartz · pyobjc-framework-ApplicationServices

📚 Documentation

docs/troubleshooting.md — symptom → cause → fix table for the most common issues
docs/faq.md — privacy, comparisons (Whisper.cpp / WSR / Wispr Flow / Dragon), model picks, GPU notes
docs/distribution.md — how releases work (standalone .exe, PyPI, winget, Homebrew) and how to ship one
docs/voice-commands.md — the full voice command DSL
docs/gpu-setup.md — manual GPU setup for NVIDIA / AMD
CHANGELOG.md — release notes

Hit a wall? Run whisper-local --doctor or whisper-local --selftest first — they catch 90% of issues.

🤝 Contributing

Contributions of all kinds are welcome — bug fixes, new features, docs improvements, or just opening an issue with a clear reproduction. This project is maintained on a best-effort basis with no SLA; please be patient with response times.

git clone https://github.com/drajb/whisper-local.git
pip install -e .
python -m unittest tests.test_smoke   # smoke suite — should report OK

See CONTRIBUTING.md for the full guide and CODE_OF_CONDUCT.md for community standards. By contributing you agree your code will be MIT licensed. Found a security issue? See SECURITY.md — please don't open a public issue.

Good first issues are tagged here. The full credit list is in AUTHORS.md.

☕ Support

Whisper Local is free and always will be. If it saves you time or a monthly subscription, consider starring the repo and sharing it with people who'd find it useful — it helps the project grow.

No pressure. Starring the repo and sharing it with people who'd find it useful is just as helpful.

🙏 Credit

Forked from whisper-key-local by Pin Wang — huge thanks to the original work that made this fork possible. The full list of credits, including every open-source library Whisper Local builds on, is in AUTHORS.md.

MIT licensed; original copyright preserved in LICENSE.

⭐ If you find this useful, please star the repo — it helps others discover it.

Maintained by Rohit Burani (@drajb)

Website · GitHub · Discussions · Report a bug · Request a feature

_{Tags: whisper · dictation · speech-to-text · voice-typing · transcription · ai-dictation · local-ai · offline · push-to-talk · voice-recognition · accessibility · faster-whisper · privacy · self-hosted · wispr-flow-alternative · dragon-naturallyspeaking-alternative · otter-alternative · ollama · voice-commands · windows · macos · python}

Name		Name	Last commit message	Last commit date
Latest commit History 516 Commits
.claude		.claude
.github		.github
docs		docs
packaging		packaging
pyapp-build		pyapp-build
src/whisper_key		src/whisper_key
tests		tests
tools		tools
.editorconfig		.editorconfig
.gitignore		.gitignore
.python-version		.python-version
AUTHORS.md		AUTHORS.md
CHANGELOG.md		CHANGELOG.md
CITATION.cff		CITATION.cff
CLAUDE.md		CLAUDE.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
pyproject.toml		pyproject.toml
whisper-local.cmd		whisper-local.cmd
whisper-local.py		whisper-local.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Whisper Local

Free, Open-Source, 100% Offline AI Dictation for Windows & macOS

🌟 Why this exists

✨ Why Whisper Local?

🎯 Features

🚀 Quick Start

Install (Python 3.11–3.13)

Launch

Use it

Verify everything works

🗣️ Voice Commands

⚡ GPU Acceleration

🌐 Local OpenAI-Compatible API

🎛️ Profiles

🪟 Per-app rules

🧹 Optional LLM cleanup

⚙️ Configuration

🛠️ CLI Reference

🏗️ How it works

🔒 Privacy pledge

📦 Tech stack

📚 Documentation

🤝 Contributing

☕ Support

🙏 Credit

⭐ If you find this useful, please star the repo — it helps others discover it.

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Whisper Local

Free, Open-Source, 100% Offline AI Dictation for Windows & macOS

🌟 Why this exists

✨ Why Whisper Local?

🎯 Features

🚀 Quick Start

Install (Python 3.11–3.13)

Launch

Use it

Verify everything works

🗣️ Voice Commands

⚡ GPU Acceleration

🌐 Local OpenAI-Compatible API

🎛️ Profiles

🪟 Per-app rules

🧹 Optional LLM cleanup

⚙️ Configuration

🛠️ CLI Reference

🏗️ How it works

🔒 Privacy pledge

📦 Tech stack

📚 Documentation

🤝 Contributing

☕ Support

🙏 Credit

⭐ If you find this useful, please star the repo — it helps others discover it.

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages