Open-source, local-first voice dictation for your desktop.
Hold a hotkey, speak, release: your words are transcribed on-device
with Whisper and typed straight into whatever app you're using.
Download · Quick start · Build from source · Troubleshooting · Contributing · License
- What it is
- Why Wisper
- Features
- Quick start
- Install
- Permissions
- Usage
- Configuration
- Models
- Languages
- How it works
- Privacy
- Build from source
- Project layout
- Tech stack
- Updates & releases
- Troubleshooting
- Contributing
- License
- Acknowledgements
Wisper is a free, open alternative to cloud dictation tools like Wispr Flow. Everything runs on your machine: no cloud, no account, no telemetry, no subscription. Your audio never leaves your computer; transcription happens entirely on-device via whisper.cpp, with Metal GPU acceleration on Apple Silicon.
It lives in your system tray as a small floating "pill" and stays out of the way until you press your hotkey.
| | Wisper | Typical cloud dictation |
|---|---|---|
| Where audio goes | Stays on your device | Uploaded to a server |
| Account required | No | Usually yes |
| Cost | Free & open source (MIT) | Subscription |
| Works offline | Yes | No |
| Telemetry | None | Common |
| Languages | 99 (Whisper) | Varies |
| Customizable | Source is yours | Closed |
- Push-to-talk & hands-free: hold the hotkey to dictate while it is held, or double-tap to keep recording without holding; a later single press stops and inserts.
- Any hotkey you want: bind a combo (Ctrl+Shift+Z) or a single modifier on its own (just Option, Ctrl, or Shift, push-to-talk style).
- 100% local: on-device Whisper inference; your audio and transcripts are never sent anywhere.
- GPU-accelerated: Metal on Apple Silicon for near-instant transcription.
- 99 transcription languages: pick one or let Whisper auto-detect.
- Localized interface: UI, tray menu, and error messages in 15 languages (English, Português, Español, Français, Deutsch, Italiano, Nederlands, Русский, Polski, Türkçe, 日本語, 한국어, 中文, العربية, हिन्दी).
- Built-in model manager: download, switch, and remove Whisper models from inside the app, with live progress and cancel.
- Custom dictionary: teach it names and jargon so they're spelled right.
- Text replacements: auto-rewrite snippets in every transcript (e.g. omw → on my way).
- Insights & history: a local, honest log of what you dictated, with words-per-minute and daily stats. Nothing fabricated, nothing uploaded.
- Two injection modes: synthetic keystrokes or clipboard paste.
- Dictation sounds & music ducking: optional start/stop cues; optionally mute playing music while you talk.
- Tray-native: closing the window keeps it running; quit from the tray.
- Launch at login and optional menu-bar-only mode (hide the Dock icon).
- Light & dark theme.
- Automatic updates: signed over-the-air updates via GitHub Releases.
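The words-per-minute figure mentioned under Insights & history can be illustrated with a small calculation. This is only a sketch: the function name `words_per_minute` is hypothetical, and it assumes words are counted by splitting on whitespace, which may not match how Wisper counts them.

```rust
/// Words per minute for a single dictation.
/// Assumption: a "word" is any whitespace-separated token.
/// Returns None when the duration is zero or negative.
fn words_per_minute(transcript: &str, duration_secs: f64) -> Option<f64> {
    if duration_secs <= 0.0 {
        return None;
    }
    let words = transcript.split_whitespace().count() as f64;
    // words / minutes, written as words * 60 / seconds.
    Some(words * 60.0 / duration_secs)
}
```

For example, six words spoken in three seconds works out to 120 words per minute.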
- Download and install for your OS.
- Launch it and complete the short onboarding.
- Grant Microphone (and on macOS, Accessibility) permission; see Permissions.
- Download a model (start with base) and pick your language.
- Anywhere you can type, hold your hotkey, speak, release: your words appear in the focused app.
Grab the installer for your platform from the latest release:
| Platform | Asset |
|---|---|
| macOS (Apple Silicon / Intel) | .dmg |
| Windows | .msi / .exe |
| Linux | .AppImage / .deb |
Note: macOS builds are ad-hoc signed (no paid Developer ID yet). On first launch, right-click the app → Open, or allow it under System Settings → Privacy & Security. After that it opens normally and updates itself.
Wisper needs OS-level permissions to hear you and to type for you:
| Permission | Why | Where |
|---|---|---|
| Microphone | Capture your voice | macOS/Win/Linux prompt on first record |
| Accessibility (macOS) | Insert text into other apps and read the global hotkey | System Settings → Privacy & Security → Accessibility |
On macOS, after an app update the system can occasionally drop the Accessibility grant. Wisper detects this and re-prompts; if text stops inserting after an update, re-enable it under Accessibility (see Troubleshooting).
- Open Settings from the tray icon.
- Download a model: start with base for a good size/quality balance; use small or large-v3-turbo for higher accuracy, especially in non-English languages.
- Choose your language (or Detect automatically) and set your hotkey (a combo or a single modifier).
- Dictate:
  - Push-to-talk: hold the hotkey, speak, release.
  - Hands-free: double-tap the hotkey to start, single-press to stop.
- The floating pill shows recording state and a quick language switcher. Closing the main window keeps Wisper running in the tray.
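The two dictation modes can be pictured as a small state machine over press and release events. The sketch below is not Wisper's actual hotkey code: the type names, the two timing thresholds, and the choice to discard the very short recording made by the first tap are all assumptions made for illustration.

```rust
/// Hypothetical thresholds in milliseconds; Wisper's real values are not documented here.
const TAP_MAX_MS: u64 = 250;
const DOUBLE_TAP_WINDOW_MS: u64 = 400;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Event {
    Press(u64),   // timestamp in ms
    Release(u64), // timestamp in ms
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Action {
    None,
    StartRecording,
    StopAndInsert,
    DiscardRecording,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Idle,
    Holding { since: u64 },
    TapPending { released_at: u64 },
    HandsFree { key_down: bool },
}

struct Hotkey {
    state: State,
}

impl Hotkey {
    fn new() -> Self {
        Hotkey { state: State::Idle }
    }

    /// Feeds one key event in and returns what the recorder should do.
    fn handle(&mut self, event: Event) -> Action {
        let (next, action) = match (self.state, event) {
            // Any press from idle starts recording (push-to-talk until proven otherwise).
            (State::Idle, Event::Press(t)) => (State::Holding { since: t }, Action::StartRecording),
            // A quick release is a tap: drop the audio and wait for a second tap.
            (State::Holding { since }, Event::Release(t)) => {
                if t.saturating_sub(since) <= TAP_MAX_MS {
                    (State::TapPending { released_at: t }, Action::DiscardRecording)
                } else {
                    (State::Idle, Action::StopAndInsert)
                }
            }
            // A second press inside the window switches to hands-free.
            (State::TapPending { released_at }, Event::Press(t)) => {
                if t.saturating_sub(released_at) <= DOUBLE_TAP_WINDOW_MS {
                    (State::HandsFree { key_down: true }, Action::StartRecording)
                } else {
                    (State::Holding { since: t }, Action::StartRecording)
                }
            }
            // Releasing the second tap keeps recording.
            (State::HandsFree { key_down: true }, Event::Release(_)) => {
                (State::HandsFree { key_down: false }, Action::None)
            }
            // A later single press stops and inserts.
            (State::HandsFree { key_down: false }, Event::Press(_)) => {
                (State::Idle, Action::StopAndInsert)
            }
            // Everything else (e.g. the release after stopping) is ignored.
            (state, _) => (state, Action::None),
        };
        self.state = next;
        action
    }
}
```

Modelling the hotkey as explicit states keeps the push-to-talk and hands-free paths from interfering with each other, and makes the timing rules easy to test without a real keyboard.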
Everything is in Settings, persisted to a local config.toml:
| Setting | What it does |
|---|---|
| Hotkey | Click to capture any combo, or a lone modifier (Option/Ctrl/Shift). |
| Model | Active Whisper model; manage downloads here. |
| Language | Transcription language, or auto-detect. |
| Interface language | UI, tray, and error-message language (15 options). |
| Microphone | Input device, or system default. |
| Injection mode | type (synthetic keystrokes) or paste (clipboard). |
| Dictionary | Bias words so names/jargon transcribe correctly. |
| Replacements | from → to rewrites applied to every transcript. |
| Dictation sounds | Start/stop audio cues. |
| Mute music | Duck other audio while recording. |
| Show pill | Keep the floating pill on screen, or hide until dictating. |
| Show in Dock | Toggle Dock icon vs. menu-bar-only (macOS). |
| Launch at login | Start Wisper automatically. |
| Theme | Light or dark. |
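To make the Replacements setting concrete, here is a minimal sketch of a from → to rewrite pass. It is not taken from Wisper's source: the function name `apply_replacements` is hypothetical, and it assumes whole-word, case-insensitive matching with rule keys stored in lowercase. The real matching rules may differ.

```rust
use std::collections::HashMap;

/// Rewrites whole words according to `rules` (keys must be lowercase).
/// Assumptions: matching is case-insensitive and per whitespace-separated token;
/// runs of whitespace in the input collapse to single spaces in the output.
fn apply_replacements(transcript: &str, rules: &HashMap<String, String>) -> String {
    transcript
        .split_whitespace()
        .map(|token| {
            // Split off trailing punctuation ("omw," -> "omw" + ",") so it survives the rewrite.
            let core = token.trim_end_matches(|c: char| !c.is_alphanumeric());
            let tail = &token[core.len()..];
            match rules.get(&core.to_lowercase()) {
                Some(to) => format!("{}{}", to, tail),
                None => token.to_string(),
            }
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```

Matching whole tokens rather than substrings avoids rewriting text inside longer words, which is usually what a snippet-style rule intends.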
Whisper's multilingual models transcribe all languages from a single file; the
.en variants are English-only but a little faster. Bigger = more accurate and
slower.
| Model | Size | Languages |
|---|---|---|
| tiny / tiny.en | ~75 MB | all / English |
| base / base.en | ~142 MB | all / English |
| small / small.en | ~466 MB | all / English |
| medium / medium.en | ~1.5 GB | all / English |
| large-v3-turbo | ~1.6 GB | all (near-large accuracy, fast) |
Models are downloaded on demand from inside the app and stored locally; you can remove them anytime to reclaim space.
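As a rough aid for choosing a model by download size, the sketch below picks the largest multilingual model that fits a given budget. The sizes are the approximate figures listed above (1.5 GB and 1.6 GB written as 1500 and 1600 MB); the function name `largest_model_within` is hypothetical and nothing like it is claimed to exist in the app. Size is only one factor, since larger models are also slower.

```rust
/// Approximate download sizes in MB for the multilingual models.
const MODELS: [(&str, u32); 5] = [
    ("tiny", 75),
    ("base", 142),
    ("small", 466),
    ("medium", 1500),
    ("large-v3-turbo", 1600),
];

/// Returns the largest model whose download fits in `budget_mb`, if any.
fn largest_model_within(budget_mb: u32) -> Option<&'static str> {
    MODELS
        .iter()
        .filter(|(_, size)| *size <= budget_mb)
        .max_by_key(|(_, size)| *size)
        .map(|(name, _)| *name)
}
```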
- Transcription: 99 languages supported by Whisper, plus automatic detection.
- Interface: English, Português, Español, Français, Deutsch, Italiano, Nederlands, Русский, Polski, Türkçe, 日本語, 한국어, 中文, العربية, हिन्दी. Applies to the window UI, the tray menu, and error toasts.
    ┌──────────┐   hold/double-tap   ┌─────────────┐  PCM audio   ┌──────────────┐
    │  Hotkey  │ ──────────────────▶ │  Recorder   │ ───────────▶ │ whisper.cpp  │
    │ (global) │                     │   (cpal)    │              │ (on-device)  │
    └──────────┘                     └─────────────┘              └──────┬───────┘
                                                                         │ text
                                     ┌──────────────┐  keystrokes        │
    focused app ◀─────────────────── │   Injector   │ ◀──────────────────┘
                                     │ (type/paste) │
                                     └──────────────┘
A global shortcut (or single-modifier event tap) starts capture, audio is
recorded with cpal, transcribed locally by whisper.cpp, run through your
dictionary/replacements, and injected into the focused app as keystrokes or a
clipboard paste. The floating pill is a non-activating overlay so it never steals
focus from the app you're typing into.
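The flow described above can be sketched as a chain of three stages: transcribe, post-process, inject. The traits and stub types below are hypothetical seams invented for this example, not Wisper's real modules; the stubs stand in for whisper.cpp and the OS-level injector so the sketch runs without a microphone or a model.

```rust
/// How the final text reaches the focused app (mirrors the "Injection mode" setting).
#[derive(Debug, Clone, Copy, PartialEq)]
enum InjectionMode {
    Type,
    Paste,
}

/// Hypothetical seam for the speech-to-text stage.
trait Transcriber {
    fn transcribe(&self, pcm: &[f32]) -> String;
}

/// Hypothetical seam for the injection stage.
trait Injector {
    fn inject(&mut self, text: &str, mode: InjectionMode);
}

/// Runs one dictation: transcribe, post-process, inject.
/// Returns false when there was no audio or no text to insert.
fn dictate(
    pcm: &[f32],
    stt: &dyn Transcriber,
    post_process: &dyn Fn(&str) -> String,
    out: &mut dyn Injector,
    mode: InjectionMode,
) -> bool {
    if pcm.is_empty() {
        return false;
    }
    let raw = stt.transcribe(pcm);
    // Dictionary and replacement rules would be applied in this step.
    let text = post_process(raw.trim());
    if text.is_empty() {
        return false;
    }
    out.inject(&text, mode);
    true
}

/// Stub transcriber that always returns the same text.
struct FixedTranscriber(&'static str);

impl Transcriber for FixedTranscriber {
    fn transcribe(&self, _pcm: &[f32]) -> String {
        self.0.to_string()
    }
}

/// Stub injector that records what it was asked to insert.
#[derive(Default)]
struct RecordingInjector {
    sent: Vec<(String, InjectionMode)>,
}

impl Injector for RecordingInjector {
    fn inject(&mut self, text: &str, mode: InjectionMode) {
        self.sent.push((text.to_string(), mode));
    }
}
```

Keeping each stage behind a small interface like this lets the post-processing rules be tested on their own, separate from audio capture and OS-specific text injection.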
- Audio is processed entirely on your device and is not stored after transcription.
- No account, no telemetry, no network calls for transcription.
- The only network activity is downloading models you ask for and checking for app updates from GitHub Releases.
- History and insights are kept locally and never leave your machine.
Prerequisites: Rust, Node.js, pnpm, and the Tauri system dependencies for your OS.
    git clone https://github.com/99labdev/wisper.chat.git
    cd wisper.chat
    pnpm install
    pnpm tauri dev      # run in development
    pnpm tauri build    # produce a release bundle for your platform

Checks to run before opening a pull request:

    # Frontend
    pnpm lint           # ESLint
    pnpm typecheck      # tsc --noEmit
    pnpm test           # Vitest
    pnpm format:check   # Prettier

    # Backend (Rust)
    cd src-tauri
    cargo test
    cargo clippy -- -D warnings
    cargo fmt --check

The repository is laid out as follows:

    wisper/
    ├── src/                  # React + TypeScript settings UI
    │   ├── routes/           # Home, Settings, Insights, Dictionary, Snippets, Overlay…
    │   └── lib/              # api bindings, i18n, theme, hotkey helpers
    ├── src-tauri/            # Rust backend
    │   └── src/
    │       ├── audio.rs      # microphone capture (cpal)
    │       ├── stt.rs        # Whisper transcription (whisper.cpp)
    │       ├── inject.rs     # text injection (keystrokes / paste)
    │       ├── modtap.rs     # single-modifier hotkey (macOS event tap)
    │       ├── uitext.rs     # localized tray + error strings
    │       ├── overlay.rs    # pill geometry & hit-testing
    │       └── lib.rs        # app wiring, tray, shortcuts, windows
    ├── site/                 # marketing website (GitHub Pages)
    └── .github/workflows/    # CI + cross-platform release

- Tauri 2: native shell, tray, global shortcuts, OTA updater
- React + TypeScript + Vite: settings UI
- Rust: audio capture, Whisper inference, text injection, hotkey handling
- whisper.cpp (via whisper-rs): on-device speech-to-text, Metal-accelerated on macOS
- cpal: cross-platform audio capture
Wisper updates itself: it checks GitHub Releases and applies signed over-the-air updates in the background.
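An update check ultimately has to decide whether a published release is newer than the installed version. Wisper delegates this to the Tauri updater; the sketch below does not reproduce that logic and only illustrates the idea for plain `vMAJOR.MINOR.PATCH` tags. Both function names are hypothetical, and pre-release suffixes are deliberately rejected rather than handled.

```rust
/// Parses "v1.2.3" or "1.2.3" into (major, minor, patch).
/// Returns None for anything else, including pre-release tags like "v1.0.0-beta".
fn parse_version(tag: &str) -> Option<(u32, u32, u32)> {
    let mut parts = tag.strip_prefix('v').unwrap_or(tag).split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    let patch = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some((major, minor, patch))
}

/// True when `latest_tag` is a strictly newer version than `installed`.
/// Unparseable input on either side is treated as "no update".
fn is_newer(installed: &str, latest_tag: &str) -> bool {
    match (parse_version(installed), parse_version(latest_tag)) {
        // Tuples compare component by component, so 1.10.0 > 1.9.0.
        (Some(current), Some(latest)) => latest > current,
        _ => false,
    }
}
```

Comparing parsed numbers instead of raw strings matters because a plain string comparison would rank "1.9.0" above "1.10.0".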
For maintainers, pushing a v* tag triggers the release workflow, which builds and publishes installers for macOS, Linux, and Windows plus the updater manifest:

    git tag v1.0.0
    git push origin v1.0.0

It records but no text is inserted
Grant Accessibility permission (macOS: System Settings → Privacy & Security → Accessibility) so Wisper can type into other apps. After a macOS update the grant can reset; if it does, toggle Wisper off and on in that list. As a fallback, switch the injection mode to paste in Settings.
The microphone isn't capturing audio
Allow Microphone access when prompted (or in your OS privacy settings), and make sure the right input device is selected in Settings → Microphone. On macOS, the permission may need to be granted again right after installing or updating.
Transcription is slow
Use a smaller model (base or small), or large-v3-turbo for a good speed/accuracy trade-off. On Apple Silicon, Metal acceleration is enabled automatically.
macOS won't open the app ("unidentified developer")
Right-click the app → Open, then confirm. Builds are ad-hoc signed; this is a one-time step.
Contributions are welcome! Please read CONTRIBUTING.md, keep changes focused, and run the checks above before opening a PR. Bug reports and feature ideas are great as issues.
MIT © Hudson Brendon
- whisper.cpp and OpenAI's Whisper for on-device speech recognition
- Tauri for the native cross-platform shell
- Everyone who tests Wisper and files issues
