Wisper

Open-source, local-first voice dictation for your desktop.
Hold a hotkey, speak, release: your words are transcribed on-device with Whisper and typed straight into whatever app you're using.

🌐 Website & downloads

Download · Quick start · Build from source · Troubleshooting · Contributing · License

What it is

Wisper is a free, open alternative to cloud dictation tools like Wispr Flow. Everything runs on your machine: no cloud, no account, no telemetry, no subscription. Your audio never leaves your computer; transcription happens entirely on-device via whisper.cpp, with Metal GPU acceleration on Apple Silicon.

It lives in your system tray as a small floating "pill" and stays out of the way until you press your hotkey.

Why Wisper

| | Wisper | Typical cloud dictation |
|---|---|---|
| Where audio goes | Stays on your device | Uploaded to a server |
| Account required | No | Usually yes |
| Cost | Free & open source (MIT) | Subscription |
| Works offline | Yes | No |
| Telemetry | None | Common |
| Languages | 99 (Whisper) | Varies |
| Customizable | Source is yours | Closed |

✨ Features

  • 🎙️ Push-to-talk & hands-free: hold the hotkey to dictate while held, or double-tap to keep recording without holding; a later single press stops and inserts.
  • ⌨️ Any hotkey you want: bind a combo (Ctrl+Shift+Z) or a single modifier on its own (just Option, Ctrl, or Shift, push-to-talk style).
  • 🔒 100% local: on-device Whisper inference; nothing is ever sent anywhere.
  • ⚡ GPU-accelerated: Metal on Apple Silicon for near-instant transcription.
  • 🌍 99 transcription languages: pick one or let Whisper auto-detect.
  • 🗣️ Localized interface: UI, tray menu, and error messages in 15 languages (English, Português, Español, Français, Deutsch, Italiano, Nederlands, Русский, Polski, Türkçe, 日本語, 한국어, 中文, العربية, हिन्दी).
  • 📦 Built-in model manager: download, switch, and remove Whisper models from inside the app, with live progress and cancel.
  • 📖 Custom dictionary: teach it names and jargon so they're spelled right.
  • 🔁 Text replacements: auto-rewrite snippets in every transcript (e.g. omw → on my way).
  • 📊 Insights & history: a local, honest log of what you dictated with words-per-minute and daily stats. Nothing fabricated, nothing uploaded.
  • 🖱️ Two injection modes: synthetic keystrokes or clipboard paste.
  • 🔊 Dictation sounds & music ducking: optional start/stop cues; optionally mute playing music while you talk.
  • 🪟 Tray-native: closing the window keeps it running; quit from the tray.
  • 🚀 Launch at login and optional menu-bar-only (hide the Dock icon).
  • 🌗 Light & dark theme.
  • 🔄 Automatic updates: signed over-the-air updates via GitHub Releases.
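
The words-per-minute figure mentioned under Insights can be illustrated with a small calculation. This is a minimal sketch, assuming WPM is the whitespace-separated word count divided by the recording length in minutes; the function name `words_per_minute` is hypothetical and Wisper's actual formula may differ.

```rust
use std::time::Duration;

/// Words per minute for one dictation: the whitespace-separated word count
/// divided by the recording length in minutes.
/// Returns `None` for a zero-length recording to avoid dividing by zero.
/// (Hypothetical helper; not taken from the Wisper source.)
fn words_per_minute(transcript: &str, recorded: Duration) -> Option<f64> {
    let secs = recorded.as_secs_f64();
    if secs <= 0.0 {
        return None;
    }
    let words = transcript.split_whitespace().count() as f64;
    Some(words * 60.0 / secs)
}
```

For example, six words dictated over three seconds work out to 120 words per minute.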

🚀 Quick start

  1. Download and install for your OS.
  2. Launch it and complete the short onboarding.
  3. Grant Microphone (and on macOS, Accessibility) permission; see Permissions.
  4. Download a model (start with base) and pick your language.
  5. Anywhere you can type, hold your hotkey, speak, and release: your words appear in the focused app.

📥 Install

Grab the installer for your platform from the latest release:

| Platform | Asset |
|---|---|
| macOS (Apple Silicon / Intel) | .dmg |
| Windows | .msi / .exe |
| Linux | .AppImage / .deb |

Note

macOS builds are ad-hoc signed (no paid Developer ID yet). On first launch, right-click the app → Open, or allow it under System Settings → Privacy & Security. After that it opens normally and updates itself.

🔐 Permissions

Wisper needs OS-level permissions to hear you and to type for you:

| Permission | Why | Where |
|---|---|---|
| Microphone | Capture your voice | OS prompt on first recording (macOS, Windows, Linux) |
| Accessibility (macOS) | Insert text into other apps and read the global hotkey | System Settings → Privacy & Security → Accessibility |

On macOS, after an app update the system can occasionally drop the Accessibility grant. Wisper detects this and re-prompts; if text stops inserting after an update, re-enable it under Accessibility (see Troubleshooting).

🧭 Usage

  1. Open Settings from the tray icon.
  2. Download a model: start with base for a good size/quality balance; use small or large-v3-turbo for higher accuracy, especially in non-English.
  3. Choose your language (or Detect automatically) and set your hotkey (a combo or a single modifier).
  4. Dictate:
    • Push-to-talk: hold the hotkey, speak, release.
    • Hands-free: double-tap the hotkey to start, single-press to stop.
  5. The floating pill shows recording state and a quick language switcher. Closing the main window keeps Wisper running in the tray.
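
The two dictation gestures can be modelled as a small state machine. The sketch below is illustrative only: the type names, the `Discard` action for a lone short tap, and both timing thresholds (200 ms for a tap, 350 ms between taps) are assumptions, not values from the Wisper source.

```rust
/// What the recorder should do in response to a hotkey event.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Action {
    Ignore,
    StartRecording,
    StopAndInsert,
    /// The key was only tapped: drop the audio and wait for a second tap.
    Discard,
}

#[derive(Debug, PartialEq, Clone, Copy)]
enum Mode {
    Idle,
    /// Push-to-talk: recording lasts while the key is held.
    Held { pressed_at_ms: u64 },
    /// Hands-free: recording continues until the next press.
    HandsFree,
}

struct Hotkey {
    mode: Mode,
    last_tap_ms: Option<u64>,
    tap_max_ms: u64,    // assumed: longest press still counted as a tap
    double_tap_ms: u64, // assumed: longest gap between the two taps
}

impl Hotkey {
    fn new() -> Self {
        Hotkey { mode: Mode::Idle, last_tap_ms: None, tap_max_ms: 200, double_tap_ms: 350 }
    }

    fn press(&mut self, now_ms: u64) -> Action {
        match self.mode {
            Mode::Idle => {
                let window = self.double_tap_ms;
                // A press shortly after a tap is the second half of a double-tap.
                let second_tap = self
                    .last_tap_ms
                    .take()
                    .map_or(false, |t| now_ms.saturating_sub(t) <= window);
                self.mode = if second_tap {
                    Mode::HandsFree
                } else {
                    Mode::Held { pressed_at_ms: now_ms }
                };
                Action::StartRecording
            }
            Mode::HandsFree => {
                // "A later single press stops and inserts."
                self.mode = Mode::Idle;
                Action::StopAndInsert
            }
            Mode::Held { .. } => Action::Ignore, // key auto-repeat
        }
    }

    fn release(&mut self, now_ms: u64) -> Action {
        match self.mode {
            Mode::Held { pressed_at_ms } => {
                self.mode = Mode::Idle;
                if now_ms.saturating_sub(pressed_at_ms) <= self.tap_max_ms {
                    self.last_tap_ms = Some(now_ms);
                    Action::Discard
                } else {
                    Action::StopAndInsert
                }
            }
            // Releases in hands-free or idle mode carry no meaning.
            _ => Action::Ignore,
        }
    }
}
```

Timestamps are plain milliseconds so the logic can be exercised without a real clock; a long hold yields `StartRecording` then `StopAndInsert`, while tap, tap, and a later press yield hands-free recording that ends on that later press.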

⚙️ Configuration

Everything is in Settings, persisted to a local config.toml:

| Setting | What it does |
|---|---|
| Hotkey | Click to capture any combo, or a lone modifier (Option/Ctrl/Shift). |
| Model | Active Whisper model; manage downloads here. |
| Language | Transcription language, or auto-detect. |
| Interface language | UI, tray, and error-message language (15 options). |
| Microphone | Input device, or system default. |
| Injection mode | type (synthetic keystrokes) or paste (clipboard). |
| Dictionary | Bias words so names/jargon transcribe correctly. |
| Replacements | from → to rewrites applied to every transcript. |
| Dictation sounds | Start/stop audio cues. |
| Mute music | Duck other audio while recording. |
| Show pill | Keep the floating pill on screen, or hide until dictating. |
| Show in Dock | Toggle Dock icon vs. menu-bar-only (macOS). |
| Launch at login | Start Wisper automatically. |
| Theme | Light or dark. |
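
To make the Replacements setting concrete, here is a minimal sketch of a from → to rewrite pass. It assumes whole-word, case-insensitive matching with lowercase keys, and it collapses runs of whitespace to single spaces; the function name `apply_replacements` is hypothetical and Wisper's real matching rules may differ.

```rust
use std::collections::HashMap;

/// Rewrite every whole word that has a rule, keeping trailing punctuation.
/// Keys in `rules` are expected to be lowercase.
/// (Hypothetical helper; not taken from the Wisper source.)
fn apply_replacements(transcript: &str, rules: &HashMap<String, String>) -> String {
    transcript
        .split_whitespace()
        .map(|token| {
            // Separate trailing punctuation so "omw," still matches "omw".
            let core = token.trim_end_matches(|c: char| !c.is_alphanumeric());
            let tail = &token[core.len()..];
            match rules.get(&core.to_lowercase()) {
                Some(to) => format!("{}{}", to, tail),
                None => token.to_string(),
            }
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```

With the rule omw → on my way, the transcript "omw, see you soon" becomes "on my way, see you soon". Matching whole words rather than substrings keeps a rule from firing inside longer words.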

🧠 Models

Whisper's multilingual models transcribe all languages from a single file; the .en variants are English-only but a little faster. Larger models are more accurate but slower.

| Model | Size | Languages |
|---|---|---|
| tiny / tiny.en | ~75 MB | all / English |
| base / base.en | ~142 MB | all / English |
| small / small.en | ~466 MB | all / English |
| medium / medium.en | ~1.5 GB | all / English |
| large-v3-turbo | ~1.6 GB | all (near-large accuracy, fast) |

Models are downloaded on demand from inside the app and stored locally; you can remove them anytime to reclaim space.
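
As a rough illustration of the size trade-off, the sketch below picks the largest listed model that fits a disk budget. The sizes are the approximate figures from the table (with 1.5 GB and 1.6 GB rounded to 1500 MB and 1600 MB), and both `MODELS` and `largest_model_within` are hypothetical names rather than part of Wisper. Size is only a proxy for accuracy here.

```rust
/// Approximate download sizes in MB for the multilingual models.
const MODELS: [(&str, u32); 5] = [
    ("tiny", 75),
    ("base", 142),
    ("small", 466),
    ("medium", 1500),
    ("large-v3-turbo", 1600),
];

/// Largest listed model whose download fits in `budget_mb`, if any.
fn largest_model_within(budget_mb: u32) -> Option<&'static str> {
    MODELS
        .iter()
        .filter(|(_, size)| *size <= budget_mb)
        .max_by_key(|(_, size)| *size)
        .map(|(name, _)| *name)
}
```

A 500 MB budget selects small, while a budget under 75 MB selects nothing.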

🌍 Languages

  • Transcription: 99 languages supported by Whisper, plus automatic detection.
  • Interface: English, Português, Español, Français, Deutsch, Italiano, Nederlands, Русский, Polski, Türkçe, 日本語, 한국어, 中文, العربية, हिन्दी. Applied to the window UI, the tray menu, and error toasts.

🔧 How it works

   ┌──────────┐   hold/double-tap   ┌─────────────┐   PCM audio   ┌──────────────┐
   │  Hotkey  │ ──────────────────▶ │   Recorder  │ ────────────▶ │ whisper.cpp  │
   │ (global) │                     │   (cpal)    │               │  (on-device) │
   └──────────┘                     └─────────────┘               └──────┬───────┘
                                                                         │ text
                                          ┌──────────────┐   keystrokes  │
   focused app  ◀───────────────────────  │  Injector    │ ◀─────────────┘
                                          │ (type/paste) │
                                          └──────────────┘

A global shortcut (or single-modifier event tap) starts capture, audio is recorded with cpal, transcribed locally by whisper.cpp, run through your dictionary/replacements, and injected into the focused app as keystrokes or a clipboard paste. The floating pill is a non-activating overlay so it never steals focus from the app you're typing into.
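
The stage ordering described above can be sketched with two small traits and test doubles. This is not Wisper's code: the trait names, the `dictate` function, and the choice of `f32` samples are assumptions for illustration, and the real stages call cpal, whisper.cpp, and OS injection APIs.

```rust
/// How text reaches the focused app (the two modes named in Settings).
#[derive(Debug, Clone, Copy, PartialEq)]
enum InjectionMode {
    Type,
    Paste,
}

/// Stand-in for the speech-to-text stage (whisper.cpp in the real app).
trait Transcriber {
    fn transcribe(&self, pcm: &[f32]) -> String;
}

/// Stand-in for the injection stage.
trait Injector {
    fn inject(&mut self, text: &str, mode: InjectionMode);
}

/// One dictation pass: transcribe, post-process, inject.
/// Returns the injected text, or `None` when nothing was recognized.
fn dictate<T: Transcriber, I: Injector>(
    pcm: &[f32],
    stt: &T,
    post_process: impl Fn(&str) -> String,
    injector: &mut I,
    mode: InjectionMode,
) -> Option<String> {
    let raw = stt.transcribe(pcm);
    // Dictionary and replacement rules run on the text before injection.
    let text = post_process(raw.trim());
    if text.is_empty() {
        return None;
    }
    injector.inject(&text, mode);
    Some(text)
}

/// Test double that always "hears" the same phrase.
struct FixedTranscriber(&'static str);

impl Transcriber for FixedTranscriber {
    fn transcribe(&self, _pcm: &[f32]) -> String {
        self.0.to_string()
    }
}

/// Test double that records what would have been typed or pasted.
#[derive(Default)]
struct RecordingInjector {
    sent: Vec<(String, InjectionMode)>,
}

impl Injector for RecordingInjector {
    fn inject(&mut self, text: &str, mode: InjectionMode) {
        self.sent.push((text.to_string(), mode));
    }
}
```

Keeping each stage behind a trait lets the flow be tested without a microphone or a model file, which is the main reason for this shape.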

🛡️ Privacy

  • Audio is processed entirely on your device and is not stored after transcription.
  • No account, no telemetry, no network calls for transcription.
  • The only network activity is downloading models you ask for and checking for app updates from GitHub Releases.
  • History and insights are kept locally and never leave your machine.

πŸ—οΈ Build from source

Prerequisites: Rust, Node.js, pnpm, and the Tauri system dependencies for your OS.

git clone https://github.com/99labdev/wisper.chat.git
cd wisper.chat
pnpm install
pnpm tauri dev      # run in development
pnpm tauri build    # produce a release bundle for your platform

Checks

# Frontend
pnpm lint           # ESLint
pnpm typecheck      # tsc --noEmit
pnpm test           # Vitest
pnpm format:check   # Prettier

# Backend (Rust)
cd src-tauri
cargo test
cargo clippy -- -D warnings
cargo fmt --check

📁 Project layout

wisper/
├── src/                 # React + TypeScript settings UI
│   ├── routes/          # Home, Settings, Insights, Dictionary, Snippets, Overlay…
│   └── lib/             # api bindings, i18n, theme, hotkey helpers
├── src-tauri/           # Rust backend
│   └── src/
│       ├── audio.rs     # microphone capture (cpal)
│       ├── stt.rs       # Whisper transcription (whisper.cpp)
│       ├── inject.rs    # text injection (keystrokes / paste)
│       ├── modtap.rs    # single-modifier hotkey (macOS event tap)
│       ├── uitext.rs    # localized tray + error strings
│       ├── overlay.rs   # pill geometry & hit-testing
│       └── lib.rs       # app wiring, tray, shortcuts, windows
├── site/                # marketing website (GitHub Pages)
└── .github/workflows/   # CI + cross-platform release

🧩 Tech stack

  • Tauri 2: native shell, tray, global shortcuts, OTA updater
  • React + TypeScript + Vite: settings UI
  • Rust: audio capture, Whisper inference, text injection, hotkey handling
  • whisper.cpp (via whisper-rs): on-device speech-to-text, Metal-accelerated on macOS
  • cpal: cross-platform audio capture

🔄 Updates & releases

Wisper updates itself: it checks GitHub Releases and applies signed over-the-air updates in the background.

For maintainers, pushing a v* tag triggers the release workflow, which builds and publishes installers for macOS, Linux, and Windows plus the updater manifest:

git tag v1.0.0
git push origin v1.0.0
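
For illustration, a release tag of this shape can be parsed and compared as follows. This is a sketch under assumptions: it accepts only strict vMAJOR.MINOR.PATCH tags, which is narrower than the v* pattern that triggers the workflow, and `parse_release_tag` and `is_newer` are hypothetical helpers rather than part of Wisper or the Tauri updater.

```rust
/// Parse a release tag such as "v1.0.0" into (major, minor, patch).
/// Returns `None` for anything that is not exactly vMAJOR.MINOR.PATCH.
fn parse_release_tag(tag: &str) -> Option<(u32, u32, u32)> {
    let mut parts = tag.strip_prefix('v')?.split('.');
    let major: u32 = parts.next()?.parse().ok()?;
    let minor: u32 = parts.next()?.parse().ok()?;
    let patch: u32 = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // extra components such as "v1.0.0.1"
    }
    Some((major, minor, patch))
}

/// True when `latest` is a strictly newer release than `installed`.
/// Tuples compare lexicographically: major first, then minor, then patch.
fn is_newer(latest: &str, installed: &str) -> bool {
    match (parse_release_tag(latest), parse_release_tag(installed)) {
        (Some(a), Some(b)) => a > b,
        _ => false,
    }
}
```

Under these rules v1.2.0 is newer than v1.1.9, and a tag without the leading v is rejected.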

🩺 Troubleshooting

It records but no text is inserted

Grant Accessibility permission (macOS: System Settings → Privacy & Security → Accessibility) so Wisper can type into other apps. After a macOS update the grant can reset; if it does, toggle Wisper off and on in that list. As a fallback, switch the injection mode to paste in Settings.

The microphone isn't capturing audio

Allow Microphone access when prompted (or in your OS privacy settings), and make sure the right input device is selected in Settings → Microphone. On macOS, the permission may need to be granted again right after installing or updating.

Transcription is slow

Use a smaller model (base or small), or large-v3-turbo for a good speed/accuracy trade-off. On Apple Silicon, Metal acceleration is enabled automatically.

macOS won't open the app ("unidentified developer")

Right-click the app → Open, then confirm. Builds are ad-hoc signed; this is a one-time step.

🤝 Contributing

Contributions are welcome! Please read CONTRIBUTING.md, keep changes focused, and run the checks above before opening a PR. Bug reports and feature ideas are great as issues.

📄 License

MIT © Hudson Brendon

🙏 Acknowledgements

  • whisper.cpp and OpenAI's Whisper for on-device speech recognition
  • Tauri for the native cross-platform shell
  • Everyone who tests Wisper and files issues
