Voice input on the desktop is usually cloud-dependent, subscription-bound, or a chore to set up. Thoth runs speech-to-text entirely on your machine with GPU acceleration. Press a key in any app, speak, and the text lands at your cursor. Nothing leaves the machine. No subscription, no cloud, no internet required.
|
Local, private transcription
|
Press a key, speak, paste
|
|
Smart correction
|
Optional AI enhancement
|
|
History and export
|
Automation and MCP
|
After installing, Thoth checks for updates automatically and installs them in-app.
macOS will block the app the first time you open it because it isn't from the App Store. This is normal and only happens once.
- Open the
.dmgand drag Thoth to Applications - Right-click (or Control-click) the app and choose Open
- Click Open in the dialogue that appears
Alternative: remove the block from Terminal
xattr -dr com.apple.quarantine /Applications/Thoth.app- Download the
.AppImage(or.deb) from the latest release - Make it executable:
chmod +x Thoth_*.AppImage - Run it:
./Thoth_*.AppImage
For GPU-accelerated transcription, install
libvulkan1and your GPU's Vulkan driver; without them Thoth falls back to CPU. See the Troubleshooting guide for Wayland and permission notes.
Once it's running, the Getting Started guide walks you through downloading a model, granting permissions, and your first dictation.
| Layer | Choice | Why |
|---|---|---|
| Framework | Tauri 2.0 | Native performance, small binaries, cross-platform |
| Backend | Rust 2024 | Memory safety, audio performance |
| Frontend | Svelte 5 | Reactive UI with runes |
| Audio | cpal | Cross-platform audio capture |
| Transcription | whisper.cpp | GPU-accelerated; Apple Neural Engine and sherpa-onnx options |
| Database | SQLite | Local persistence with migrations |
| AI | Ollama | Local LLM enhancement (or any OpenAI-compatible endpoint) |
| Control API | axum | Loopback HTTP control surface for automation |
| MCP | rmcp | Bundled MCP server for LLM assistants |
| Guide | What it covers |
|---|---|
| Getting Started | First-run setup: download a model, grant permissions, your first dictation |
| Personal Dictionary | Custom vocabulary and smart name correction (the canonical registry) |
| AI Enhancement Prompts | Writing effective prompts for the optional Ollama post-processing |
| Automation and MCP | The control API and MCP server for driving Thoth from an LLM assistant |
| Troubleshooting | Hotkeys, permissions, paste, GPU, and Wayland gotchas |
| Product docs | Intent, workflows, and design principles |
pnpm install
pnpm tauri dev # Development build
pnpm tauri build # Production buildRequirements
- macOS 14.0+ or Linux
- Rust 1.87+ (2024 edition)
- Node.js 20+
- pnpm
Linux GPU acceleration
whisper.cpp supports GPU acceleration on Linux. Choose the backend that matches your hardware:
| GPU | Feature Flag | Requirements |
|---|---|---|
| NVIDIA | --features cuda |
CUDA Toolkit 12.x, NVIDIA drivers |
| AMD | --features hipblas |
ROCm 6.x |
| Any (vendor-neutral) | --features vulkan |
Vulkan drivers (what the release ships) |
pnpm tauri build -- --no-default-features --features vulkan # what the Linux release ships
pnpm tauri build -- --no-default-features --features cuda # NVIDIA
pnpm tauri build -- --no-default-features --features hipblas # AMDBuilding from source needs the Linux system libraries and the Vulkan toolchain; see docs/development/linux-setup.md for the full dependency list, runtime packages, and display-server notes. If GPU initialisation fails at runtime, Thoth falls back to CPU automatically.
Your voice. Your machine. Nothing else.
Named after the Egyptian god of writing and wisdom, the scribe who faithfully records all that is spoken.
Built on whisper.cpp, Tauri, cpal, and sherpa-onnx. Inspired by MacWhisper, VoiceInk, and Spokenly.

