PushToType

Hold a hotkey, speak, release — your words appear wherever your cursor is.

PushToType is a local, real-time speech-to-text tool for Linux. It transcribes your voice using a local Whisper model and types the result directly into whatever application has focus — no clipboard, no cloud, no API keys.

An open-source alternative to OpenAI's Whisper Flow, which has no Linux support.

Features

Works everywhere — types into any focused app: browsers, editors, terminals, search bars
Local-only — faster-whisper runs on your GPU (CUDA) with automatic CPU fallback
No cloud — no API keys, no network required after the one-time model download
Fast — ~250ms from hotkey release to text appearing
Configurable — TOML config file, interactive setup wizard, CLI flags
Wayland + X11 — works on both display servers via evdev

Quick Start

# Install
uv add pushtotype        # or: pip install pushtotype

# System dependencies (X11)
sudo apt install libportaudio2 xdotool

# Add yourself to the input group (required for hotkey detection)
sudo usermod -aG input $USER
# Log out and back in for this to take effect

# Run the setup wizard
pushtotype config

# Start
pushtotype

Hold your configured hotkey (default: right Ctrl), speak, release. Text appears at the cursor.

How It Works

[Hold hotkey] → [Record audio] → [Whisper transcription] → [Type into focused app]
     evdev            sounddevice       faster-whisper           xdotool type

PushToType runs as a background daemon. A global hotkey listener (via evdev, reading directly from /dev/input/) fires a recording callback. When you release the hotkey, the audio is sent to faster-whisper for transcription, then xdotool type injects the text into whatever window is focused.

Installation

Recommended: uv

uv tool install pushtotype

pip / pipx

pip install pushtotype
# or
pipx install pushtotype

From source

git clone https://github.com/danielgraviet/pushtotype.git
cd pushtotype
uv pip install -e ".[dev]"

System Requirements

Requirement	Notes
Linux	X11 or Wayland
Python 3.10+
`libportaudio2`	`sudo apt install libportaudio2`
`xdotool`	X11 only — `sudo apt install xdotool`
`wtype` + `wl-clipboard`	Wayland only — `sudo apt install wtype wl-clipboard`
`input` group	`sudo usermod -aG input $USER`
NVIDIA GPU	Recommended for speed — CPU works but is slower

Configuration

Config file lives at ~/.config/pushtotype/config.toml. Run pushtotype config to create it interactively.

[hotkey]
keys = ["KEY_RIGHTCTRL"]

[audio]
device = "default"
sample_rate = 16000

[model]
name = "base.en"
device = "auto"
compute_type = "float16"

[feedback]
enabled = true
volume = 0.5

[output]
method = "auto"   # "auto", "x11", or "wayland"

Config priority (highest to lowest)

CLI flags (e.g. --model small.en)
Environment variables (e.g. PUSHTOTYPE_MODEL=small.en)
Config file (~/.config/pushtotype/config.toml)
Built-in defaults

Environment variables

Variable	Config key
`PUSHTOTYPE_MODEL`	`model.name`
`PUSHTOTYPE_DEVICE`	`model.device`
`PUSHTOTYPE_AUDIO_DEV`	`audio.device`
`PUSHTOTYPE_FEEDBACK`	`feedback.enabled`
`PUSHTOTYPE_HOTKEY`	`hotkey.keys` (comma-separated)

CLI Reference

pushtotype                  Start the push-to-talk daemon
pushtotype config           Run the interactive setup wizard
pushtotype config --show    Print the current effective config
pushtotype devices          List available audio input devices
pushtotype test             Record 5 seconds and transcribe (verify setup)
pushtotype download [MODEL] Pre-download a Whisper model

Global flags:

-v, --verbose     Enable debug logging (shows per-step timings)
-q, --quiet       Suppress all output except errors
--log-file PATH   Write logs to a file
--model NAME      Override model (e.g. small.en)
--hotkey COMBO    Override hotkey (e.g. ctrl+shift+s)
--device INDEX    Override audio device index
--no-feedback     Disable start/stop beeps

Troubleshooting

Permission denied on /dev/input/

You need to be in the input group:

sudo usermod -aG input $USER
# Log out and back in

xdotool not found

sudo apt install xdotool

Text doesn't appear in my terminal

Terminals use Ctrl+Shift+V to paste, but PushToType uses xdotool type which bypasses the clipboard entirely — it should work in all terminals without any special config.

CUDA not available

PushToType automatically falls back to CPU. Transcription will be slower (~1-3s per 5s of audio vs ~0.2s on GPU). Check pushtotype -v startup output to see which device is being used.

Model download fails / slow

Models are cached in ~/.cache/huggingface/hub/ after the first download. Pre-download manually:

pushtotype download base.en

wtype or wl-copy not found (Wayland)

sudo apt install wtype wl-clipboard

Known Limitations

English only (base.en model)
No AMD GPU (ROCm) support
Wayland session detection relies on XDG_SESSION_TYPE or WAYLAND_DISPLAY
No GUI — terminal only

Contributing

See CONTRIBUTING.md. Issues and PRs welcome.

License

MIT

TestPyPI install (development)

uv pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ pushtotype

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
.github		.github
assets		assets
learnings		learnings
memory		memory
src/pushtotype		src/pushtotype
tasks		tasks
tests		tests
.gitignore		.gitignore
.python-version		.python-version
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
claude.md		claude.md
dogfood_test.md		dogfood_test.md
main.py		main.py
project-plan.md		project-plan.md
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PushToType

Features

Quick Start

How It Works

Installation

Recommended: uv

pip / pipx

From source

System Requirements

Configuration

Config priority (highest to lowest)

Environment variables

CLI Reference

Troubleshooting

Known Limitations

Contributing

License

TestPyPI install (development)

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

PushToType

Features

Quick Start

How It Works

Installation

Recommended: uv

pip / pipx

From source

System Requirements

Configuration

Config priority (highest to lowest)

Environment variables

CLI Reference

Troubleshooting

Known Limitations

Contributing

License

TestPyPI install (development)

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages