Releases: missuo/koe

v1.0.16

11 May 09:46
594a400

v1.0.15

11 May 08:39
ff8dc16

What's Changed

  • feat(app): add template workflow and configurable overlay controls by @foru17 in #67
  • fix(app): hide MLX LLM provider on builds without mlx feature by @erning in #68
  • fix(app): hide Apple Speech provider on x86_64 builds by @erning in #69
  • feat(llm): implement multi-profile system for local and remote endpoints by @thedavidweng in #71
  • fix(setup): align MLX provider rows and add contributor avatars by @thedavidweng in #74
  • build(config): move macOS deployment target config to root by @thedavidweng in #73
  • feat(hotkey): add configurable LLM invert modifier for recording by @thedavidweng in #72
  • feat(overlay): text diff animation, auto-dismiss, and settings fixes by @foru17 in #76
  • fix(logging): restore RUST_LOG env var support in init_logging by @thedavidweng in #78
  • fix(audio): guard against invalid inputNode format after fresh mic permission grant by @thedavidweng in #81
  • feat(permissions): add NSAlert-based permission prompts with throttling by @thedavidweng in #80
  • fix(audio): prevent main-thread hang after Bluetooth device route change by @thedavidweng in #82
  • feat(llm): make chat completions path configurable by @imocat in #83
  • feat(setup): add LLM timeout + remote model picker by @imocat in #84
  • fix(asr): avoid duplicating full final transcripts by @fuscoyu in #85
  • Fix audio capture stability and hotkey session handling by @Mctashuo in #95
  • feat(asr): add new Doubao API params and settings UI by @Mctashuo in #96
  • Add GitHub Actions release workflow for macOS arm64, arm64 lite, and x86_64 builds; add tag to trigger release by @Mctashuo in #97

Full Changelog: v1.0.14...v1.0.15

v1.0.14

08 Apr 08:59
9e2639e

What's New

Local MLX LLM Provider for Fully Offline Voice Input

Koe now supports running LLM text correction locally via Apple MLX, enabling fully offline voice input without any cloud API. Thanks to @erning! (#66)

DoubaoIME Free ASR Provider

New ASR provider using Doubao IME with automatic device registration — no API key required, no cloud console setup needed. Doubao IME simulates an Android device registration to obtain credentials automatically, making it a truly zero-cost streaming speech recognition option out of the box.

The underlying koe-asr crate is published as a standalone Rust library. You can use it in any Rust project to add free streaming ASR with just a few lines of code:

[dependencies]
koe-asr = { git = "https://github.com/missuo/koe.git" }

use koe_asr::{AsrConfig, AsrEvent, AsrProvider, DoubaoImeProvider};

let mut asr = DoubaoImeProvider::new();
asr.connect(&AsrConfig::default()).await?;
asr.send_audio(&pcm_data).await?;
asr.finish_input().await?;
// Receive Interim / Definite / Final events...

See the koe-asr README for full documentation and examples for all 6 providers.

CLI: Audio/Video File Transcription

koe-cli now includes a transcribe command that converts any audio or video file to text using DoubaoIME free ASR. It uses ffmpeg to decode the input file and streams PCM data to the ASR provider. The CLI is now bundled inside Koe.app and installed as koe automatically via Homebrew.

# Transcribe an audio/video file
koe transcribe recording.mp3

# Show interim results as they arrive
koe transcribe video.mp4 -i

Requires ffmpeg installed on the system.
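
As a rough illustration of the decode step, the sketch below spawns ffmpeg and asks it to write raw PCM to stdout. The helper names are hypothetical, and the 16 kHz mono 16-bit format is an assumption; the release notes do not state which format koe-cli actually requests.

```rust
use std::process::{Child, Command, Stdio};

/// Build ffmpeg arguments that decode `input` to raw PCM on stdout.
/// Assumption: 16 kHz, mono, signed 16-bit little-endian samples.
fn ffmpeg_pcm_args(input: &str) -> Vec<String> {
    [
        "-nostdin",
        "-loglevel", "error",
        "-i", input,
        "-vn",          // drop any video stream
        "-ac", "1",     // downmix to mono
        "-ar", "16000", // resample to 16 kHz
        "-f", "s16le",  // raw signed 16-bit little-endian output
        "-",            // write to stdout
    ]
    .iter()
    .map(|s| s.to_string())
    .collect()
}

/// Spawn ffmpeg with a piped stdout so the caller can read PCM chunks
/// and forward them to an ASR provider.
fn spawn_decoder(input: &str) -> std::io::Result<Child> {
    Command::new("ffmpeg")
        .args(ffmpeg_pcm_args(input))
        .stdout(Stdio::piped())
        .stderr(Stdio::inherit())
        .spawn()
}
```

A caller would read fixed-size chunks from the child's stdout and pass each one to the provider until ffmpeg exits.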

Koe-Lite Build Variant

Added a lightweight build variant for minimal distribution. Thanks to @erning! (#64)

Bug Fixes

  • Fix random keypresses when quitting after long runtime.
  • Fix earlier segments being lost in DoubaoIME long speech recognition.

Improvements

  • Add koe-asr README with detailed usage examples and API reference.
  • Apply rustfmt formatting across the workspace.
  • Sync README and DESIGN docs with current app behavior.

Install

Homebrew

brew tap owo-network/brew
brew install --cask koe

This installs both Koe.app and the koe CLI command.

Manual Install

  1. Download Koe-macOS-arm64.zip from this release
  2. Unzip and drag Koe.app to /Applications
  3. Launch Koe and let it appear in your menu bar
  4. Open Settings from the menu bar icon to configure ASR and LLM credentials
  5. Optionally symlink the CLI: ln -s /Applications/Koe.app/Contents/MacOS/koe-cli /usr/local/bin/koe

Note: The app is not code-signed. On first launch, right-click → Open, or allow it in System Settings → Privacy & Security.

If macOS still blocks the app, run:

xattr -rd com.apple.quarantine /Applications/Koe.app

v1.0.13

06 Apr 03:04
928318c

What's New

Apple Speech Provider (macOS 26+)

Koe now supports Apple's on-device Speech framework as an ASR provider. It offers zero-config, zero-download speech recognition with system-managed language assets. Audio flows through a Swift ↔ Rust FFI bridge, following the same pattern as MLX. The Setup Wizard includes a language picker and an asset management UI. Thanks to @erning! (#40)

Custom HTTP Headers for ASR WebSocket

Users can now specify custom HTTP headers in config.yaml for Doubao and Qwen ASR providers. When headers is specified, it fully replaces default provider headers, enabling third-party WebSocket endpoints with different authentication schemes. (#61)
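
The replace-rather-than-merge behavior can be sketched as follows. This is an illustration only: the function name and the header names used in any example are hypothetical, not Koe's actual code.

```rust
use std::collections::BTreeMap;

type Headers = BTreeMap<String, String>;

/// Resolve the headers sent with the WebSocket handshake.
/// When the user supplies `headers`, the provider defaults are dropped
/// entirely instead of being merged key by key.
fn resolve_headers(defaults: &Headers, custom: Option<&Headers>) -> Headers {
    match custom {
        Some(user) => user.clone(),
        None => defaults.clone(),
    }
}
```

The practical consequence is that a custom `headers` block has to carry every header the endpoint needs, including any authentication header the defaults would otherwise have provided.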

Configurable Reasoning Control for LLM

A new no_reasoning_control config option offers three modes: reasoning_effort (default, OpenAI o-series), thinking (GLM, Qwen, etc.), and none. This lets users disable reasoning on non-OpenAI models that ignore reasoning_effort and would otherwise waste time on thinking tokens. Thanks to @erning! (#60)
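
One way to picture the three modes is as a mapping from the config value to an extra field in the chat request. The sketch below is hypothetical: the type and function names are invented for illustration, and the `thinking` payload shape is an assumption that should be checked against the target provider's API documentation.

```rust
/// Mirrors the three documented values of `no_reasoning_control`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReasoningControl {
    ReasoningEffort,
    Thinking,
    Omit, // config value "none": send nothing extra
}

fn parse_mode(value: &str) -> Option<ReasoningControl> {
    match value {
        "reasoning_effort" => Some(ReasoningControl::ReasoningEffort),
        "thinking" => Some(ReasoningControl::Thinking),
        "none" => Some(ReasoningControl::Omit),
        _ => None,
    }
}

/// Extra JSON field (name, raw JSON value) to add to the request, if any.
fn extra_field(mode: ReasoningControl) -> Option<(&'static str, &'static str)> {
    match mode {
        ReasoningControl::ReasoningEffort => Some(("reasoning_effort", r#""none""#)),
        // Assumed payload shape; providers differ here.
        ReasoningControl::Thinking => Some(("thinking", r#"{"type":"disabled"}"#)),
        ReasoningControl::Omit => None,
    }
}
```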

LLM Test Connection Parity

The wizard's Test Connection button now uses the exact same Rust code path as runtime LLM correction: same prompts, dictionary, timeout, and temperature. Previously, the button used a separate Obj-C implementation that could pass while runtime correction silently failed. Elapsed time is always reported, even on timeout. Thanks to @erning! (#60)

Bug Fixes

  • Fix repeated accessibility permission prompts; permission menu items are now clickable with direct actions. Thanks to @erning! (#47)
  • Fix clipboard not restoring to empty state after dictation. Thanks to @erning! (#48)
  • Fix state machine race between Rust and ObjC after text delivery, preventing UI flicker. Thanks to @erning! (#50)
  • Fix audio capture error handling: propagate failures, cancel session cleanly, protect error display. Thanks to @erning! (#49)
  • Fix hotkey race window between menu close and quit action. Thanks to @erning! (#62)
  • Fix nested directory handling in model file removal. Thanks to @erning! (#46)
  • Fix spurious empty Interim events in Qwen and Doubao providers. Thanks to @erning! (#55)
  • Fix MLX callback lock and generation check to prevent session crosstalk. Thanks to @erning! (#54)
  • Fix atomic config file writes to prevent corruption on crash. Thanks to @erning! (#53)
  • Fix FeedbackSection serde defaults to match config YAML template. Thanks to @erning! (#52)
  • Fix wizard ASR test URLs to read from config instead of hardcoding. Thanks to @erning! (#59)
  • Redact transcription text from INFO logs for privacy. Thanks to @erning! (#45)
  • Enable system proxy support for model downloads. (#43)
  • Make download cancellation responsive during network issues. (#41, #42)
  • Propagate ASR errors and fail session with partial text. (#33)
  • Match runtime LLM request in wizard test button. (#34)
  • Check config save return values and report failures. (#35)
  • Add generation counter to prevent stale MLX session operations. (#38)
  • Use Rust FFI for resolved hotkey display in status bar. (#37)
  • Surface real ASR errors instead of generic message. (#31)
  • Cache MLX model across sessions to avoid repeated loads. (#30)
  • Skip sha256 check for non-LFS files in manifest pull. (#29)
  • Isolate sessions to prevent old/new interference. (#28)
  • Prevent use-after-free in MLX callback on cancel. (#27)
  • Prevent spurious hotkey triggers after monitor stop on quit.
  • Add missing SystemConfiguration framework linkage.
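
For readers unfamiliar with the atomic config write mentioned in the list above (#53), a common way to do it is to write a temporary file next to the target and then rename it into place. The sketch below shows that general technique under that assumption; it is not taken from Koe's source, and `write_atomic` is a hypothetical name.

```rust
use std::fs::{self, File};
use std::io::Write;
use std::path::Path;

/// Write `contents` to `path` so that readers see either the old file
/// or the complete new file, never a half-written one.
fn write_atomic(path: &Path, contents: &[u8]) -> std::io::Result<()> {
    // Temporary file in the same directory, so the rename stays on one filesystem.
    let tmp = path.with_extension("tmp");
    {
        let mut file = File::create(&tmp)?;
        file.write_all(contents)?;
        file.sync_all()?; // flush data to disk before the rename
    }
    // The rename replaces the destination in a single step.
    fs::rename(&tmp, path)
}
```

If the process crashes before the rename, the original config is untouched and only a stray temporary file is left behind.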

Improvements

  • Unified model status API with sha256 cache and async UI. (#36)
  • Centralize workspace dependencies and metadata for consistent builds. Thanks to @erning! (#51)
  • Replace line-based YAML parser with serde_yaml.
  • Resolve all clippy errors and warnings. (#32)

Install

Homebrew

brew tap owo-network/brew
brew install --cask koe

Manual Install

  1. Download Koe-macOS-arm64.zip from this release
  2. Unzip and drag Koe.app to /Applications
  3. Launch Koe and let it appear in your menu bar
  4. Open Settings from the menu bar icon to configure ASR and LLM credentials

Note: The app is not code-signed. On first launch, right-click → Open, or allow it in System Settings → Privacy & Security.

If macOS still blocks the app, run:

xattr -rd com.apple.quarantine /Applications/Koe.app

v1.0.12

29 Mar 09:47
8797846

What's New

Local ASR Support (MLX + sherpa-onnx)

Koe now supports fully offline, on-device speech recognition via MLX Whisper and sherpa-onnx. No cloud credentials required — models are downloaded on first use and stored locally. Supports a wide range of Whisper model sizes. Thanks to @erning for this feature! (#19)

LLM Connection Warmup

Koe now pre-warms the HTTP connection to your LLM provider at the start of each session, hiding TCP/TLS handshake latency behind ASR processing time. This reduces the delay before corrected text appears, especially noticeable with remote providers. Thanks to @hyspace! (#25, #26)

Install

Homebrew

brew tap owo-network/brew
brew install --cask koe

Manual Install

  1. Download Koe-macOS-arm64.zip from this release
  2. Unzip and drag Koe.app to /Applications
  3. Launch Koe and let it appear in your menu bar
  4. Open Settings from the menu bar icon to configure ASR and LLM credentials

Note: The app is not code-signed. On first launch, right-click → Open, or allow it in System Settings → Privacy & Security.

If macOS still blocks the app, run:

xattr -rd com.apple.quarantine /Applications/Koe.app

v1.0.11

28 Mar 13:00
ba44d95

What's New

Custom Keycode Support for Hotkeys

You can now use any non-modifier key (e.g. F1–F20, Escape, CapsLock) as a trigger or cancel key by specifying its raw macOS keycode in config.yaml. The status bar menu and setup wizard display friendly key names for recognized keycodes. This gives power users full flexibility beyond the 7 built-in modifier key options.

hotkey:
  trigger_key: 96    # F5
  cancel_key: 97     # F6
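
The friendly-name display could look roughly like the lookup below. The function is a hypothetical sketch covering only a few keys; the numeric values are the standard macOS virtual keycodes, which is why 96 and 97 in the example above correspond to F5 and F6.

```rust
/// Map a raw macOS virtual keycode to a display name.
/// Unrecognized codes fall back to showing the number itself.
fn key_name(keycode: u16) -> String {
    let name = match keycode {
        53 => "Escape",
        57 => "CapsLock",
        122 => "F1",
        120 => "F2",
        99 => "F3",
        118 => "F4",
        96 => "F5",
        97 => "F6",
        98 => "F7",
        100 => "F8",
        _ => return format!("Keycode {keycode}"),
    };
    name.to_string()
}
```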

Qwen ASR Provider Support

Added Qwen (Aliyun DashScope) as a second ASR provider alongside Doubao. Configure it in config.yaml with your DashScope API key. The Qwen ASR URL and model are now fully configurable. Thanks to @nmvr2600! (#22)

Install

Homebrew

brew tap owo-network/brew
brew install --cask koe

Manual Install

  1. Download Koe-macOS-arm64.zip from this release
  2. Unzip and drag Koe.app to /Applications
  3. Launch Koe and let it appear in your menu bar
  4. Open Settings from the menu bar icon to configure ASR and LLM credentials

v1.0.10

27 Mar 11:48
2a11920

What's New

Faster LLM Correction with HTTP/2 and Connection Reuse

The LLM HTTP client is now shared across voice sessions instead of being rebuilt every time. Combined with HTTP/2 support, this reduces average correction latency by ~20% in real-world benchmarks. The client automatically falls back to HTTP/1.1 when the upstream doesn't support HTTP/2. Thanks to @hyspace for this optimization! (#17)

GPT-5 Reasoning Effort Handling

When using GPT-5-style endpoints (max_completion_tokens), Koe now explicitly sets reasoning_effort: "none" to skip unnecessary reasoning on the latency-sensitive voice correction path. This avoids wasting time and tokens on a task that doesn't need chain-of-thought. Thanks to @hyspace! (#18)

Native Vibrancy Overlay Background

The live transcription overlay pill now uses NSVisualEffectView for its background, giving it a native macOS frosted-glass appearance that adapts to light and dark mode. Thanks to @erning! (#16)

Left and Right Control Key Support

Hotkey configuration now supports both left and right Control keys as trigger or cancel keys. Thanks to @nmvr2600! (#14)

Install

Homebrew

brew tap owo-network/brew
brew install --cask koe

Manual Install

  1. Download Koe-macOS-arm64.zip from this release
  2. Unzip and drag Koe.app to /Applications
  3. Launch Koe and let it appear in your menu bar
  4. Open Settings from the menu bar icon to configure ASR and LLM credentials

v1.0.9

25 Mar 17:33
438aa98

What's New

Built-in App Updates

Koe now checks for app updates through a JSON update feed and includes a manual Check for Updates action in the menu bar.

Menu Bar Version Info

The menu bar status UI now shows the current version and build number directly, so it is easier to confirm which build is running.

Better Recording Recovery

Fixed microphone disconnect handling during recording. Koe now detects device loss more gracefully and recovers without leaving the app stuck in a broken session.

Provider-Based ASR Settings

ASR configuration has been migrated to a provider-based V2 format, and Settings now includes an ASR Provider selector. Doubao is the current built-in option and remains the default.

Improved Live Transcription Overlay

The interim transcription overlay now auto-wraps long text and avoids the layout jitter that made the text panel feel unstable during recognition.

Configurable Cancel Hotkey

Session cancel is now configurable as a dedicated hotkey separate from the trigger key. Existing configs are normalized automatically so missing or conflicting cancel keys are written back as a valid pair.
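
The normalization step might be sketched like this. Everything here is an assumption made for illustration: the key names, the fallback candidates, and the function itself are hypothetical, since the release notes only say that missing or conflicting cancel keys end up as a valid pair.

```rust
/// Return a (trigger, cancel) pair in which the cancel key is present
/// and differs from the trigger key.
fn normalize_hotkeys(trigger: &str, cancel: Option<&str>) -> (String, String) {
    // Hypothetical fallback order; the real defaults may differ.
    const CANDIDATES: [&str; 2] = ["left_option", "right_option"];

    let cancel = match cancel {
        // Keep a cancel key that is set and does not clash with the trigger.
        Some(c) if !c.is_empty() && c != trigger => c,
        // Otherwise pick the first candidate that is not the trigger.
        _ => CANDIDATES
            .iter()
            .copied()
            .find(|c| *c != trigger)
            .expect("CANDIDATES holds two distinct keys"),
    };
    (trigger.to_string(), cancel.to_string())
}
```

The normalized pair would then be written back to the config so later loads see a consistent state.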

Setup Improvements

Refined default voice input settings so new installs come up with more sensible out-of-the-box behavior.

Install

Homebrew

brew tap owo-network/brew
brew install --cask koe

Manual Install

  1. Download Koe-macOS-arm64.zip from this release
  2. Unzip and drag Koe.app to /Applications
  3. Launch Koe and let it appear in your menu bar
  4. Open Settings from the menu bar icon to configure ASR and LLM credentials

v1.0.8

25 Mar 09:05
079a9e8

What's New

Redesigned Settings UI

The setup wizard has been completely redesigned with a native macOS Settings-style toolbar (like System Settings / Tailscale). Each settings category (ASR, LLM, Hotkey, Dictionary, System Prompt) now has its own pane with SF Symbol icons, smooth animated switching, and auto-resizing.

Secure Key Fields

ASR Access Key and LLM API Key are now masked by default with a toggle eye icon to show/hide — no more credentials visible at a glance.

Token Parameter Configuration

Added a Token Parameter dropdown in LLM settings to choose between max_completion_tokens (GPT-5, o1/o3 reasoning models) and max_tokens (GPT-4o and older). Default changed to max_completion_tokens for modern model compatibility.

Configurable Max Token Parameter

Added max_token_parameter config option to support both legacy and modern OpenAI-compatible APIs. Thanks to @hyspace for this feature! (#7)
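
To make the difference concrete, the sketch below shows how the setting could select the field name in an OpenAI-compatible request body. The enum and function are hypothetical, the body omits the messages array for brevity, and the model name is not JSON-escaped, so treat it as an illustration rather than Koe's request builder.

```rust
/// The two documented values of `max_token_parameter`.
#[derive(Debug, Clone, Copy)]
enum MaxTokenParameter {
    MaxCompletionTokens,
    MaxTokens,
}

impl MaxTokenParameter {
    /// JSON field name used for the output token limit.
    fn field_name(self) -> &'static str {
        match self {
            MaxTokenParameter::MaxCompletionTokens => "max_completion_tokens",
            MaxTokenParameter::MaxTokens => "max_tokens",
        }
    }
}

/// Build a minimal request body with the chosen token-limit field.
fn request_body(model: &str, param: MaxTokenParameter, limit: u32) -> String {
    format!(r#"{{"model":"{model}","{}":{limit}}}"#, param.field_name())
}
```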

Bluetooth Microphone Recovery

Fixed Bluetooth mic reconnect failures (issue #8). Previously, the shared AVAudioEngine held stale device state after a Bluetooth disconnect/reconnect. Now a fresh engine is created per capture session.

Improved Defaults

  • Default LLM base URL: https://api.openai.com/v1
  • Default model: gpt-5.4-nano
  • Default token parameter: max_completion_tokens

Bug Fixes

  • Fixed setup wizard Cmd+C/V not working in text fields
  • Fixed LLM test error messages being truncated
  • Eliminated "Connecting..." overlay delay when ASR latency is high — recording starts immediately while audio buffers during connection
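
The buffering behavior described in the last fix can be pictured with a small queue that holds audio frames until the connection is ready. This is a simplified stand-in, not Koe's implementation: the struct is hypothetical and `sent` merely represents frames that would be written to the ASR socket.

```rust
use std::collections::VecDeque;

/// Holds captured audio while the ASR connection is still being set up.
#[derive(Default)]
struct BufferedSender {
    connected: bool,
    pending: VecDeque<Vec<i16>>,
    sent: Vec<Vec<i16>>, // stand-in for frames written to the socket
}

impl BufferedSender {
    /// Accept a frame immediately, whether or not the connection is up.
    fn push(&mut self, frame: Vec<i16>) {
        if self.connected {
            self.sent.push(frame);
        } else {
            self.pending.push_back(frame);
        }
    }

    /// Flush everything captured so far, in order, once connected.
    fn on_connected(&mut self) {
        self.connected = true;
        while let Some(frame) = self.pending.pop_front() {
            self.sent.push(frame);
        }
    }
}
```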

Install

Homebrew

brew tap owo-network/brew
brew install --cask koe

Manual Install

  1. Download Koe-macOS-arm64.zip from this release
  2. Unzip and drag Koe.app to /Applications
  3. Launch Koe — it will appear in your menu bar
  4. Open Settings from the menu bar icon to configure ASR and LLM credentials

v1.0.7

25 Mar 06:32
c6aa702

What's New

Real-Time ASR Text Overlay

The floating status pill now displays real-time speech recognition text as you speak, so you can see exactly what Koe is hearing. The text updates live during recording, showing the trailing portion when content exceeds the pill width, with a smooth left-edge fade gradient. Thanks to @erning for this feature! (#6)
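
Showing only the trailing portion of the text can be approximated as below. The real overlay presumably measures rendered width; this hypothetical helper uses a character count instead, which is an assumption made to keep the sketch self-contained.

```rust
/// Return the last `max_chars` characters of `text`, or all of it if
/// it already fits. Slices on character boundaries, so multi-byte
/// text such as Chinese is handled safely.
fn trailing_text(text: &str, max_chars: usize) -> &str {
    let count = text.chars().count();
    if count <= max_chars {
        return text;
    }
    let skip = count - max_chars;
    match text.char_indices().nth(skip) {
        Some((idx, _)) => &text[idx..],
        None => "",
    }
}
```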

Setup Wizard

Added a first-launch setup wizard that guides you through configuring ASR credentials, LLM settings, and hotkey preferences — no more editing YAML by hand to get started.

Hotkey Display

The menu bar dropdown now shows which hotkey is currently configured, making it easy to verify your trigger key at a glance.

Microphone Input Device Selection

You can now choose which microphone to use in the config file via audio.input_device, useful when you have multiple audio input devices. Thanks to @erning! (#1)

LLM Toggle

Added llm.enabled option to skip LLM correction entirely, pasting raw ASR text directly — useful for testing or when you just want fast transcription without AI post-processing.

ASR Crate Extraction

The ASR module has been extracted into a standalone koe-asr crate, improving code organization and making it easier to reuse or test independently.

Bug Fixes

  • Fixed overlay freezing when pausing speech for 1–2 seconds. The root cause was that once any utterance was marked as "definite" by the ASR server (after a silence gap), all subsequent responses were misclassified and stopped updating the overlay text.
  • Fixed various audio capture and hotkey detection edge cases.

Install

Homebrew

brew tap owo-network/brew
brew install owo-network/brew/koe

Direct Download

Download Koe-macOS-arm64.zip, unzip, and move Koe.app to /Applications.

Note: The app is not code-signed. On first launch, right-click -> Open, or allow it in System Settings -> Privacy & Security.

If macOS still blocks the app, run:

xattr -rd com.apple.quarantine /Applications/Koe.app