Voice-first, always-on AI desktop assistant. Claude does the thinking; Fish Audio (or ElevenLabs) does the talking; macOS does the work.
https://github.com/nikhilachale/GWEN/raw/main/public/demo.mp4
`public/demo.mp4` — 100 s walkthrough: wake word, the audio-reactive orb, and live tool calls.
See `CLAUDE.md`, `agents/AGENTS.md`, and `agents/SKILLS.md` for the full architecture.
macOS:

    brew install sox
    brew install blueutil   # optional, for Bluetooth toggle

Ubuntu:

    sudo apt install sox libsqlite3-dev

Windows:

    choco install sox.portable

Then install the Node dependencies:

    npm install

This runs electron-rebuild automatically to compile better-sqlite3
against Electron's Node version.
    cp .env.example .env
    # fill in your API keys

The only hard requirement is `ANTHROPIC_KEY` — Gwen's brain. Every other
subsystem has a fallback, so you can run with just that one key:

- STT — `GROQ_KEY` (preferred, `whisper-large-v3-turbo`) or `OPENAI_KEY` (`whisper-1`). With neither, Gwen transcribes locally via whisper.cpp (`nodejs-whisper`, `base.en`) — no key, fully offline, slower.
- TTS — `FISH_KEY` (+ `FISH_VOICE_ID`, preferred) or `ELEVEN_KEY` + `ELEVEN_VOICE_ID`. With neither, falls back to the built-in macOS `say` voice.
- Google / Tavily / Porcupine — optional. Calendar reads from macOS Calendar.app without OAuth; Gwen degrades gracefully without the rest.
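As a rough illustration of the fallback rules above, the sketch below picks an STT and a TTS provider from whichever keys are present. `resolveProviders`, `ProviderPlan`, and the provider labels are hypothetical names, not Gwen's actual code. It also assumes that `FISH_KEY` alone is enough to select Fish Audio, while ElevenLabs needs both of its variables.

```typescript
// Hypothetical helper: names and shapes are illustrative, not Gwen's real API.
type Env = Record<string, string | undefined>;

type SttProvider = "groq" | "openai" | "local-whisper";
type TtsProvider = "fish" | "elevenlabs" | "macos-say";

interface ProviderPlan {
  stt: SttProvider;
  tts: TtsProvider;
}

// True when the variable is set to a non-blank value.
function has(env: Env, key: string): boolean {
  const value = env[key];
  return value !== undefined && value.trim() !== "";
}

function resolveProviders(env: Env): ProviderPlan {
  // The README names ANTHROPIC_KEY as the only hard requirement.
  if (!has(env, "ANTHROPIC_KEY")) {
    throw new Error("ANTHROPIC_KEY is required");
  }

  // STT preference order: Groq, then OpenAI, then local whisper.cpp.
  let stt: SttProvider = "local-whisper";
  if (has(env, "GROQ_KEY")) {
    stt = "groq";
  } else if (has(env, "OPENAI_KEY")) {
    stt = "openai";
  }

  // TTS preference order: Fish, then ElevenLabs, then macOS `say`.
  // Assumption: FISH_VOICE_ID is optional, ELEVEN_VOICE_ID is not.
  let tts: TtsProvider = "macos-say";
  if (has(env, "FISH_KEY")) {
    tts = "fish";
  } else if (has(env, "ELEVEN_KEY") && has(env, "ELEVEN_VOICE_ID")) {
    tts = "elevenlabs";
  }

  return { stt, tts };
}
```

In the app you would presumably pass `process.env`; taking the environment as a parameter keeps the function easy to test in isolation.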
    npm run setup-oauth

This opens a browser, you grant `gmail.readonly`, and the token is saved to
`data/google-token.json`. Skip this if you don't want email — calendar reads
straight from macOS Calendar.app, no OAuth required.
    npm run dev

Vite serves the renderer on `localhost:5174`, and Electron picks it up.
- Calendar — read upcoming events from macOS Calendar.app (covers iCloud, Google, Exchange — whatever accounts you've added there)
- Email — check unread Gmail (read-only by design)
- Tasks — local task store (`add_task`, `get_tasks`)
- Notes — local markdown notes (`save_note`, `get_notes`)
- Reminders.app — iCloud-synced via AppleScript (`add_reminder`, `list_reminders`)
- Notes.app — iCloud-synced (`create_apple_note`, `search_apple_notes`)
- Day plan — combined morning briefing from calendar + tasks + memory
- Memory — persistent SQLite store for preferences and facts
- Apps — open any Mac app by name or alias (`open_app`)
- Files — list / open / reveal anything in Finder (`list_files`, `open_path`)
- Keystroke — type into the focused app (`type_text`, requires Accessibility)
- Messaging — send iMessage and WhatsApp (confirms before sending)
- System — volume, brightness, Wi-Fi, Bluetooth, dark mode, lock, sleep, battery
- Shortcuts bridge — run any macOS Shortcut by name; unlocks HomeKit, Focus modes, custom automations without writing more JS
- FaceTime — video or audio call
- Phone — placed via iPhone Continuity
- Maps — directions and place search
- Web search — Tavily
- Weather — current + forecast via wttr.in (no API key)
- Screen context — captures and reasons about what's on your screen
- Translation, definitions, math, conversions — answered directly by Claude
- Timers — countdown with macOS notification on fire
- Alarms — natural-language ("tomorrow 7am", "in 90 minutes")
- `build_software` — spawns the Claude Code CLI to scaffold real projects
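
To make the natural-language alarm phrases listed above concrete, here is a minimal sketch that handles just two shapes: `in N minutes/hours` and `tomorrow H[:MM]am/pm`. `parseAlarm` is a hypothetical helper; Gwen may well resolve these phrases differently (for example by letting Claude produce the timestamp).

```typescript
// Hypothetical parser for two alarm phrasings; not taken from Gwen's source.
// Returns the fire time in local time, or null if the phrase is not recognized.
function parseAlarm(phrase: string, now: Date): Date | null {
  const text = phrase.trim().toLowerCase();

  // Relative form: "in 90 minutes", "in 2 hours"
  const rel = /^in (\d+) (minute|minutes|hour|hours)$/.exec(text);
  if (rel) {
    const amount = Number(rel[1]);
    const unitMs = rel[2].startsWith("hour") ? 3600000 : 60000;
    return new Date(now.getTime() + amount * unitMs);
  }

  // Absolute form: "tomorrow 7am", "tomorrow 7:30pm"
  const abs = /^tomorrow (\d{1,2})(?::(\d{2}))?\s*(am|pm)$/.exec(text);
  if (abs) {
    const hour12 = Number(abs[1]);
    const minute = abs[2] ? Number(abs[2]) : 0;
    if (hour12 < 1 || hour12 > 12 || minute > 59) return null;
    // 12am maps to hour 0, 12pm to hour 12.
    const hour24 = (hour12 % 12) + (abs[3] === "pm" ? 12 : 0);
    const result = new Date(now.getTime());
    result.setDate(result.getDate() + 1);
    result.setHours(hour24, minute, 0, 0);
    return result;
  }

  return null;
}
```

Passing `now` explicitly, rather than calling `new Date()` inside, makes the behavior reproducible in tests.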
- Three.js orb (cyan → white → amber → green by state)
- Manual mic trigger (click the orb)
- Speech-to-text — Groq → OpenAI → local whisper.cpp fallback chain
- Claude tool-use loop with all tools
- Streaming TTS — Fish → ElevenLabs → macOS `say` fallback chain, audio-reactive orb
- SQLite memory, JSON tasks, markdown notes
- Tavily search
- macOS Calendar.app (no setup — first run triggers a TCC prompt)
- Gmail (after `npm run setup-oauth`)
- Screen capture (asks for permission on first use, macOS)
- All macOS system + native-app tools listed above
- Wake word — Porcupine `.ppn` file at `data/wakewords/hey-gwen.ppn`
- Claude Code build pipeline — works if `claude` CLI is on `$PATH`
- Bluetooth control — `brew install blueutil`
- Phone calls — paired iPhone with Calls on Other Devices enabled
- Calendar / Reminders / Notes / Music control — accept the macOS Automation prompts on first use (System Settings → Privacy & Security → Automation)
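
Several of the items above depend on something being present on the machine, such as the `claude` CLI on `$PATH`. A check like that could be sketched as follows. `findOnPath` is a hypothetical helper, not part of Gwen, and it only tests for existence, not for the executable bit.

```typescript
// Hypothetical helper: looks for `command` in each directory of a
// colon-separated PATH string. The `exists` callback is injected so the
// function stays pure and testable.
function findOnPath(
  command: string,
  pathVar: string,
  exists: (candidate: string) => boolean,
): string | null {
  for (const dir of pathVar.split(":")) {
    if (dir === "") continue; // skip empty PATH entries
    const candidate = dir.endsWith("/") ? dir + command : dir + "/" + command;
    if (exists(candidate)) return candidate;
  }
  return null;
}
```

In the Electron main process, one plausible call would be `findOnPath("claude", process.env.PATH ?? "", fs.existsSync)`, with a `null` result meaning the build pipeline should be reported as unavailable.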
User Voice
│
▼
electron/main.ts ──── IPC ──── React UI (Orb + 3-column HUD)
│
├── core/listener.ts → STT chain: Groq → OpenAI → local whisper.cpp
├── core/brain.ts → Claude (orchestrator) → tools/* → returns text
└── core/speaker.ts → TTS chain: Fish → ElevenLabs → macOS `say`
(streamed audio level → orb)
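
The diagram mentions a streamed audio level driving the orb. One common way to derive such a level is the root mean square (RMS) of each audio chunk. The sketch below assumes decoded 16-bit PCM samples, which may not match what Gwen does (the renderer could just as well use a Web Audio analyser). `rmsLevel` is a hypothetical name.

```typescript
// Hypothetical helper: computes a loudness level in [0, 1] from one chunk
// of 16-bit PCM samples. Assumes the audio has already been decoded.
function rmsLevel(samples: Int16Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    const normalized = samples[i] / 32768; // map to [-1, 1)
    sumSquares += normalized * normalized;
  }
  return Math.sqrt(sumSquares / samples.length);
}
```

A value like this could be sent over IPC once per chunk and mapped to the orb's scale or glow.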
Source is TypeScript, compiled to `dist-electron/`. `core/listener.ts` and
`core/speaker.ts` are thin shims; the real provider-chain logic lives in
`src/skills/stt.ts` and `src/skills/tts.ts`.
See `agents/AGENTS.md` for the full hub-and-spoke agent topology.
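
The provider chains described above (Groq → OpenAI → local whisper.cpp, and Fish → ElevenLabs → macOS `say`) follow a try-in-order pattern. A minimal sketch of that pattern is shown here; `Provider` and `runChain` are hypothetical names, and the real logic in `src/skills/stt.ts` and `src/skills/tts.ts` may be structured differently.

```typescript
// Hypothetical types: a provider is anything with a name and an async run().
interface Provider<In, Out> {
  name: string;
  run: (input: In) => Promise<Out>;
}

// Tries each provider in order and returns the first successful result.
// If every provider throws, the collected reasons are reported together.
async function runChain<In, Out>(
  providers: Provider<In, Out>[],
  input: In,
): Promise<{ provider: string; output: Out }> {
  const failures: string[] = [];
  for (const provider of providers) {
    try {
      const output = await provider.run(input);
      return { provider: provider.name, output };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${provider.name}: ${reason}`);
    }
  }
  throw new Error(`All providers failed (${failures.join("; ")})`);
}
```

Returning the provider name alongside the output makes it easy to log which link in the chain actually handled a request.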
| Category | Tools |
|---|---|
| Calendar (macOS) / Email (Gmail) | get_calendar, get_emails |
| Tasks / Notes | add_task, get_tasks, save_note, get_notes |
| Memory | remember, recall |
| Day plan | get_day_plan |
| Web | search_web |
| Screen | get_screen_context |
| Apps & files | open_app, list_files, open_path, type_text |
| Messaging | send_imessage, send_whatsapp |
| System | set_volume, get_volume, set_brightness, toggle_wifi, toggle_bluetooth, toggle_dark_mode, lock_screen, sleep_mac, get_battery |
| Shortcuts | run_shortcut, list_shortcuts |
| Music | music_control, music_play, music_now_playing |
| Reminders.app | add_reminder, list_reminders |
| Notes.app | create_apple_note, search_apple_notes |
| Maps | get_directions, search_maps |
| Calls | facetime, call_phone |
| Time | set_timer, set_alarm, list_timers, cancel_timer |
| Weather | get_weather |
| Builder | build_software |
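
To show how tool names like those in the table might be routed to code, here is a small dispatch sketch. Only the names `remember` and `recall` come from the table. The registry, the `key`/`value` input fields, and the in-memory `Map` (standing in for the SQLite store) are assumptions made for illustration. Real handlers would likely be asynchronous.

```typescript
// Hypothetical registry; handler bodies are stand-ins, not Gwen's real tools.
type ToolInput = Record<string, unknown>;
type ToolHandler = (input: ToolInput) => string;

// In-memory stand-in for the persistent memory store.
const memoryStore = new Map<string, string>();

const toolRegistry = new Map<string, ToolHandler>([
  ["remember", (input) => {
    const key = String(input["key"]);
    memoryStore.set(key, String(input["value"]));
    return `Remembered ${key}`;
  }],
  ["recall", (input) => {
    return memoryStore.get(String(input["key"])) ?? "Nothing stored for that key";
  }],
]);

// Looks up the handler by name and returns its text result. Unknown tools
// and handler errors are reported as text so the model can react to them.
function dispatchTool(name: string, input: ToolInput): string {
  const handler = toolRegistry.get(name);
  if (handler === undefined) {
    return `Unknown tool: ${name}`;
  }
  try {
    return handler(input);
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return `Tool ${name} failed: ${reason}`;
  }
}
```

For example, `dispatchTool("remember", { key: "coffee", value: "oat latte" })` followed by `dispatchTool("recall", { key: "coffee" })` returns the stored value.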
    # Test the brain with a typed prompt
    npm run test:brain "What's on my calendar today?"

    # Test a single tool
    npm run test:tool memory
    npm run test:tool calendar

MIT