小光 lives on your desktop as a 3D VRM avatar — she chats with you, speaks with a synthesized voice, lip-syncs and emotes, listens to your voice, and proactively strikes up conversations when you've been quiet.
AniCompanion doesn't ship an LLM; it's the character, voice, and presence layer in front of an agent you run. Any gateway that can stream chat completions can drive it — backends are pluggable (see Bring your own agent). Hermes Agent is the reference backend, validated end-to-end and runnable locally so your conversations stay on your machine.
| English | 繁體中文 |
|---|---|
![]() |
![]() |
Status: functional, early-stage. Built and tested on macOS 15. Contributions welcome.
- 3D VRM character rendered with three-vrm (WebGL in a WKWebView) — spring-bone physics (hair/skirt), idle breathing/blink, skeletal gesture clips.
- Streaming chat through a pluggable agent backend. Ships with Hermes Agent (the validated
reference) and a generic OpenAI-compatible backend (Ollama, LM Studio, vLLM, OpenRouter, …);
adding another is a one-
casechange — seeCONTRIBUTING.md. - Text-to-speech via MiniMax Speech-02-Turbo, with amplitude-driven lip sync — plus an experimental local BlueMagpie-TTS option (pending verification).
- Speech-to-text voice input using Apple's on-device Speech framework (auto-stops on silence).
- Emotions — 16 emotion tags from the LLM drive the avatar's facial expressions.
- Proactive companion — greets you on launch and speaks up after a period of inactivity (tool-agnostic: uses your Hermes tools if configured, otherwise just chats).
- Desktop Pet mode — detach 小光 into a transparent, always-on-top overlay that lives on your desktop; drag to move, scroll/pinch to resize. See Desktop Pet mode.
- Multilingual — ships in English and Traditional Chinese (繁體中文), switchable in
Settings (both the interface and the language 小光 speaks). Adding a language is easy — see
CONTRIBUTING.md.
This release adds:
- 🐾 Desktop Pet mode — pop 小光 out of her window into a transparent, always-on-top desktop overlay you can drag and resize. See Desktop Pet mode.
- 🎙️ Pluggable text-to-speech — choose your voice provider in Settings → Voice: cloud MiniMax, plus an experimental local BlueMagpie-TTS option (pending verification — see Local BlueMagpie TTS). Contributed by @hlb.
- 🧍 Configurable character model — switch VRM models from Settings → Character instead of editing source. Contributed by @hlb. See Using your own VRM.
- Clearer language setting — the Language picker now notes that the interface language applies after an app restart (the character switches immediately).
- macOS 15.0+, Apple Silicon
- Xcode 16 (Swift 6 toolchain)
- XcodeGen —
brew install xcodegen - A running agent gateway — a Hermes Agent gateway is the validated path (see Bring your own agent)
- (Optional, for voice) a MiniMax account for cloud TTS (API key + Group ID). Without TTS, disable voice in Settings and 小光 replies with text + expressions only. (An experimental local BlueMagpie-TTS option also exists — see Local BlueMagpie TTS.)
# 1. Generate the Xcode project
xcodegen generate
# 2. Download the default VRM character model (not bundled — see ATTRIBUTION.md)
./scripts/download-model.sh
# 3. Build & run
open AniCompanion.xcodeproj # then Run (⌘R) in Xcode
# …or from the command line:
xcodebuild -project AniCompanion.xcodeproj -scheme AniCompanion -destination 'platform=macOS' buildOn first launch, open Settings (⚙️) and fill in:
- Agent backend (Hermes by default), its Endpoint (default
http://127.0.0.1:8642) and API Key — you'll need a gateway already running for chat to work (see Bring your own agent below) - (optional) Voice → TTS Provider:
- MiniMax — enter your API Key, Group ID, and Voice ID.
- BlueMagpie (experimental, pending verification) — point it at a local BlueMagpie-TTS server URL.
First launch needs internet — the three-vrm runtime loads from a CDN the first time, then caches. When it's working you'll see 小光 appear in the window and greet you; type in the box (or use the mic) and she replies. If the character never appears, check the Troubleshooting notes.
AniCompanion does not ship an LLM — it talks to an agent gateway you run yourself. Two backends are built in, selectable under Settings → Agent backend:
- Hermes Agent — the reference backend, validated end-to-end (setup below).
- OpenAI-compatible — point it at any gateway speaking
/v1/chat/completionsSSE: Ollama, LM Studio, vLLM, OpenRouter, and friends.
Adding another backend is a small, self-contained change — see
Adding an agent backend in CONTRIBUTING.md.
Setting up the Hermes reference backend, briefly:
- Install Hermes Agent (see its docs) and configure a model provider (e.g. OpenRouter).
- In
~/.hermes/.env:(generate one withAPI_SERVER_ENABLED=true API_SERVER_KEY=<a-random-key-you-choose>
openssl rand -hex 32). Seeexamples/hermes.env. - Start the gateway:
hermes gateway # → listening on http://127.0.0.1:8642 - Put the same endpoint + key into the app's Settings.
Full walkthrough, including optional MCP tools for richer proactive behavior, is in
docs/hermes-setup.md.
Alongside MiniMax, AniCompanion includes an optional BlueMagpie-TTS provider — select it under
Settings → Voice → TTS Provider → BlueMagpie to route speech to a local
BlueMagpie-TTS HTTP server (POST /v1/tts, WAV).
⚠️ Experimental — not verified end-to-end yet. The provider and a reference server (Tools/blue_magpie_tts_server.py) are wired up, but this path is still pending validation against BlueMagpie's next release. Until it's confirmed working, use MiniMax for voice — full setup steps will land here once BlueMagpie is verified.
Detach 小光 from her window and let her live on your desktop — a borderless, transparent, always-on-top companion that floats over your other apps. There's no chat panel in pet mode; instead a small speech bubble shows what she's saying (synced to her voice when TTS is on, paced for reading when it's off).
Enter pet mode — any of:
- the 🐾 button in the window's toolbar
- the Character ▸ Desktop Pet Mode menu
- the keyboard shortcut ⌘⇧D
While she's on the desktop:
| Action | How |
|---|---|
| Move her | Click and drag anywhere on her |
| Resize | Scroll up/down over her (mouse wheel or two-finger), or pinch on a trackpad — she keeps her proportions, and the size sticks across toggles |
| Return to the window | Double-click her — or press ⌘⇧D, or use the 🐾 button / Character menu again |
Your conversation is untouched while she's out: returning to the window brings the chat back exactly as you left it. She floats above other apps and follows you across Spaces, so she's there whatever you're working on.
The default character is Alicia Solid (ニコニ立体ちゃん), © DWANGO Co., Ltd. Its license does
not permit redistribution, so it is not bundled — scripts/download-model.sh fetches it for
your own local use. See ATTRIBUTION.md and
AniCompanion/Resources/VRMModel/LICENSE-AliciaSolid.md.
- Drop your
.vrmfile intoAniCompanion/Resources/VRMModel/. - Open Settings and set VRM Model Filename to your file name, for example
YourModel.vrm. (Name the fileAliciaSolid.vrmand you can skip this step.) - Rebuild. If the framing is off for your model's proportions, tune the camera live with the
W/S/A/D/Q/E/R/Fkeys and set the result as the default inThreeVRMRenderView.
three-vrm loads both VRM 0.x and 1.0, and every VRM is humanoid by spec — so posing, idle motion, and the skeletal gesture clips work with any valid model. The rest degrades gracefully:
| Feature | Requires | If the model lacks it |
|---|---|---|
| Emotions (16 tags → expressions) | Standard expression presets happy / angry / sad / relaxed | Face stays neutral; everything else still works |
| Lip sync | The aa mouth viseme (optional ARKit / jawOpen PerfectSync gives finer motion) |
Mouth doesn't move while speaking |
| Idle blink | The blink expression preset | No blinking |
| Hair / skirt physics | Spring bones | Hair and cloth stay static |
In short: any humanoid VRM loads and animates; the standard expression presets plus the aa
viseme are what unlock emotions and lip-sync. Richer models (more expressions, PerfectSync) can be
given finer mappings in the three-vrm scene (Resources/ThreeVRM/vrm_scene.js).
You (text or voice) → streaming chat (agent gateway, SSE) → sentence parser → parallel TTS → ordered playback + lip sync
The streaming reply is split into sentences as tokens arrive; each sentence is synthesized to
speech in parallel and queued for ordered playback, while audio amplitude drives the avatar's
mouth. Emotion tags in the reply ([happy], [curious], …) switch the avatar's expression.
Architecture details and developer notes are in CLAUDE.md.
| Symptom | Likely cause / fix |
|---|---|
xcodegen: command not found |
brew install xcodegen (see Requirements). |
| The window opens but the character never appears | First launch needs internet (the three-vrm runtime loads from a CDN). Also confirm ./scripts/download-model.sh ran and a .vrm exists in AniCompanion/Resources/VRMModel/. |
| You type a message and nothing happens | Your agent gateway isn't running / reachable. Start it (e.g. hermes gateway) and check the connection indicator in Settings. For Hermes, a 401 means the API Key in Settings doesn't match API_SERVER_KEY. |
| 小光 replies in text but doesn't speak | TTS is off or unconfigured — that's fine. For voice, add your MiniMax key + Group ID under Settings → Voice, or leave TTS disabled. (BlueMagpie TTS is experimental — see Local BlueMagpie TTS.) |
| Voice input does nothing | On first use macOS prompts for Microphone and Speech Recognition permission — allow both (System Settings → Privacy & Security). |
More runtime diagnostics (health checks, connection states) are in docs/hermes-setup.md.
Application source code: MIT — see LICENSE.
Bundled/downloaded assets (the VRM model, animation clips) are third-party works under their
own terms — see ATTRIBUTION.md. Notably the default VRM model is not
MIT-licensed and is not redistributed by this project.


