Skip to content

smallgun01/AniCompanion

 
 

Repository files navigation

AniCompanion app icon

AniCompanion

A face for your AI agent.
A desktop VRM character that chats, speaks, lip-syncs, and emotes.

License  Platform  Swift  Status

小光 lives on your desktop as a 3D VRM avatar — she chats with you, speaks with a synthesized voice, lip-syncs and emotes, listens to your voice, and proactively strikes up conversations when you've been quiet.

AniCompanion doesn't ship an LLM; it's the character, voice, and presence layer in front of an agent you run. Any gateway that can stream chat completions can drive it — backends are pluggable (see Bring your own agent). Hermes Agent is the reference backend, validated end-to-end and runnable locally so your conversations stay on your machine.

English 繁體中文
AniCompanion running with an English interface — the 小光 VRM avatar beside a chat panel AniCompanion 的繁體中文介面 — 小光 VRM 虛擬角色與聊天面板

Status: functional, early-stage. Built and tested on macOS 15. Contributions welcome.

Features

  • 3D VRM character rendered with three-vrm (WebGL in a WKWebView) — spring-bone physics (hair/skirt), idle breathing/blink, skeletal gesture clips.
  • Streaming chat through a pluggable agent backend. Ships with Hermes Agent (the validated reference) and a generic OpenAI-compatible backend (Ollama, LM Studio, vLLM, OpenRouter, …); adding another is a one-case change — see CONTRIBUTING.md.
  • Text-to-speech via MiniMax Speech-02-Turbo, with amplitude-driven lip sync.
  • Speech-to-text voice input using Apple's on-device Speech framework (auto-stops on silence).
  • Emotions — 16 emotion tags from the LLM drive the avatar's facial expressions.
  • Proactive companion — greets you on launch and speaks up after a period of inactivity (tool-agnostic: uses your Hermes tools if configured, otherwise just chats).
  • Multilingual — ships in English and Traditional Chinese (繁體中文), switchable in Settings (both the interface and the language 小光 speaks). Adding a language is easy — see CONTRIBUTING.md.

Requirements

  • macOS 15.0+, Apple Silicon
  • Xcode 16 (Swift 6 toolchain)
  • XcodeGenbrew install xcodegen
  • A running agent gateway — a Hermes Agent gateway is the validated path (see Bring your own agent)
  • (Optional, for voice) a MiniMax account for TTS (API key + Group ID). Without it, disable TTS in Settings and 小光 replies with text + expressions only.

Quick start

# 1. Generate the Xcode project
xcodegen generate

# 2. Download the default VRM character model (not bundled — see ATTRIBUTION.md)
./scripts/download-model.sh

# 3. Build & run
open AniCompanion.xcodeproj      # then Run (⌘R) in Xcode
# …or from the command line:
xcodebuild -project AniCompanion.xcodeproj -scheme AniCompanion -destination 'platform=macOS' build

On first launch, open Settings (⚙️) and fill in:

  • Agent backend (Hermes by default), its Endpoint (default http://127.0.0.1:8642) and API Key — you'll need a gateway already running for chat to work (see Bring your own agent below)
  • (optional) MiniMax API Key + Group ID for voice

First launch needs internet — the three-vrm runtime loads from a CDN the first time, then caches. When it's working you'll see 小光 appear in the window and greet you; type in the box (or use the mic) and she replies. If the character never appears, check the Troubleshooting notes.

Bring your own agent

AniCompanion does not ship an LLM — it talks to an agent gateway you run yourself. Two backends are built in, selectable under Settings → Agent backend:

  • Hermes Agent — the reference backend, validated end-to-end (setup below).
  • OpenAI-compatible — point it at any gateway speaking /v1/chat/completions SSE: Ollama, LM Studio, vLLM, OpenRouter, and friends.

Adding another backend is a small, self-contained change — see Adding an agent backend in CONTRIBUTING.md.

Setting up the Hermes reference backend, briefly:

  1. Install Hermes Agent (see its docs) and configure a model provider (e.g. OpenRouter).
  2. In ~/.hermes/.env:
    API_SERVER_ENABLED=true
    API_SERVER_KEY=<a-random-key-you-choose>
    (generate one with openssl rand -hex 32). See examples/hermes.env.
  3. Start the gateway:
    hermes gateway          # → listening on http://127.0.0.1:8642
  4. Put the same endpoint + key into the app's Settings.

Full walkthrough, including optional MCP tools for richer proactive behavior, is in docs/hermes-setup.md.

The VRM model

The default character is Alicia Solid (ニコニ立体ちゃん), © DWANGO Co., Ltd. Its license does not permit redistribution, so it is not bundledscripts/download-model.sh fetches it for your own local use. See ATTRIBUTION.md and AniCompanion/Resources/VRMModel/LICENSE-AliciaSolid.md.

Using your own VRM

  1. Drop your .vrm file into AniCompanion/Resources/VRMModel/.
  2. Point the app at it — one line in AniCompanion/App/AppState.swift (initializeServices()):
    characterManager.loadModel(named: "YourModel.vrm")
    (Name the file AliciaSolid.vrm and you can skip even this step.)
  3. Rebuild. If the framing is off for your model's proportions, tune the camera live with the W/S/A/D/Q/E/R/F keys and set the result as the default in ThreeVRMRenderView.

What the model needs

three-vrm loads both VRM 0.x and 1.0, and every VRM is humanoid by spec — so posing, idle motion, and the skeletal gesture clips work with any valid model. The rest degrades gracefully:

Feature Requires If the model lacks it
Emotions (16 tags → expressions) Standard expression presets happy / angry / sad / relaxed Face stays neutral; everything else still works
Lip sync The aa mouth viseme (optional ARKit / jawOpen PerfectSync gives finer motion) Mouth doesn't move while speaking
Idle blink The blink expression preset No blinking
Hair / skirt physics Spring bones Hair and cloth stay static

In short: any humanoid VRM loads and animates; the standard expression presets plus the aa viseme are what unlock emotions and lip-sync. Richer models (more expressions, PerfectSync) can be given finer mappings in the three-vrm scene (Resources/ThreeVRM/vrm_scene.js).

How it works

You (text or voice) → streaming chat (agent gateway, SSE) → sentence parser → parallel TTS → ordered playback + lip sync

The streaming reply is split into sentences as tokens arrive; each sentence is synthesized to speech in parallel and queued for ordered playback, while audio amplitude drives the avatar's mouth. Emotion tags in the reply ([happy], [curious], …) switch the avatar's expression.

Architecture details and developer notes are in CLAUDE.md.

Troubleshooting

Symptom Likely cause / fix
xcodegen: command not found brew install xcodegen (see Requirements).
The window opens but the character never appears First launch needs internet (the three-vrm runtime loads from a CDN). Also confirm ./scripts/download-model.sh ran and a .vrm exists in AniCompanion/Resources/VRMModel/.
You type a message and nothing happens Your agent gateway isn't running / reachable. Start it (e.g. hermes gateway) and check the connection indicator in Settings. For Hermes, a 401 means the API Key in Settings doesn't match API_SERVER_KEY.
小光 replies in text but doesn't speak TTS is off or unconfigured — that's fine. For voice, add your MiniMax API Key + Group ID in Settings, or leave TTS disabled.
Voice input does nothing On first use macOS prompts for Microphone and Speech Recognition permission — allow both (System Settings → Privacy & Security).

More runtime diagnostics (health checks, connection states) are in docs/hermes-setup.md.

License

Application source code: MIT — see LICENSE.

Bundled/downloaded assets (the VRM model, animation clips) are third-party works under their own terms — see ATTRIBUTION.md. Notably the default VRM model is not MIT-licensed and is not redistributed by this project.

About

A desktop VRM character (小光) that gives your AI agent a face — chats, speaks, listens, lip-syncs, and emotes on macOS.

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages

  • Swift 78.8%
  • Python 10.3%
  • JavaScript 9.0%
  • Other 1.9%