Sauti Unity Plugin

Native Unity voice-AI plugin. Fully offline. English. Privacy-first. Mic → Whisper → memory + RAG → Qwen3 GGUF → Kokoro → audio. One package. Zero cloud.

What it is

Sauti ("voice" in Swahili) lets a Unity game or VR experience hold a real spoken conversation with an AI character — entirely on the player's device, with no API keys, no cloud bill, and no audio ever leaving the headset.

🎤 Speech in. Whisper Small / Tiny ONNX, English, ~300 ms TTFA on desktop CPU.
🧠 Three-layer memory. Conversation history + temporary KV facts + RAG over a knowledge base you author yourself.
🤖 LLM brain. Qwen3-1.7B GGUF (Q5_K_M) via llama.cpp on every currently supported platform, Quest included; a smaller Quest model (Gemma3-1B) is deferred to a future release.
🔊 Voice out. Kokoro 82M ONNX with 11 voices.
🎮 Drop-in for Unity 6+. Three UPM packages, one Editor menu, done.
🖱️ Two parallel APIs (v1.3+). Pure C# for programmers (new KokoroTtsRunner(...)), drag-and-drop SautiSpeaker/SautiKnowledgeBase/SautiAgent MonoBehaviours + Voice Profile/Knowledge Config/LLM Config ScriptableObjects for designers. Same runtime — choose either.

🎤 Mic  →  Whisper ONNX  →  text  →  Memory (history + RAG + temp KV)  →  Qwen3 GGUF  →  tokens  →  Kokoro ONNX  →  🔊 Audio
            STT                          Three-layer enriched prompt           LLM                           TTS

Two strictly-partitioned runtimes (ONNX Runtime + llama.cpp) — they share no memory and no GPU context, only C# strings. See memory/voice_ai_architecture.md for the full spec.
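
To make the "three-layer enriched prompt" stage concrete, here is a minimal, language-neutral sketch in Python of how the three memory layers could be flattened into one string before it is handed to the LLM. It is not Sauti's C# implementation: the function name `build_prompt`, the section labels, and the `max_turns` cut-off are all assumptions made for illustration.

```python
def build_prompt(history, facts, rag_snippets, user_text, max_turns=4):
    """Flatten the three memory layers into one prompt string (hypothetical layout)."""
    lines = []
    if rag_snippets:                                  # L3: retrieved knowledge-base passages
        lines.append("Knowledge:")
        lines += [f"- {snippet}" for snippet in rag_snippets]
    if facts:                                         # L2: session-scoped key/value facts
        lines.append("Facts:")
        lines += [f"- {key}: {value}" for key, value in sorted(facts.items())]
    recent = history[-max_turns:] if max_turns > 0 else []
    for speaker, text in recent:                      # L1: most recent conversation turns
        lines.append(f"{speaker}: {text}")
    lines.append(f"Player: {user_text}")
    lines.append("NPC:")                              # the LLM continues from here
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt(
        history=[("Player", "Hi"), ("NPC", "Hello, traveller.")],
        facts={"player_name": "Ada"},
        rag_snippets=["Frostmere is a frozen city."],
        user_text="Where am I?",
    ))
```

Because the result is a plain string, it fits the constraint above that the two runtimes exchange nothing but strings.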

Quick install

You have two ways to consume Sauti.

A. Clone the full repo (recommended for first explore)

git clone https://github.com/SeedeXR/sauti-unity-plugin.git
cd sauti-unity-plugin
# Then: Unity Hub → Add project from disk → select this folder

B. Install as a UPM package (recommended for downstream projects)

One command — tools/setup-sauti.sh (macOS/Linux/WSL) — handles all three install steps + the model downloads:

# From a checked-out copy of this repo:
./tools/setup-sauti.sh --project-path /path/to/YourUnityProject

# Or, if you only have the script and want a fresh install:
curl -fsSL https://raw.githubusercontent.com/SeedeXR/sauti-unity-plugin/main/tools/setup-sauti.sh -o setup-sauti.sh
chmod +x setup-sauti.sh
./setup-sauti.sh --project-path /path/to/YourUnityProject

What it does, in order:

Writes the bootstrap to Packages/manifest.json — Sauti dep (via Git URL by default) + npmjs scoped registry + com.github.asus4.onnxruntime peer. Idempotent — re-runs are no-ops if the entries are already present.
Invokes Unity in batchmode to run Sauti.Editor.Setup.SautiSetupWizard.FixAllHeadless — adds the remaining peer deps (LLMUnity, whisper.unity, Collections, Mathematics) and writes the scripting-define symbols across Standalone/Android/iOS/WebGL.
Downloads the AI models from Hugging Face into <project>/Assets/StreamingAssets/VoiceAI/ with SHA-256 verification. Default profile (--models essential, ~1.4 GB): Kokoro 82M + 1 voice + MiniLM + Whisper Tiny + Qwen3-1.7B. --models all adds the other 10 voices + Whisper Small. --models none skips downloads.
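
The idempotent bootstrap in the first step can be pictured with the short Python sketch below. It is an illustration only, not the logic of setup-sauti.sh (which is a shell script): the function name `bootstrap_manifest` is hypothetical, and the package names, version, and registry URL are copied from the manual bootstrap shown later in this README.

```python
SAUTI_GIT_URL = (
    "https://github.com/SeedeXR/sauti-unity-plugin.git"
    "?path=packaging/com.sauti.voice-ai"
)
NPM_REGISTRY_URL = "https://registry.npmjs.com"
ONNX_SCOPE = "com.github.asus4"


def bootstrap_manifest(manifest):
    """Add the Sauti bootstrap entries in place; return True if anything changed."""
    changed = False

    # Dependencies: only add what is missing, never overwrite a pinned version.
    deps = manifest.setdefault("dependencies", {})
    wanted = {
        "com.sauti.voice-ai": SAUTI_GIT_URL,
        "com.github.asus4.onnxruntime": "0.4.7",
    }
    for name, version in wanted.items():
        if name not in deps:
            deps[name] = version
            changed = True

    # Scoped registry: reuse an existing npmjs entry if there is one.
    registries = manifest.setdefault("scopedRegistries", [])
    npmjs = next((r for r in registries if r.get("url") == NPM_REGISTRY_URL), None)
    if npmjs is None:
        registries.append(
            {"name": "npmjs", "url": NPM_REGISTRY_URL, "scopes": [ONNX_SCOPE]}
        )
        changed = True
    elif ONNX_SCOPE not in npmjs.setdefault("scopes", []):
        npmjs["scopes"].append(ONNX_SCOPE)
        changed = True

    return changed
```

Calling the function a second time on the same dictionary returns False and leaves it untouched, which is the "re-runs are no-ops" behaviour described above. In a real script the dictionary would be loaded from and written back to Packages/manifest.json with the standard json module.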

Common variants:

# Local tarball install (no internet for Sauti's source — still needs HF for models):
./tools/setup-sauti.sh --project-path <proj> \
    --source tarball --tarball dist/com.sauti.voice-ai-1.3.2.tgz

# Just verify the model files match their SHA-256s, don't redownload anything:
./tools/setup-sauti.sh --project-path <proj> --no-bootstrap --no-wizard --verify

# Bootstrap only, defer the rest:
./tools/setup-sauti.sh --project-path <proj> --no-wizard --models none

Run ./tools/setup-sauti.sh --help for the full option list.
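
If you want to run the same kind of integrity check from your own CI tooling, the sketch below shows one way to do it in Python with only the standard library. The function names and the shape of the `expected` mapping (relative path to hex digest) are assumptions for illustration; the digests themselves would come from wherever your pipeline records them, and this is not the code behind --verify.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB blocks so multi-gigabyte models never sit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_models(root, expected):
    """Return the relative paths that are missing or whose digest differs."""
    failed = []
    for rel_path, want in expected.items():
        file_path = Path(root) / rel_path
        if not file_path.is_file() or sha256_of(file_path) != want.lower():
            failed.append(rel_path)
    return failed
```

An empty list from `verify_models` means every listed file is present and matches its recorded digest.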

Or do it manually — three lines + one click

Step 1 — Paste this bootstrap into your project's Packages/manifest.json:

{
  "scopedRegistries": [
    {
      "name": "npmjs",
      "url": "https://registry.npmjs.com",
      "scopes": ["com.github.asus4"]
    }
  ],
  "dependencies": {
    "com.sauti.voice-ai":           "https://github.com/SeedeXR/sauti-unity-plugin.git?path=packaging/com.sauti.voice-ai",
    "com.github.asus4.onnxruntime": "0.4.7"
  }
}

That's the minimum bootstrap — Sauti itself + the one peer dep + the scoped registry Unity needs to find it.

Step 2 — Open the project in Unity. First import takes 1–3 min (Git clone + UPM resolution). The Sauti Setup Wizard auto-opens; if it doesn't, run Sauti → Verify Setup from the menu bar.

Step 3 — Click "Fix everything I can" in the wizard. It writes the remaining peer deps (LLMUnity, whisper.unity, Unity Collections, Unity Mathematics) into your manifest.json and sets the two scripting-define symbols (SAUTI_LLMUNITY_AVAILABLE, SAUTI_WHISPER_UNITY_AVAILABLE) across Standalone/Android/iOS/WebGL. Unity re-resolves packages once.

Step 4 — Download the ~1.6 GB of AI models. The wizard cannot fetch these itself (Hugging Face license walls), so either run tools/setup-sauti.sh as described under option B, which downloads them with SHA-256 verification, or clone the source repo and copy Assets/StreamingAssets/VoiceAI/ into your project.

Headless / CI install

Same logic as the GUI wizard, no dialogs:

unity -batchmode -quit -projectPath <project> \
  -executeMethod Sauti.Editor.Setup.SautiSetupWizard.FixAllHeadless

Alternative install methods

Tarball file: download com.sauti.voice-ai-<version>.tgz from Releases, put it under Packages/tarballs/, replace the Git URL in Step 1 with "file:tarballs/com.sauti.voice-ai-1.3.2.tgz".
Package Manager GUI: Window → Package Manager → ➕ → Install package from tarball → select the .tgz. You still need the scoped registry + ONNX line from Step 1.
Build the tarball yourself: tools/package-sauti.sh --skip-tests from a checked-out source repo → dist/com.sauti.voice-ai-<version>.tgz.

Quickstart (5 min)

# 1. Open project in Unity (auto-imports ~1.6 GiB of AI models from ai-models/)
# 2. Build the RAG knowledge base:
#    Menu: Sauti → Build Knowledge Base
# 3. Open one of the six experiment scenes:
#    experiments/01-tts-hello/HelloScene.unity  (smallest — just text-to-speech)
#    experiments/05-full-voice-loop/VoiceLoopScene.unity  (the integrated demo)
# 4. Press Play.

See the Quickstart guide for the full walkthrough.

What you get

For game designers

No-code path: drop in a JSON template, set a voice id, ship.

NPC dialogue — single character, configurable persona / voice / knowledge tag
Quest narrator — branching world narrator with chapter cues
Voice command routing — speech → game action mapping
VR companion — location-aware persistent companion (Quest)
Knowledge feed — bulk ingestion of game lore into the RAG database
Structured output — let the LLM trigger deterministic game mechanics
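
The structured-output and voice-command ideas in the list above boil down to one pattern: the LLM emits a small JSON object, and the game runs an action only if its name is on a whitelist. The Python sketch below illustrates that pattern in a language-neutral way. The JSON shape (`action` plus `args`) and the function name `dispatch_action` are assumptions, not Sauti's template format.

```python
import json


def dispatch_action(llm_output, handlers):
    """Run a whitelisted handler named by the LLM's JSON output, else return None."""
    try:
        payload = json.loads(llm_output)
    except json.JSONDecodeError:
        return None                       # not JSON: treat it as ordinary dialogue
    if not isinstance(payload, dict):
        return None
    name = payload.get("action")
    args = payload.get("args", {})
    if not isinstance(name, str) or not isinstance(args, dict):
        return None
    handler = handlers.get(name)
    if handler is None:
        return None                       # unknown actions are ignored, never guessed
    return handler(**args)


if __name__ == "__main__":
    opened = []
    handlers = {"open_door": lambda door_id: opened.append(door_id) or "opened"}
    reply = '{"action": "open_door", "args": {"door_id": "north"}}'
    print(dispatch_action(reply, handlers), opened)
```

Keeping the handler table on the game side is what makes the mechanics deterministic: the model can only choose among actions the designer has registered.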

→ Designer guide

For Unity developers

Code-first path: composable subsystems with clean C# interfaces.

Sauti.Memory.TemporaryMemory — session-scoped KV facts
Sauti.Memory.SautiRag — injectable RAG retrieval wrapper
Sauti.Editor.Rag.KnowledgeBaseChunker — paragraph-boundary chunker
Sauti.Editor.Rag.MiniLmRagEmbedder — 384-dim sentence-transformer embedder
Sauti.Tts.KokoroTtsRunner — Kokoro 82M TTS with 11 built-in voices
Sauti.Editor.Rag.RagDatabaseBuilder — [MenuItem("Sauti/Build Knowledge Base")]

All subsystems are dependency-injectable, fence upstream packages behind preprocessor symbols (SAUTI_LLMUNITY_AVAILABLE, SAUTI_WHISPER_UNITY_AVAILABLE), and are covered by 50 NUnit EditMode tests.
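
For readers new to RAG, the two core ideas behind the chunker and the retrieval wrapper listed above (split on paragraph boundaries, then rank chunks by vector similarity) can be sketched in a few lines of Python. This is a conceptual illustration, not the C# code in Sauti.Editor.Rag or Sauti.Memory: the function names, the `max_chars` limit, and the choice of cosine similarity are assumptions, and the vectors would in practice come from the 384-dimensional MiniLM embedder.

```python
import math
import re


def chunk_paragraphs(text, max_chars=500):
    """Split on blank lines, then pack whole paragraphs into chunks of <= max_chars."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate           # a single oversized paragraph stays whole
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, indexed, k=3):
    """indexed is a list of (chunk_text, vector); return the k most similar texts."""
    ranked = sorted(indexed, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

Splitting only at paragraph boundaries keeps each retrieved chunk readable on its own, which matters because the chunk text is pasted into the prompt verbatim.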

→ Developer guide

Six runnable experiments

Each is a Unity scene with a single MonoBehaviour orchestrator + a README explaining what it proves.

#	Experiment	Demonstrates
1	`01-tts-hello`	Type → Kokoro → audio
2	`02-stt-loopback`	Push-to-talk → Whisper → text
3	`03-llm-chat`	Text → Qwen3 → streamed tokens + sentence events
4	`04-rag-grounding`	A/B toggle proving RAG changes the LLM's answer
5	`05-full-voice-loop`	The integrated headline demo
6	`06-vr-quest-npc`	Spatialised VR NPC on Quest with controller trigger
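
The "sentence events" in experiment 03 matter because TTS can start speaking as soon as the first full sentence exists, rather than waiting for the whole reply. A deliberately naive Python sketch of that buffering step follows; the function name and the terminator set are assumptions, and it treats every `.`, `!`, or `?` as a sentence end, so abbreviations and ellipses would be split incorrectly.

```python
def sentences_from_tokens(tokens, terminators=".!?"):
    """Yield each complete sentence as soon as its terminator has streamed in."""
    buffer = ""
    for token in tokens:
        buffer += token
        while True:
            # Index of the first terminator currently in the buffer, or -1.
            cut = next((i for i, ch in enumerate(buffer) if ch in terminators), -1)
            if cut == -1:
                break
            sentence, buffer = buffer[:cut + 1].strip(), buffer[cut + 1:]
            if sentence:
                yield sentence
    if buffer.strip():
        yield buffer.strip()              # flush trailing text with no terminator


if __name__ == "__main__":
    stream = ["Hel", "lo the", "re. How", " are you", "? Fine"]
    for sentence in sentences_from_tokens(stream):
        print(sentence)
```

Because it is a generator, each sentence is available to the caller the moment it completes, while later tokens are still arriving.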

→ Experiments overview

Privacy & offline-first

No internet connection required or used at runtime.
No telemetry, no analytics, no model downloads after install.
All four models live on disk in Assets/StreamingAssets/VoiceAI/ and load from there.
User audio and conversation history stay on the device. Per-session memory clears on app exit.
Android caveat: models copy from the compressed .jar to Application.persistentDataPath on first launch.

Platform support

Platform	STT	LLM	Embeddings	TTS
Windows / macOS / Linux	Whisper Small	Qwen3-1.7B Q5_K_M	MiniLM	Kokoro
iOS / Android (flagship)	Whisper Small	Qwen3-1.7B Q5_K_M	MiniLM	Kokoro
Meta Quest 2 / 3	Whisper Tiny	Qwen3-1.7B Q5_K_M*	MiniLM	Kokoro
Android (low-end)	Whisper Tiny	Qwen3-1.7B Q5_K_M*	MiniLM	Kokoro

* v1.2 Quest path uses Qwen3-1.7B (1.26 GB; tight on Quest 3's 8 GB RAM but functional). Gemma3-1B Q4_K_M was the original Quest pick but is deferred to a future release pending Gemma TOS acceptance. See per-platform notes.

Project status

Engineered + tested. All four pipeline stages compile cleanly in Unity 6.4. 50/50 EditMode tests pass. Real knowledge.db builds in 226 ms from the Frostmere sample knowledge base. Scene assembly + hardware validation on Quest are the remaining human-side tasks.

See SHIP_READINESS.md for the step-by-step go-live guide.

Surface	State
Compile	✓ 0 errors, 0 warnings
EditMode tests (Sauti)	✓ 50 / 50 pass — Unit 35, Integration 6, Regression 9
Upstream tests (whisper.unity, onnxruntime-unity)	✓ 3 / 3
Knowledge.db build	✓ End-to-end against real MiniLM weights
Six experiment scaffolds	✓ Code + READMEs + scene-creation guides
UPM tarball build (`tools/package-sauti.sh`)	✓ End-to-end, 88 KB tarball, SHA-256 emitted
GitHub Actions: docs + package	✓ Wired to `main` push + `v*` tag
Six `.unity` scene files	⏳ Manual creation (Editor GUI)
Quest hardware validation	⏳ Needs physical device

Documentation

Topic	Where
Canonical pipeline spec	memory/voice_ai_architecture.md
Ship readiness checklist	SHIP_READINESS.md
Full docs site (mkdocs)	https://SeedeXR.github.io/sauti-unity-plugin
Session log (audit trail)	memory/handover_session.md
Memory + agent files	memory/ (15 docs)
Per-experiment guides	experiments/*/README.md

Repository map

sauti-unity-plugin/
├── Assets/                              Unity asset tree (repo root is the Unity project)
│   ├── Sauti/Runtime/                   C# memory + TTS runner subsystems
│   ├── Sauti/Editor/                    MiniLM embedder + RAG menu builder
│   ├── Sauti/Tests/Editor/              50 NUnit EditMode tests (unit + integration + regression)
│   └── StreamingAssets/VoiceAI/         1.6 GiB of AI models (runtime location)
├── Packages/manifest.json               6 UPM dependencies (auto-fetched)
├── ProjectSettings/                     Unity project config
├── packaging/com.sauti.voice-ai/        UPM package source (Runtime/, Editor/, Tests/, Samples~/, Documentation~/)
├── tools/                               Build and setup scripts (package-sauti.sh, setup-sauti.sh)
├── ai-models/                           Source-of-truth model checkout
├── docs/                                MkDocs source tree (this docs site)
├── experiments/                         Six runnable demos
├── knowledge-base/                      Plain-text source for the RAG database
├── memory/                              Append-only doc + session log
├── templates/                           JSON narrative templates
├── instructions/                        Engineering operations guide
├── .github/workflows/                   docs.yml + package.yml
├── mkdocs.yml                           Docs site config
├── README.md                            This file
└── SHIP_READINESS.md                    Step-by-step go-live guide

Architecture at a glance

┌──────────────────────────────────────────────────────────────────┐
│                       Sauti voice-AI pipeline                     │
│                                                                   │
│  ┌──────────┐  ┌─────────────────┐  ┌─────────┐  ┌────────────┐  │
│  │ Whisper  │→ │ Three-Layer     │→ │ Qwen3   │→ │ Kokoro     │  │
│  │ STT ONNX │  │ Memory:         │  │ GGUF    │  │ TTS ONNX   │  │
│  │          │  │ • L1 history    │  │         │  │            │  │
│  │          │  │ • L2 KV facts   │  │         │  │            │  │
│  │          │  │ • L3 RAG (MiniLM│  │         │  │            │  │
│  │          │  │   over knowledge│  │         │  │            │  │
│  │          │  │   .db)          │  │         │  │            │  │
│  └──────────┘  └─────────────────┘  └─────────┘  └────────────┘  │
│       │                                                  │        │
│       └────────────────  String only  ──────────────────┘        │
│                                                                   │
│  ┌───────────────────────────────┐ ┌─────────────────────────┐  │
│  │ ONNX Runtime                  │ │ llama.cpp (LLMUnity)    │  │
│  │ (asus4/onnxruntime-unity)     │ │ (undreamai/LLMUnity)    │  │
│  │ STT • Embeddings • TTS        │ │ LLM only                │  │
│  │ DirectML│CoreML│NNAPI│CUDA    │ │ Metal│Vulkan│NEON│CPU   │  │
│  └───────────────────────────────┘ └─────────────────────────┘  │
│  ── no shared memory · no shared GPU context · strings only ──   │
└──────────────────────────────────────────────────────────────────┘

Contributing

Sauti is built on a session-based workflow with append-only handover logs. See contributing and memory/handover_session.md for the audit trail.

License

Apache 2.0. See LICENSE (TBD — Apache-2.0 confirmed per memory/project_context.md § 1).

Each bundled AI model has its own license, recorded per-entry in ai-models/<stage>/manifest.json:

Model	License
Whisper Small / Tiny INT8	MIT
Qwen3-1.7B Q5_K_M	Apache-2.0
all-MiniLM-L6-v2 INT8	Apache-2.0
Kokoro 82M INT8 + voices	Apache-2.0

Credits

Whisper by OpenAI · ONNX export by onnx-community
Qwen3 by Alibaba · GGUF quant by unsloth
all-MiniLM-L6-v2 by sentence-transformers · INT8 by Xenova
Kokoro 82M · ONNX by onnx-community
whisper.unity by Macoron
LLMUnity by undreamai
onnxruntime-unity by asus4
