Supertonic MNN is a high-performance, lightweight multilingual text-to-speech (TTS) library powered by the MNN inference engine. It supports 30+ languages and 10 voice styles, and provides both a CLI and a Python API.
Demo video: https://www.bilibili.com/video/BV1VFqiBSER3
- Multilingual: 30+ languages (English, Korean, Japanese, French, German, Spanish, etc.)
- Fast Inference: real-time factor (RTF) of about 0.07 on CPU
- Multiple Model Versions: v2 (en, ko, es, pt, fr) and v3 (30+ languages)
- Precision Options: fp32, fp16, int8
- 10 Voice Styles: M1-M5 (male), F1-F5 (female)
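To make the RTF figure above concrete: the real-time factor is the time spent synthesizing divided by the duration of the audio produced, so values below 1 mean faster than real time. The sketch below shows only that arithmetic. The function name and the sample numbers (0.7 s of compute, 24 kHz sample rate) are illustrative assumptions, not measurements or parameters of this library.

```python
def real_time_factor(synthesis_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return synthesis time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    audio_seconds = num_samples / sample_rate
    return synthesis_seconds / audio_seconds


# Hypothetical numbers: 0.7 s of compute for 10 s of audio at 24 kHz.
rtf = real_time_factor(0.7, num_samples=240_000, sample_rate=24_000)
print(f"RTF = {rtf:.2f}")  # prints "RTF = 0.07"
```

To measure this on your own machine, one option is to wrap a synthesis call in `time.perf_counter()` and pass the number of returned samples and the returned sample rate, assuming the returned audio is a flat array of samples.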
| Version | Languages | HuggingFace |
|---|---|---|
| v3 | 30+ languages + language-agnostic | yunfengwang/supertonic-tts-mnn |
| v2 | en, ko, es, pt, fr | yunfengwang/supertonic-tts-mnn |
Original ONNX models: Supertone/supertonic-2, Supertone/supertonic-3
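As a small illustration of the table above, the helper below picks a model version from a language code. It is a sketch, not part of the library: the name `pick_version` is hypothetical, and preferring v2 for the five languages it covers is an arbitrary choice, since v3 covers them as well.

```python
# Language codes listed for the v2 model in the table above.
V2_LANGS = frozenset({"en", "ko", "es", "pt", "fr"})


def pick_version(lang: str) -> str:
    """Return "v2" for languages the v2 model lists, otherwise "v3".

    Hypothetical helper: it does not check that v3 actually supports `lang`.
    """
    return "v2" if lang.lower() in V2_LANGS else "v3"


for code in ("en", "ja", "KO"):
    print(code, "->", pick_version(code))
```

The returned string can then be passed as the `version` argument or the `--version` flag shown in the usage examples.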
Requires Python 3.10 (MNN constraint).
Installation:

    # Install with uv (recommended)
    uv sync

    # Or install with pip
    pip install supertonic-mnn

To run the Gradio demo:

    uv sync --group demo
    uv run python app.py

Command-line usage:

    # English (default)
    echo "Hello world" | uv run supertonic-mnn -o hello.wav

    # Korean
    echo "안녕하세요" | uv run supertonic-mnn --lang ko -o hello_ko.wav

    # Japanese with v3 model, female voice
    echo "こんにちは" | uv run supertonic-mnn --lang ja --version v3 --voice F1 -o hello_ja.wav

Python API:

    from supertonic_mnn import SupertonicTTS

    tts = SupertonicTTS(version="v3")

    # English
    audio, sr = tts.synthesize("Hello world", lang="en", voice="M1", output_file="hello.wav")

    # Korean
    audio, sr = tts.synthesize("안녕하세요", lang="ko", voice="F1", output_file="hello_ko.wav")

    # Japanese
    audio, sr = tts.synthesize("こんにちは", lang="ja", voice="M2", output_file="hello_ja.wav")

Supported language codes:

en, ko, ja, ar, bg, cs, da, de, el, es, et, fi, fr, hi, hr, hu, id, it, lt, lv, nl, pl, pt, ro, ru, sk, sl, sv, tr, uk, vi, na (language-agnostic)
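When scripting the CLI, it can help to validate the language code and voice before launching a process. The sketch below builds the same kind of command line used in the examples in this README, using only the flags that appear there (`--lang`, `--version`, `--voice`, `-o`). The helper names `build_command` and `synthesize_cli` are hypothetical, and treating the language list as valid for every model version is an assumption.

```python
import subprocess

# Language codes copied from the supported-language list in this README.
SUPPORTED_LANGS = frozenset(
    "en ko ja ar bg cs da de el es et fi fr hi hr hu id it lt lv "
    "nl pl pt ro ru sk sl sv tr uk vi na".split()
)
# Voice styles M1-M5 (male) and F1-F5 (female).
VOICES = frozenset(f"{gender}{i}" for gender in "MF" for i in range(1, 6))


def build_command(lang: str = "en", voice: str = "M1",
                  version: str = "v3", output: str = "out.wav") -> list[str]:
    """Return the argv list for one CLI call, rejecting unknown codes early."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    if voice not in VOICES:
        raise ValueError(f"unknown voice style: {voice!r}")
    return ["uv", "run", "supertonic-mnn",
            "--lang", lang, "--version", version,
            "--voice", voice, "-o", output]


def synthesize_cli(text: str, **options: str) -> None:
    """Pipe `text` to the CLI on stdin, mirroring `echo ... | supertonic-mnn`."""
    subprocess.run(build_command(**options), input=text,
                   text=True, encoding="utf-8", check=True)


print(" ".join(build_command(lang="ja", voice="F1", output="hello_ja.wav")))
```

Calling `synthesize_cli("こんにちは", lang="ja", voice="F1", output="hello_ja.wav")` would then run the command, provided `uv` and the package are installed in the current project.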
Full documentation: Supertonic MNN Docs
This project is based on the original Supertonic by Supertone Inc.