MMSpeech

Tools and frameworks for multimodal speech generation and dialogue

🗣️ Speech Synthesis

🧨 [2026] SwanBench-Speech

SwanBench-Speech is a comprehensive benchmark designed to evaluate the performance of long-form speech generation models. It has three key properties:

1) Rich speech scenarios; 2) Comprehensive evaluation dimensions; 3) Valuable insights.

👏 [2025] DiTReducio

DiTReducio is a training-free acceleration framework that compresses computations in DiT-based TTS models through a progressive calibration process.

🔦 [2023] Make-An-Audio

Make-An-Audio is a prompt-enhanced diffusion model for text-to-audio generation that 1) introduces pseudo prompt enhancement with a distill-then-reprogram approach; 2) leverages a spectrogram autoencoder to predict self-supervised audio representations instead of waveforms.

Code: https://github.com/Text-to-Audio/Make-An-Audio

🔥 [2021] FastSpeech1&2

FastSpeech proposes a novel feed-forward network based on Transformer to generate mel-spectrograms in parallel for TTS.

Code: https://github.com/ming024/FastSpeech2
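
As a rough illustration of what makes parallel generation possible: the FastSpeech paper describes a length regulator that expands phoneme-level hidden states to frame-level length using per-phoneme durations, so all mel frames can then be decoded at once. The sketch below is a minimal pure-Python version of that expansion step. The function name `length_regulate` and the toy inputs are illustrative assumptions, not the API of the linked repository.

```python
from typing import List, Sequence


def length_regulate(
    hidden: Sequence[Sequence[float]],
    durations: Sequence[int],
) -> List[List[float]]:
    """Repeat the i-th phoneme vector durations[i] times.

    hidden:    one feature vector per phoneme (toy stand-in for encoder output)
    durations: number of mel frames assigned to each phoneme
    """
    if len(hidden) != len(durations):
        raise ValueError("expected exactly one duration per phoneme")
    frames: List[List[float]] = []
    for vec, dur in zip(hidden, durations):
        if dur < 0:
            raise ValueError("durations must be non-negative")
        # Copy the vector once per frame; a duration of 0 drops the phoneme.
        frames.extend(list(vec) for _ in range(dur))
    return frames


# Toy example: three phonemes expanded to 2 + 0 + 3 = 5 frames.
hidden = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
durations = [2, 0, 3]
frames = length_regulate(hidden, durations)
```

Once the sequence has frame-level length, nothing in it depends on previously generated frames, which is why a feed-forward decoder can produce the whole mel-spectrogram in one pass.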

👥 Spoken Dialogue

🎧 Spatial Audio

🔥 [2025] MRSAudio

MRSAudio is a large-scale multimodal spatial audio dataset designed to advance research in spatial audio understanding and generation. MRSAudio spans four distinct components: MRSLife, MRSSpeech, MRSMusic, and MRSSing, covering diverse real-world scenarios.

Code: https://github.com/MRSAudio/MRSAudio_Main
