speech-llms

Here are 6 public repositories matching this topic...

EmulationAI / awesome-large-audio-models

Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

music-information-retrieval automatic-speech-recognition speech-to-text audio-processing music-ai music-processing large-language-models foundational-models speech-ai audio-ai large-audio-models speech-llms large-language-model-speech

Updated May 10, 2026

georgygospodinov / speech_course

Deep Learning for Speech

deep-learning tts speech-recognition speaker-recognition keyword-spotting asr self-supervised-learning speech-llms

Updated Dec 21, 2025
Jupyter Notebook

ictnlp / FastLongSpeech

FastLongSpeech is a novel framework designed to extend the capabilities of Large Speech-Language Models for efficient long-speech processing without necessitating dedicated long-speech training data.

speech speech-recognition speech-to-text multi-modal speech-processing spoken-language-understanding speech-emotion-recognition large-language-models llms llm-training qwen speech-llms large-speech-models multi-modal-llms qwen2-5 spoken-dialogue-models

Updated Jul 22, 2025
Python

ictnlp / LSG

The code for AAAI 2025 “Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation”

machine-translation speech transformers speech-to-text asr simultaneous-translation large-language-models simultaneous-machine-translation text-to-text text-to-text-generation speech-llms streaming-generation qwen-audio

Updated Jan 3, 2025
Python

Telefonica-Scientific-Research / flower_speech_llm

Federated learning for Speech LLMs (WavLM+TinyLlama & Voxtral) with Flower and PyTorch Lightning. LoRA fine-tuning across MLS clients with FedAvg/FedProx, IID and speaker-based partitioning, per-round LR decay, W&B logging, and multi-GPU Ray simulation. Only adapter weights are shared; raw audio never leaves the client.

speech federated-learning end-to-end-speech-recognition llms speech-llms

Updated May 24, 2026
Python

pro6692abou / llm-audio

Provides Whisper-based audio transcription and translation through lightweight C++ libraries for easy integration into LLM projects.

text music-information-retrieval neural-networks speech-to-text text-to-image music-ai large-language-models foundational-models speech-ai vision-language-model audio-language large-vision-language-models large-audio-models speech-llms audio-understanding

Updated May 26, 2026
C++

