feat(livekit): mix loading-indicator into the single published audio track by tigranbs · Pull Request #19 · SaynaAI/sayna

tigranbs · 2026-05-30T00:28:20Z

Summary

Publishes the loading-indicator audio as part of the single agent audio track, mixed in server-side, instead of on a separate loading-audio track.

LiveKit is an SFU: it forwards each published track independently and never mixes audio server-side. Many subscribers only ever play one audio track, for example custom browser clients that attach a single <audio> element and SIP bridges that down-mix to one phone stream. A dedicated second track therefore never reached them. This change mixes the loading clip into the one published tts-audio track so every single-track subscriber hears it.

How it works

A single audio "pump" task is the sole writer to the published NativeAudioSource:

Idle / speaking: TTS PCM is captured pass-through — byte-identical output to before.
Indicator active: the pump emits continuous 10 ms frames from the looping clip and sums in any buffered TTS (saturating i16 add — the same approach as LiveKit's own AudioMixer), with a short fade-out on stop.
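The indicator-active mode can be illustrated with a minimal sketch. It is not the code from this PR: the function names (`mix_frame`, `fade_out`), the mono layout, and the linear fade curve are assumptions made for illustration. Only the saturating i16 sum and the looping clip come from the description above.

```rust
/// Mix one output frame: the looping clip plus any buffered TTS samples.
/// `clip_pos` is advanced and wrapped so the clip loops continuously.
/// TTS shorter than the frame is treated as silence for the remainder.
fn mix_frame(clip: &[i16], clip_pos: &mut usize, tts: &[i16], out: &mut [i16]) {
    for (i, slot) in out.iter_mut().enumerate() {
        // An empty clip contributes silence instead of panicking.
        let bed = if clip.is_empty() {
            0
        } else {
            let sample = clip[*clip_pos % clip.len()];
            *clip_pos = (*clip_pos + 1) % clip.len();
            sample
        };
        let speech = tts.get(i).copied().unwrap_or(0);
        // Saturating add clamps to the i16 range instead of wrapping around.
        *slot = bed.saturating_add(speech);
    }
}

/// Apply a linear fade-out in place (assumed curve; the PR only says "short fade-out").
fn fade_out(frame: &mut [i16]) {
    let n = frame.len() as i32;
    for (i, sample) in frame.iter_mut().enumerate() {
        let gain = n - i as i32; // runs from n down to 1
        *sample = ((*sample as i32 * gain) / n) as i16;
    }
}
```

At 48 kHz mono, `out` would hold 480 samples per 10 ms frame. The saturating sum matters because two loud inputs would otherwise wrap around and produce a loud click rather than a clipped peak.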

Both modes live in one task, so there is never a second writer racing the source (two writers on one NativeAudioSource interleave frames rather than mixing).
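The single-writer rule can be sketched with a channel whose only consumer owns the source. Everything named here (`PumpMsg`, `FakeSource`, `run_pump`, `spawn_pump`) is hypothetical, and `FakeSource` stands in for the real NativeAudioSource. The sketch is also event-driven for brevity, whereas the pump described above emits frames continuously every 10 ms while the indicator is active.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::thread;

/// Messages sent to the pump (hypothetical names).
enum PumpMsg {
    Tts(Vec<i16>),
    LoadingStart,
    LoadingStop,
}

/// Stand-in for the published audio source; it records each frame written.
struct FakeSource {
    frames: Vec<Vec<i16>>,
}

impl FakeSource {
    fn capture(&mut self, frame: Vec<i16>) {
        self.frames.push(frame);
    }
}

/// The pump owns the source outright, so no other code path can write to it.
fn run_pump(rx: Receiver<PumpMsg>, clip: Vec<i16>) -> FakeSource {
    let mut source = FakeSource { frames: Vec::new() };
    let mut loading = false;
    let mut clip_pos = 0usize;
    for msg in rx {
        match msg {
            PumpMsg::LoadingStart => loading = true,
            PumpMsg::LoadingStop => loading = false,
            // Idle / speaking: pass TTS through unchanged.
            PumpMsg::Tts(pcm) if !loading || clip.is_empty() => source.capture(pcm),
            // Indicator active: sum the looping clip under the speech.
            PumpMsg::Tts(pcm) => {
                let mixed = pcm
                    .iter()
                    .map(|&s| {
                        let bed = clip[clip_pos];
                        clip_pos = (clip_pos + 1) % clip.len();
                        bed.saturating_add(s)
                    })
                    .collect();
                source.capture(mixed);
            }
        }
    }
    source
}

/// Spawn the pump; the bounded channel makes `send` block when the pump falls behind.
fn spawn_pump(clip: Vec<i16>, depth: usize) -> (SyncSender<PumpMsg>, thread::JoinHandle<FakeSource>) {
    let (tx, rx) = sync_channel(depth);
    let handle = thread::spawn(move || run_pump(rx, clip));
    (tx, handle)
}
```

Because mode changes and TTS audio arrive on the same channel, they are applied in order by one thread, and frames from the two modes cannot interleave.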

Fixes included

No truncated speech. The TTS buffer now back-pressures the producer instead of dropping the oldest samples, so rapid multi-sentence speak output plays in full (previously the middle could be cut off).
No crash on non-standard sample rates. The source queue depth is computed as a valid non-zero multiple of 10 ms for any sample rate (e.g. 44.1 kHz no longer trips the libwebrtc assertion).
Client-supplied loading clips of any supported rate/channel layout are resampled once, at load time, to the published track format (adds the rubato dependency).
Reconnect / disconnect / teardown simplified to the single source + pump.
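The queue-depth fix can be illustrated with a small sketch of the arithmetic. The PR does not state the exact condition libwebrtc asserts, so the rounding rule and the function names below (`samples_per_10ms`, `queue_depth_ms`, `queue_depth_samples`) are assumptions, not the merged implementation.

```rust
/// Samples per channel in one 10 ms frame, or `None` when the rate is zero
/// or does not divide into whole 10 ms frames.
fn samples_per_10ms(sample_rate: u32) -> Option<u32> {
    if sample_rate == 0 || sample_rate % 100 != 0 {
        None
    } else {
        Some(sample_rate / 100)
    }
}

/// Round a requested depth up to the nearest non-zero multiple of 10 ms.
fn queue_depth_ms(requested_ms: u32) -> u64 {
    const FRAME_MS: u32 = 10;
    // At least one frame, so the depth is never zero.
    let frames = requested_ms.div_ceil(FRAME_MS).max(1);
    u64::from(frames) * u64::from(FRAME_MS)
}

/// Queue depth in samples per channel: a whole number of 10 ms frames.
fn queue_depth_samples(sample_rate: u32, requested_ms: u32) -> Option<u64> {
    let per_frame = u64::from(samples_per_10ms(sample_rate)?);
    Some(per_frame * (queue_depth_ms(requested_ms) / 10))
}
```

At 44.1 kHz a 10 ms frame is 441 samples, so a depth derived from a round sample count rather than from whole frames would not land on a frame boundary. Deriving the depth from the frame count avoids that for any rate that yields whole 10 ms frames.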

Compatibility

Replaces the dedicated loading-audio track; an audio session now publishes exactly one audio track (tts-audio) whether or not a loading clip is configured.
WebSocket API unchanged: the loading_audio config and loading_start / loading_stop commands behave the same; clients that don't use them are unaffected.

Testing

cargo fmt and cargo clippy --all-targets --all-features -- -D warnings clean.
Full unit/integration suite green, including new resampler and mixer unit tests; the libwebrtc-native tests pass via the dedicated isolated CI step.
Docs (docs/websocket.md, docs/api-reference.md, docs/livekit_integration.md) updated to the single-track model.

…dio track

Replace the dedicated "loading-audio" track with one published "tts-audio" track that mixes the loading-indicator clip into the TTS stream server-side, so single-track subscribers (browser clients, SIP bridges) hear it. LiveKit is an SFU and never mixes tracks server-side, and many clients play only one audio track, so a second track never reached them.

A single audio pump task is the sole writer to the source: TTS pass-through when idle, and continuous 10ms mixing (saturating i16 sum) of the looping clip under speech when active, with a short fade-out on stop.

- Resample client-supplied loading clips to the track format once at load time (rubato).
- Back-pressure the TTS producer instead of dropping buffered audio, so rapid multi-sentence speech is no longer truncated.
- Compute the source queue depth as a valid non-zero multiple of 10ms for any sample rate (fixes a crash at 44.1 kHz).
- Collapse the client to one source/track/pump; simplify reconnect/teardown.
- Update tests and docs to the single-track model.

tigranbs force-pushed the loading-indicator branch from 8aecd4a to e8d6662 Compare May 30, 2026 00:48

tigranbs merged commit 2be27ae into master May 31, 2026
1 check passed

tigranbs deleted the loading-indicator branch May 31, 2026 02:00

tigranbs merged 1 commit into master from loading-indicator
