feat(livekit): mix loading-indicator into the single published audio track #19

Merged
tigranbs merged 1 commit into master from loading-indicator
May 31, 2026
@tigranbs

Summary

Publishes the loading-indicator audio as part of the single agent audio track, mixed in server-side by the agent before publishing, instead of on a separate loading-audio track.

LiveKit is an SFU — it forwards each published track independently and never mixes audio server-side. Many subscribers only ever play one audio track: custom browser clients that attach a single <audio> element, and SIP bridges that down-mix to one phone stream. A dedicated second track therefore never reached them. This change mixes the loading clip into the one published tts-audio track so every single-track subscriber hears it.

How it works

A single audio "pump" task is the sole writer to the published NativeAudioSource:

  • Idle / speaking: TTS PCM is captured pass-through — byte-identical output to before.
  • Indicator active: the pump emits continuous 10 ms frames from the looping clip and sums in any buffered TTS (saturating i16 add — the same approach as LiveKit's own AudioMixer), with a short fade-out on stop.

Both modes live in one task, so there is never a second writer racing the source (two writers on one NativeAudioSource interleave frames rather than mixing).
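The indicator-active path can be illustrated with a minimal sketch. It is not the code from this PR: `mix_frame` and `fade_out` are hypothetical helper names, the stream is assumed to be mono i16 PCM, and the fade is assumed to be linear over a single frame. It only shows the saturating sum of the looping clip with buffered TTS, plus a fade-out applied to the last frame.

```rust
/// Fill `out` with one mixed frame (hypothetical helper, not the PR's code).
/// The clip loops from `*pos`; TTS samples are summed in where available and
/// treated as silence once `tts` runs out.
fn mix_frame(clip: &[i16], pos: &mut usize, tts: &[i16], out: &mut [i16]) {
    for (i, slot) in out.iter_mut().enumerate() {
        // Read the next clip sample and wrap the cursor; an empty clip
        // (or an out-of-range cursor) contributes silence.
        let indicator = match clip.get(*pos) {
            Some(&s) => {
                *pos = (*pos + 1) % clip.len();
                s
            }
            None => 0,
        };
        let speech = tts.get(i).copied().unwrap_or(0);
        // Saturating add clamps to i16::MIN..=i16::MAX instead of wrapping.
        *slot = indicator.saturating_add(speech);
    }
}

/// Apply a linear fade-out in place (assumed fade shape): gain starts at 1.0
/// on the first sample and decreases by 1/len on each following sample.
fn fade_out(frame: &mut [i16]) {
    let n = frame.len();
    for (i, s) in frame.iter_mut().enumerate() {
        let gain = (n - i) as f32 / n as f32;
        *s = (*s as f32 * gain) as i16;
    }
}
```

For example, mixing the clip `[10_000, -10_000, 20_000]` with TTS samples `[30_000, -30_000]` into a four-sample frame yields `[32767, -32768, 20000, 10000]`: the first two sums clamp at the i16 limits, and the clip wraps around for the fourth sample.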

Fixes included

  • No truncated speech. The TTS buffer now back-pressures the producer instead of dropping the oldest samples, so rapid multi-sentence speak output plays in full (previously the middle could be cut off).
  • No crash on non-standard sample rates. The source queue depth is computed as a valid non-zero multiple of 10 ms for any sample rate (e.g. 44.1 kHz no longer trips the libwebrtc assertion).
  • Client-supplied loading clips of any supported rate/channel layout are resampled to the published track format once at load time (adds the rubato crate).
  • Reconnect / disconnect / teardown simplified to the single source + pump.
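The back-pressure fix in the first bullet can be sketched with a bounded queue. This is an illustration under assumptions, not the PR's implementation: it uses standard-library threads and `std::sync::mpsc::sync_channel` as a stand-in for whatever buffer the pump actually uses, and `stream_with_back_pressure` is a hypothetical name. The point is that a full queue blocks the producer rather than discarding the oldest samples, so every chunk arrives.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Push `chunks` PCM chunks of `chunk_len` samples through a bounded queue
/// holding at most `capacity` chunks, and return the number of samples the
/// consumer received. Hypothetical helper for illustration only.
fn stream_with_back_pressure(chunks: usize, chunk_len: usize, capacity: usize) -> usize {
    let (tx, rx) = sync_channel::<Vec<i16>>(capacity);

    let producer = thread::spawn(move || {
        for n in 0..chunks {
            // `send` blocks while the queue is full, so the producer slows
            // down to the consumer's pace and nothing is dropped.
            tx.send(vec![n as i16; chunk_len]).expect("consumer hung up");
        }
        // `tx` is dropped here, which ends the consumer's loop below.
    });

    let mut received = 0;
    for chunk in rx {
        received += chunk.len();
    }
    producer.join().expect("producer panicked");
    received
}
```

With a queue of only two slots, `stream_with_back_pressure(50, 480, 2)` still returns all `50 * 480` samples, whereas a drop-oldest buffer of the same size would lose audio whenever the producer ran ahead.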

Compatibility

  • Replaces the dedicated loading-audio track; an audio session now publishes exactly one audio track (tts-audio) whether or not a loading clip is configured.
  • WebSocket API unchanged: the loading_audio config and loading_start / loading_stop commands behave the same; clients that don't use them are unaffected.

Testing

  • cargo fmt and cargo clippy --all-targets --all-features -- -D warnings clean.
  • Full unit/integration suite green, including new resampler and mixer unit tests; the libwebrtc-native tests pass via the dedicated isolated CI step.
  • Docs (docs/websocket.md, docs/api-reference.md, docs/livekit_integration.md) updated to the single-track model.

…dio track

Replace the dedicated "loading-audio" track with one published "tts-audio"
track that mixes the loading-indicator clip into the TTS stream server-side,
so single-track subscribers (browser clients, SIP bridges) hear it. LiveKit
is an SFU and never mixes tracks server-side, and many clients play only one
audio track, so a second track never reached them.

A single audio pump task is the sole writer to the source: TTS pass-through
when idle, and continuous 10ms mixing (saturating i16 sum) of the looping
clip under speech when active, with a short fade-out on stop.

- Resample client-supplied loading clips to the track format once at load
  time (rubato).
- Back-pressure the TTS producer instead of dropping buffered audio, so rapid
  multi-sentence speech is no longer truncated.
- Compute the source queue depth as a valid non-zero multiple of 10ms for any
  sample rate (fixes a crash at 44.1 kHz).
- Collapse the client to one source/track/pump; simplify reconnect/teardown.
- Update tests and docs to the single-track model.
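
The queue-depth rule from the commit message can be sketched as follows. The PR text does not show the actual computation, so this is an assumption-laden illustration: `queue_depth_ms` and `samples_per_10ms` are hypothetical helpers, and rounding up is an assumed policy. It only demonstrates producing a non-zero multiple of 10 ms and the per-frame sample count for an arbitrary rate.

```rust
/// Duration of one audio frame in milliseconds.
const FRAME_MS: u32 = 10;

/// Round a requested queue depth up to the nearest non-zero multiple of
/// 10 ms (hypothetical helper; rounding up is an assumed policy).
fn queue_depth_ms(requested_ms: u32) -> u32 {
    // Treat 0 as 1 so the result is never zero, then round up.
    let ms = requested_ms.max(1);
    (ms + FRAME_MS - 1) / FRAME_MS * FRAME_MS
}

/// Interleaved samples in one 10 ms frame for the given rate and channels.
fn samples_per_10ms(sample_rate: u32, channels: u32) -> u32 {
    sample_rate / 100 * channels
}
```

Under these assumptions a requested depth of 25 ms becomes 30 ms, a request of 0 becomes 10 ms, and a 44.1 kHz mono stream carries 441 samples per frame.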
@tigranbs force-pushed the loading-indicator branch from 8aecd4a to e8d6662 on May 30, 2026 00:48
@tigranbs merged commit 2be27ae into master on May 31, 2026
1 check passed
@tigranbs deleted the loading-indicator branch on May 31, 2026 02:00