feat(ws): loading-indicator audio loop on a dedicated LiveKit track #18
Merged
Add a loading-indicator audio capability to the WebSocket API: a short client-supplied clip looped into the LiveKit room while the calling application is busy, so the human participant hears "still working" instead of silence.

- config: new optional `loading_audio` object (base64 WAV or raw PCM with format/sample_rate/channels/volume); decoded and validated once per session, non-fatal on failure.
- new `loading_start` / `loading_stop` WebSocket commands, fire-and-forget.
- publish a second, dedicated `loading-audio` LiveKit track, independent of the `tts-audio` speech track; the loading loop never touches the TTS operation queue, audio queue, or generation state.
- seamless cursor-based looping with a short linear fade-out on stop.
- lifecycle handling for connect, reconnect, and disconnect teardown.

The loop is controlled exclusively by `loading_start` / `loading_stop`; `speak` and `clear` are unaffected. No new dependencies.
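The cursor-based looping and linear fade-out listed above could be sketched roughly as follows. This is a minimal illustration only, not the code from this PR: `LoopCursor`, `fill`, and `fade_out` are hypothetical names, the clip is assumed to be already-decoded 16-bit PCM, and the fade treats the buffer as mono (for interleaved stereo the gain would be computed once per sample frame).

```rust
/// Hypothetical loop state: a read position into the decoded clip.
#[derive(Default)]
pub struct LoopCursor {
    pos: usize,
}

impl LoopCursor {
    /// Fill `out` with the next samples of `clip`, wrapping at the end so
    /// consecutive frames join without a gap. An empty clip yields silence.
    pub fn fill(&mut self, clip: &[i16], out: &mut [i16]) {
        if clip.is_empty() {
            out.fill(0);
            return;
        }
        for sample in out.iter_mut() {
            *sample = clip[self.pos];
            self.pos = (self.pos + 1) % clip.len();
        }
    }
}

/// Apply a linear fade-out to `frame`. `done` is how many samples of the
/// fade were already emitted and `total` is the fade length in samples.
/// Returns the updated `done` count; once it reaches `total` the output
/// is silent.
pub fn fade_out(frame: &mut [i16], done: usize, total: usize) -> usize {
    let mut done = done;
    for sample in frame.iter_mut() {
        let remaining = total.saturating_sub(done);
        let gain = if total == 0 {
            0.0
        } else {
            remaining as f32 / total as f32
        };
        *sample = (*sample as f32 * gain) as i16;
        done = (done + 1).min(total);
    }
    done
}
```

Because the cursor persists across calls, a stop request can keep pulling frames from the current position and pass them through `fade_out`, which avoids a discontinuity at the moment the stop arrives.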
Follow-up hardening for the loading-indicator audio feature:

- Report the original `loading_audio` decode failure again on a later `loading_start`, instead of a generic "not available" message; the reason is retained on the connection state.
- Give `start_loading_audio` distinct errors for a missing clip versus a loading track that failed to publish.
- Close the reconnect/loading-loop race: tear down the loop and clear the dead loading source together under the `loading_loop` lock, so a `loading_start` racing a reconnect can never bind to a stale source.
- Allow WAV container overhead in the decoded-payload size guard so a maximum-duration WAV clip is not rejected for its header bytes.
- Run the libwebrtc-native loading-audio tests in an isolated, single-threaded CI step; they intermittently segfault under the full unit-test binary's thread concurrency.
- Deduplicate the shared loading-clip test helper and add a handler-level test for the missing-clip error path.
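The reconnect race fix can be illustrated with a small sketch in which the loop's cancel flag and the loading source sit behind one lock, so a start sees either the old pair or neither. The real code is async and uses LiveKit types; everything below (`Source`, `LoadingState`, `Loading`, the error string) is a hypothetical stand-in built on `std::sync` only.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the published loading-audio source.
pub struct Source {
    pub id: u32,
}

/// Everything a racing `loading_start` could observe lives behind one lock.
#[derive(Default)]
pub struct LoadingState {
    source: Option<Arc<Source>>,
    cancel: Option<Arc<AtomicBool>>, // cancel flag of the running loop, if any
}

#[derive(Default, Clone)]
pub struct Loading {
    inner: Arc<Mutex<LoadingState>>,
}

impl Loading {
    /// Record a freshly published source (after connect or reconnect).
    pub fn publish(&self, source: Source) {
        self.inner.lock().unwrap().source = Some(Arc::new(source));
    }

    /// Bind to whatever source is current while holding the lock.
    /// Returns the id of the source the loop was bound to.
    pub fn start(&self) -> Result<u32, &'static str> {
        let mut st = self.inner.lock().unwrap();
        let source = st.source.clone().ok_or("loading track is not published")?;
        if st.cancel.is_none() {
            st.cancel = Some(Arc::new(AtomicBool::new(false)));
        }
        Ok(source.id)
    }

    /// Reconnect teardown: cancel the loop and clear the dead source in a
    /// single critical section, so no start can run between the two steps.
    pub fn teardown_for_reconnect(&self) {
        let mut st = self.inner.lock().unwrap();
        if let Some(flag) = st.cancel.take() {
            flag.store(true, Ordering::SeqCst);
        }
        st.source = None;
    }
}
```

If the two teardown steps took the lock separately, a `start` could slip in between them and hold an `Arc` to the source that is about to be discarded; doing both under one guard removes that window.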
Clear loading_loop when the playback task exits; fix stop cursor continuity and stale decode errors; use BadRequest when LiveKit is not connected; honor cancel during fade-out; add tests and docs; ignore local loading-indicator.md planning file.
Five sites constructed the same `AudioSourceOptions { echo_cancellation:
false, noise_suppression: false, auto_gain_control: false }` literal — two
TTS sites (`setup_audio_publishing`, `process_reconnect`), the loading-audio
track publisher, and two test-only injection sites. `auto_gain_control: false`
is load-bearing for the loading-audio `volume` feature (AGC would re-normalise
loudness and silently undo the configured attenuation), and that rationale was
only spelled out at one of the five sites.
Introduce `sayna_audio_source_options()` in `client/mod.rs` whose docstring
owns the rationale for all three flags. Replace every literal with a call to
the helper. The helper is re-exported `pub(crate)` under `#[cfg(test)]` so
the cross-module test in `handlers/ws/loading_handler.rs` can reach it
without exposing `client/` internals at runtime.
Also simplify two `loading_stop` no-op tests in `loading_handler.rs` whose
`(message_tx, mut message_rx) + drop(message_tx)` idiom only obscured the
intent — the handler takes no sender, so a discarded `_tx` is clearer.
The previously-drafted lower-level race tests against `run_loading_loop`
were dropped because the existing `livekit_native_rapid_start_stop_is_clean`,
`livekit_native_start_idempotent_and_disconnect_teardown`,
`livekit_native_loading_loop_cleared_after_stop`, and
`livekit_native_drop_cancels_active_loop` already exercise the same races
through the real `LiveKitClient` public API.
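As a rough sketch of the helper described in this commit, the three-flag literal can be moved into one function whose doc comment carries the rationale. `AudioSourceOptions` below is a local stand-in that only mirrors the three fields named above (the real type comes from the LiveKit Rust SDK), and the doc comment reproduces only the `auto_gain_control` reasoning stated in the commit message.

```rust
/// Local stand-in for the SDK type; only the three flags named in the
/// commit message are modelled here.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AudioSourceOptions {
    pub echo_cancellation: bool,
    pub noise_suppression: bool,
    pub auto_gain_control: bool,
}

/// Single place that builds the audio source options for every track.
///
/// `auto_gain_control` must stay `false`: AGC would re-normalise loudness
/// and silently undo the attenuation configured through the loading-audio
/// `volume` setting. The rationale for the other two flags is kept in the
/// real docstring and is not reproduced in this sketch.
pub fn sayna_audio_source_options() -> AudioSourceOptions {
    AudioSourceOptions {
        echo_cancellation: false,
        noise_suppression: false,
        auto_gain_control: false,
    }
}
```

Call sites then read `sayna_audio_source_options()` instead of repeating the literal, so a future change to any flag happens in one place.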
…ource

`test_handle_loading_start_message_success_is_silent` constructed a real libwebrtc `NativeAudioSource` and spawned the loading-audio loop, but ran in the default multi-threaded cargo test pass. The project quarantines such tests behind a `livekit_native_` prefix and an `#[ignore]` attribute because libwebrtc's lazily-initialised global runtime intermittently segfaults under thread concurrency (see src/livekit/client/tests.rs:499-506 and .github/workflows/ci.yml). This test was an outlier.

Rename it with the `livekit_native_` prefix and add the same `#[ignore]` attribute as its peers, so the dedicated single-threaded CI step picks it up alongside the existing six native tests.
tigranbs added a commit to SaynaAI/saysdk that referenced this pull request on May 25, 2026
* adding loading-indicator audio support

  Adds `loading_audio` config plus `loading_start` / `loading_stop` fire-and-forget WebSocket commands to node-sdk and python-sdk, mirroring the server addition in SaynaAI/sayna#18. Failures continue to surface through the existing `registerOnError` / `register_on_error` callbacks; no new error types are added. See ../sayna/docs/websocket.md#loading-indicator for the protocol contract.

* removing misleading no-cover pragma from loading_start / loading_stop

  The wrapped-exception branches are exercised by `test_loading_start_wraps_send_failure` and `test_loading_stop_wraps_send_failure` in python-sdk/tests/test_client.py. The pragma was copy-pasted from `sip_transfer` (which has no equivalent test) and incorrectly masked covered code from coverage tracking.
Summary
Adds a loading-indicator audio capability to the WebSocket API — a short, client-supplied clip that loops into the LiveKit room while the calling application is busy ("thinking"). It is the audio equivalent of a spinner: the human participant hears "still working" instead of ambiguous silence.
The loading audio plays on its own dedicated, second published LiveKit track (`loading-audio`), fully independent of the speech track (`tts-audio`), so it never interferes with the STT/TTS pipeline.

What changed

- The `config` message gains an optional `loading_audio` object: base64-encoded WAV or raw 16-bit PCM, with optional `format`, `sample_rate`, `channels`, and `volume` (0.0–1.0). The clip is decoded and validated once per session; a decode failure is non-fatal (the session continues without the feature).
- `loading_start`: begins the seamless loop.
- `loading_stop`: stops the loop with a short linear fade-out.
- A dedicated `NativeAudioSource` + `LocalAudioTrack` is published at the clip's own sample rate, so no resampling is needed. The loop captures frames directly on its own source and never touches the TTS operation queue, audio queue, or generation state.

The loop is controlled exclusively by `loading_start` / `loading_stop`. `speak` and `clear` are unaffected; applications send `loading_stop` before `speak` if they don't want overlap.

Validation

No new dependencies (`hound`, `base64`, `tokio-util` were already declared).

Testing

`cargo fmt --all -- --check`, `cargo clippy --all-targets --all-features -- -D warnings`, `cargo build --locked --all-features`, `cargo test --locked --all-features` (105 tests passing).

Docs

`docs/websocket.md`, `docs/livekit_integration.md`, `docs/api-reference.md`, and the generated `docs/openapi.yaml` are updated; `CLAUDE.md` notes the new commands.
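To make the "decoded and validated once per session" step for raw PCM more concrete, here is a hedged sketch of what such a check might look like. It is not the PR's decoder: `ClipError`, `decode_pcm16`, the little-endian byte order, and the choice to pre-scale samples by `volume` at decode time are all assumptions made for illustration, and the base64 and WAV paths are left out.

```rust
/// Hypothetical validation failures for a raw PCM clip.
#[derive(Debug, PartialEq)]
pub enum ClipError {
    Empty,
    OddByteLength,
    PartialFrame,
    BadChannels,
    BadVolume,
}

/// Decode 16-bit PCM bytes (assumed little-endian), validate the layout,
/// and apply `volume` once so the playback loop can copy samples as-is.
pub fn decode_pcm16(bytes: &[u8], channels: u16, volume: f32) -> Result<Vec<i16>, ClipError> {
    if channels == 0 {
        return Err(ClipError::BadChannels);
    }
    // The range check also rejects NaN, which is never contained in a range.
    if !(0.0..=1.0).contains(&volume) {
        return Err(ClipError::BadVolume);
    }
    if bytes.is_empty() {
        return Err(ClipError::Empty);
    }
    if bytes.len() % 2 != 0 {
        return Err(ClipError::OddByteLength);
    }
    // Every sample frame must contain one sample per channel.
    if (bytes.len() / 2) % channels as usize != 0 {
        return Err(ClipError::PartialFrame);
    }
    Ok(bytes
        .chunks_exact(2)
        .map(|b| {
            let sample = i16::from_le_bytes([b[0], b[1]]);
            (sample as f32 * volume) as i16
        })
        .collect())
}
```

A caller that wants the non-fatal behaviour described above would keep the `Err` value on the session state and continue without the clip, which also lets a later `loading_start` report the original reason.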