feat(ws): loading-indicator audio loop on a dedicated LiveKit track by tigranbs · Pull Request #18 · SaynaAI/sayna

tigranbs · 2026-05-21T19:45:10Z

Summary

Adds a loading-indicator audio capability to the WebSocket API — a short, client-supplied clip that loops into the LiveKit room while the calling application is busy ("thinking"). It is the audio equivalent of a spinner: the human participant hears "still working" instead of ambiguous silence.

The loading audio plays on its own dedicated, second published LiveKit track (loading-audio), fully independent of the speech track (tts-audio), so it never interferes with the STT/TTS pipeline.

What changed

Config — the config message gains an optional loading_audio object: base64-encoded WAV or raw 16-bit PCM, with optional format, sample_rate, channels, and volume (0.0–1.0). The clip is decoded and validated once per session; a decode failure is non-fatal (the session continues without the feature).
Commands — two new fire-and-forget WebSocket commands:
- loading_start — begins the seamless loop.
- loading_stop — stops the loop with a short linear fade-out.
Dedicated track — a second NativeAudioSource + LocalAudioTrack is published at the clip's own sample rate, so no resampling is needed. The loop captures frames directly on its own source and never touches the TTS operation queue, audio queue, or generation state.
Playback — seamless cursor-based looping with wrap-around; ~30 ms fade-out on stop; volume applied once at decode time.
Lifecycle — track publish on connect, re-publish on reconnect, and full teardown (cancel → await → abort backstop) on disconnect.
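
The playback mechanics above (volume applied once at decode time, cursor-based looping with wrap-around, linear fade-out on stop) can be sketched with the standard library alone. This is a minimal illustration, not the code in this PR: `apply_volume`, `LoopCursor`, `fade_len_samples`, and `fade_out` are hypothetical names, and the real loop captures frames on a LiveKit `NativeAudioSource` rather than filling plain slices.

```rust
/// Scale 16-bit PCM samples once, as would happen at decode time.
/// `volume` is clamped to the documented 0.0..=1.0 range.
fn apply_volume(samples: &[i16], volume: f32) -> Vec<i16> {
    let v = volume.clamp(0.0, 1.0);
    samples.iter().map(|&s| (s as f32 * v).round() as i16).collect()
}

/// Hypothetical loop cursor: an index into an immutable, pre-scaled clip.
struct LoopCursor {
    pos: usize,
}

impl LoopCursor {
    /// Fill `frame` from `clip`, wrapping to the start when the end is reached,
    /// so consecutive frames continue seamlessly across the loop point.
    fn fill_frame(&mut self, clip: &[i16], frame: &mut [i16]) {
        if clip.is_empty() {
            frame.fill(0);
            return;
        }
        for out in frame.iter_mut() {
            *out = clip[self.pos];
            self.pos = (self.pos + 1) % clip.len();
        }
    }
}

/// Number of interleaved samples covered by a fade of `fade_ms` milliseconds.
fn fade_len_samples(sample_rate: u32, channels: u16, fade_ms: u32) -> usize {
    (sample_rate as usize * fade_ms as usize / 1000) * channels as usize
}

/// Linear fade-out over interleaved samples: the gain falls per frame and
/// reaches exactly 0.0 on the last frame.
fn fade_out(samples: &mut [i16], channels: usize) {
    let channels = channels.max(1);
    let frames = samples.len() / channels;
    if frames == 0 {
        return;
    }
    for (i, s) in samples.iter_mut().enumerate() {
        let frame = (i / channels).min(frames - 1);
        let gain = 1.0 - (frame as f32 + 1.0) / frames as f32;
        *s = (*s as f32 * gain) as i16;
    }
}
```

Under these assumptions a 30 ms fade at 16 kHz mono spans 480 samples. Because the cursor keeps advancing while the fade is filled, the fade continues from wherever the loop currently is instead of jumping back to the start of the clip.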

The loop is controlled exclusively by loading_start / loading_stop. speak and clear are unaffected — applications send loading_stop before speak if they don't want overlap.
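
As a rough model of that contract, the sketch below tracks only whether the loop is active and shows that `speak` and `clear` leave it untouched. The `Command` enum, `Session` struct, and `reply_without_overlap` helper are hypothetical names for illustration; they are not the server's types, and the actual wire format is defined in docs/websocket.md.

```rust
/// Hypothetical, simplified view of the four commands discussed here.
#[derive(Debug, Clone, PartialEq)]
enum Command {
    LoadingStart,
    LoadingStop,
    Speak(String),
    Clear,
}

/// Minimal session model: loop state plus a stand-in TTS queue.
#[derive(Debug, Default, PartialEq)]
struct Session {
    loading_active: bool,
    tts_queue: Vec<String>,
}

impl Session {
    fn handle(&mut self, cmd: Command) {
        match cmd {
            // Only these two commands ever change the loop state,
            // and repeating either one is a harmless no-op.
            Command::LoadingStart => self.loading_active = true,
            Command::LoadingStop => self.loading_active = false,
            // Speech commands operate on the TTS side only.
            Command::Speak(text) => self.tts_queue.push(text),
            Command::Clear => self.tts_queue.clear(),
        }
    }
}

/// Command sequence an application would send to avoid overlap:
/// stop the loop first, then speak.
fn reply_without_overlap(text: &str) -> Vec<Command> {
    vec![Command::LoadingStop, Command::Speak(text.to_string())]
}
```

An application that wants the loop to keep playing under speech simply omits the `loading_stop`.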

Validation

Input is validated: base64, size cap, 16-bit-PCM only, sample-rate range, channel count, duration bounds, and frame alignment — each with a clear client-facing error.
No new third-party dependencies (hound, base64, tokio-util were already declared).
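
The raw-PCM part of the validation listed above could look roughly like the following sketch. The numeric limits, the `ClipError` variants, and the `validate_pcm16` name are assumptions made for illustration; the PR does not state the actual bounds here, and base64 and WAV handling (done with the `base64` and `hound` crates in the project) are left out to keep the example dependency-free.

```rust
/// Hypothetical error type; each variant maps to one client-facing message.
#[derive(Debug, PartialEq)]
enum ClipError {
    UnsupportedSampleRate(u32),
    UnsupportedChannels(u16),
    Misaligned { len: usize, frame_bytes: usize },
    TooShort,
    TooLong,
}

// Assumed limits, chosen only for this example.
const MIN_SAMPLE_RATE: u32 = 8_000;
const MAX_SAMPLE_RATE: u32 = 48_000;
const MAX_CHANNELS: u16 = 2;
const MIN_DURATION_MS: u64 = 100;
const MAX_DURATION_MS: u64 = 10_000;

/// Validate raw little-endian 16-bit PCM and return the decoded samples.
fn validate_pcm16(bytes: &[u8], sample_rate: u32, channels: u16) -> Result<Vec<i16>, ClipError> {
    if !(MIN_SAMPLE_RATE..=MAX_SAMPLE_RATE).contains(&sample_rate) {
        return Err(ClipError::UnsupportedSampleRate(sample_rate));
    }
    if channels == 0 || channels > MAX_CHANNELS {
        return Err(ClipError::UnsupportedChannels(channels));
    }
    // Frame alignment: one frame is one 2-byte sample per channel.
    let frame_bytes = 2 * channels as usize;
    if bytes.len() % frame_bytes != 0 {
        return Err(ClipError::Misaligned { len: bytes.len(), frame_bytes });
    }
    // Duration bounds, computed from the frame count.
    let frames = (bytes.len() / frame_bytes) as u64;
    let duration_ms = frames * 1000 / sample_rate as u64;
    if duration_ms < MIN_DURATION_MS {
        return Err(ClipError::TooShort);
    }
    if duration_ms > MAX_DURATION_MS {
        return Err(ClipError::TooLong);
    }
    Ok(bytes
        .chunks_exact(2)
        .map(|b| i16::from_le_bytes([b[0], b[1]]))
        .collect())
}
```

Returning a distinct variant per check is one way to give each failure its own clear error, and the caller can treat any `Err` as non-fatal by continuing the session without the feature.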

Testing

Unit tests for decode/validation, loop mechanics (cursor wrap, fade-out), volume scaling, and message (de)serialization.
LiveKit client tests for track setup, start/stop idempotency, disconnect teardown, rapid start/stop stress, and drop-backstop cancellation.
All CI checks pass locally: cargo fmt --all -- --check, cargo clippy --all-targets --all-features -- -D warnings, cargo build --locked --all-features, cargo test --locked --all-features (105 tests passing).

Docs

docs/websocket.md, docs/livekit_integration.md, docs/api-reference.md, and the generated docs/openapi.yaml are updated; CLAUDE.md notes the new commands.

Add a loading-indicator audio capability to the WebSocket API: a short client-supplied clip looped into the LiveKit room while the calling application is busy, so the human participant hears "still working" instead of silence.

- config: new optional `loading_audio` object (base64 WAV or raw PCM with format/sample_rate/channels/volume); decoded and validated once per session, non-fatal on failure.
- new `loading_start` / `loading_stop` WebSocket commands, fire-and-forget.
- publish a second, dedicated `loading-audio` LiveKit track, independent of the `tts-audio` speech track; the loading loop never touches the TTS operation queue, audio queue, or generation state.
- seamless cursor-based looping with a short linear fade-out on stop.
- lifecycle handling for connect, reconnect, and disconnect teardown.

The loop is controlled exclusively by `loading_start` / `loading_stop`; `speak` and `clear` are unaffected. No new dependencies.

Follow-up hardening for the loading-indicator audio feature:

- Report the original loading_audio decode failure again on a later loading_start, instead of a generic "not available" message; the reason is retained on the connection state.
- Give start_loading_audio distinct errors for a missing clip versus a loading track that failed to publish.
- Close the reconnect/loading-loop race: tear down the loop and clear the dead loading source together under the loading_loop lock, so a loading_start racing a reconnect can never bind to a stale source.
- Allow WAV container overhead in the decoded-payload size guard so a maximum-duration WAV clip is not rejected for its header bytes.
- Run the libwebrtc-native loading-audio tests in an isolated, single-threaded CI step; they intermittently segfault under the full unit-test binary's thread concurrency.
- Deduplicate the shared loading-clip test helper and add a handler-level test for the missing-clip error path.
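
The race fix and the distinct start errors described in this commit can be modelled with a single mutex that guards both the loop and the source it is bound to. The sketch below uses `std::sync::Mutex` and stand-in types (`Source`, `LoadingState`, `Client`, `StartError`, and the `on_*` method names are all hypothetical); the real client is asynchronous and works with LiveKit audio sources, so treat this only as an illustration of the locking discipline.

```rust
use std::sync::Mutex;

/// Hypothetical distinct errors: no usable clip vs. no published loading track.
#[derive(Debug, PartialEq)]
enum StartError {
    NoClip,
    TrackUnavailable,
}

/// Stand-in for a published audio source; `id` distinguishes connections.
#[derive(Debug, Clone, PartialEq)]
struct Source {
    id: u32,
}

/// Everything a racing `loading_start` could observe lives behind one lock.
#[derive(Default)]
struct LoadingState {
    source: Option<Source>,
    /// The source the running loop is bound to, if a loop is active.
    active_loop: Option<Source>,
}

struct Client {
    clip: Option<Vec<i16>>,
    loading: Mutex<LoadingState>,
}

impl Client {
    fn start_loading_audio(&self) -> Result<(), StartError> {
        if self.clip.is_none() {
            return Err(StartError::NoClip);
        }
        let mut st = self.loading.lock().unwrap();
        let source = st.source.clone().ok_or(StartError::TrackUnavailable)?;
        // Idempotent: a second start while a loop is running changes nothing.
        if st.active_loop.is_none() {
            st.active_loop = Some(source);
        }
        Ok(())
    }

    /// Tear down the loop and drop the dead source in ONE critical section,
    /// so no start can slip in between and bind to the stale source.
    fn on_reconnect_begin(&self) {
        let mut st = self.loading.lock().unwrap();
        st.active_loop = None;
        st.source = None;
    }

    fn on_track_published(&self, source: Source) {
        self.loading.lock().unwrap().source = Some(source);
    }
}
```

If the loop and the source were cleared under separate lock acquisitions, a `loading_start` arriving between the two steps could still see the old source; holding one lock across both removes that window.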

Clear loading_loop when the playback task exits; fix stop cursor continuity and stale decode errors; use BadRequest when LiveKit is not connected; honor cancel during fade-out; add tests and docs; ignore local loading-indicator.md planning file.

Five sites constructed the same `AudioSourceOptions { echo_cancellation: false, noise_suppression: false, auto_gain_control: false }` literal: two TTS sites (`setup_audio_publishing`, `process_reconnect`), the loading-audio track publisher, and two test-only injection sites. `auto_gain_control: false` is load-bearing for the loading-audio `volume` feature (AGC would re-normalise loudness and silently undo the configured attenuation), and that rationale was only spelled out at one of the five sites.

Introduce `sayna_audio_source_options()` in `client/mod.rs` whose docstring owns the rationale for all three flags. Replace every literal with a call to the helper. The helper is re-exported `pub(crate)` under `#[cfg(test)]` so the cross-module test in `handlers/ws/loading_handler.rs` can reach it without exposing `client/` internals at runtime.

Also simplify two `loading_stop` no-op tests in `loading_handler.rs` whose `(message_tx, mut message_rx) + drop(message_tx)` idiom only obscured the intent: the handler takes no sender, so a discarded `_tx` is clearer.

The previously-drafted lower-level race tests against `run_loading_loop` were dropped because the existing `livekit_native_rapid_start_stop_is_clean`, `livekit_native_start_idempotent_and_disconnect_teardown`, `livekit_native_loading_loop_cleared_after_stop`, and `livekit_native_drop_cancels_active_loop` already exercise the same races through the real `LiveKitClient` public API.
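
The shape of the helper described in this commit is roughly the following. To keep the sketch self-contained, `AudioSourceOptions` is declared here as a local stand-in with the three fields named above; in the project the type comes from the LiveKit Rust SDK, and the helper's real docstring and visibility may differ.

```rust
/// Local stand-in for the LiveKit SDK's `AudioSourceOptions`
/// (only the three fields named in the commit message).
#[derive(Debug, Clone, PartialEq, Default)]
struct AudioSourceOptions {
    echo_cancellation: bool,
    noise_suppression: bool,
    auto_gain_control: bool,
}

/// Single source of truth for the options used by every published track.
///
/// `auto_gain_control` must stay off: AGC would re-normalise loudness and
/// silently undo the `volume` attenuation configured for the loading clip.
/// The commit message does not state the rationale for the other two flags,
/// so none is given here.
fn sayna_audio_source_options() -> AudioSourceOptions {
    AudioSourceOptions {
        echo_cancellation: false,
        noise_suppression: false,
        auto_gain_control: false,
    }
}
```

Centralising the literal means a future change to any flag happens in one place, next to the comment that explains why it matters.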

…ource test_handle_loading_start_message_success_is_silent constructed a real libwebrtc NativeAudioSource and spawned the loading-audio loop, but ran in the default multi-threaded cargo test pass. The project quarantines such tests behind a livekit_native_ prefix and an #[ignore] attribute because libwebrtc's lazily-initialised global runtime intermittently segfaults under thread concurrency (see src/livekit/client/tests.rs:499-506 and .github/workflows/ci.yml). This test was an outlier. Rename it with the livekit_native_ prefix and add the same #[ignore] attribute as its peers, so the dedicated single-threaded CI step picks it up alongside the existing six native tests.

* adding loading-indicator audio support

  Adds loading_audio config plus loading_start / loading_stop fire-and-forget WebSocket commands to node-sdk and python-sdk, mirroring the server addition in SaynaAI/sayna#18. Failures continue to surface through the existing registerOnError / register_on_error callbacks; no new error types are added. See ../sayna/docs/websocket.md#loading-indicator for the protocol contract.

* removing misleading no-cover pragma from loading_start / loading_stop

  The wrapped-exception branches are exercised by test_loading_start_wraps_send_failure and test_loading_stop_wraps_send_failure in python-sdk/tests/test_client.py. The pragma was copy-pasted from sip_transfer (which has no equivalent test) and incorrectly masked covered code from coverage tracking.

tigranbs added 2 commits May 21, 2026 12:44

tigranbs force-pushed the loading-indicator branch from 442e748 to 469d91d Compare May 22, 2026 06:33

tigranbs force-pushed the loading-indicator branch from 7c717b8 to 6710d1e Compare May 25, 2026 05:39

tigranbs added 2 commits May 24, 2026 23:14

tigranbs mentioned this pull request May 25, 2026

adding loading-indicator audio support SaynaAI/saysdk#2

Merged

tigranbs merged commit be537ea into master May 25, 2026
1 check passed

tigranbs deleted the loading-indicator branch May 25, 2026 22:50


tigranbs merged 5 commits into master from loading-indicator
