RSDK-13979 Implement PlayStream #41

Merged
oliviamiller merged 13 commits into viam-modules:main from oliviamiller:play-stream
Jun 8, 2026

Conversation

@oliviamiller

Collaborator

Summary

Implements PlayStream for the Speaker component. Chunks pulled from the
client are decoded, channel-converted, resampled, and written into the playback
ring buffer as they arrive; the call blocks until the source signals
end-of-stream and the buffer drains.
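
For readers less familiar with the pipeline, here is a minimal sketch of the decode, channel-convert, and resample stages for PCM_16 input. The function names (decode_pcm16, mono_to_stereo, resample_linear) are hypothetical and not taken from this PR, the 16-bit sample format is an assumption, and linear interpolation only stands in for whatever resampler the module actually uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: decode little-endian PCM_16 bytes into int16 samples.
std::vector<int16_t> decode_pcm16(const std::vector<uint8_t>& bytes) {
    std::vector<int16_t> out(bytes.size() / 2);
    for (size_t i = 0; i < out.size(); ++i) {
        // Assemble the two bytes explicitly so host endianness does not matter.
        const uint16_t v = static_cast<uint16_t>(
            bytes[2 * i] | (static_cast<unsigned>(bytes[2 * i + 1]) << 8));
        std::memcpy(&out[i], &v, sizeof(v));
    }
    return out;
}

// Hypothetical helper: duplicate each mono sample into left and right channels.
std::vector<int16_t> mono_to_stereo(const std::vector<int16_t>& mono) {
    std::vector<int16_t> out;
    out.reserve(mono.size() * 2);
    for (int16_t s : mono) {
        out.push_back(s);
        out.push_back(s);
    }
    return out;
}

// Hypothetical helper: resample a mono signal by linear interpolation.
// This is an illustration only; it is not the resampler used by the module.
std::vector<int16_t> resample_linear(const std::vector<int16_t>& in,
                                     int src_rate, int dst_rate) {
    assert(src_rate > 0 && dst_rate > 0);
    if (in.empty()) {
        return {};
    }
    const size_t out_len = in.size() * static_cast<size_t>(dst_rate) /
                           static_cast<size_t>(src_rate);
    std::vector<int16_t> out(out_len);
    for (size_t i = 0; i < out_len; ++i) {
        // Position of output sample i, measured in input samples.
        const double pos = static_cast<double>(i) * src_rate / dst_rate;
        const size_t i0 = static_cast<size_t>(pos);
        const size_t i1 = std::min(i0 + 1, in.size() - 1);
        const double frac = pos - static_cast<double>(i0);
        out[i] = static_cast<int16_t>(
            std::lround(in[i0] * (1.0 - frac) + in[i1] * frac));
    }
    return out;
}
```

In this sketch a 22.05 kHz mono chunk would go through decode_pcm16, then resample_linear(samples, 22050, 44100), then mono_to_stereo before being written to the ring buffer. The real helper may order or combine these stages differently.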

  • Adds the play_stream(audio_info, chunk_source, extra) override.
  • Factors the shared decode/convert/resample/write pipeline out of play into
    a process_and_write_pcm helper, and the drain-wait loop into
    wait_for_playback (also moves the post-drain latency_ sleep inside,
    where stream_mu_ can guard the read).
  • Streams support PCM_16, PCM_32, PCM_32_FLOAT.
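
As a rough illustration of the drain-wait described above, the sketch below polls until playback catches up with the write position, then reads latency_ under stream_mu_ before sleeping. Only wait_for_playback, latency_, stream_mu_, and stop_requested_ are names from this PR; the DrainState struct, the two position counters, the 1 ms poll interval, and the bool return value are assumptions made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the speaker state involved in draining.
struct DrainState {
    std::mutex stream_mu_;
    std::chrono::milliseconds latency_{0};       // guarded by stream_mu_
    std::atomic<uint64_t> write_position{0};     // samples written so far
    std::atomic<uint64_t> playback_position{0};  // samples played so far
    std::atomic<bool> stop_requested_{false};

    // Returns true once the buffer has drained, false if a stop arrived first.
    bool wait_for_playback() {
        while (playback_position.load() < write_position.load()) {
            if (stop_requested_.load()) {
                return false;
            }
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
        // Copy latency_ while holding the lock, then sleep without holding it.
        std::chrono::milliseconds latency{0};
        {
            std::lock_guard<std::mutex> lock(stream_mu_);
            latency = latency_;
        }
        assert(latency.count() >= 0);
        std::this_thread::sleep_for(latency);
        return true;
    }
};
```

Copying the value out under the lock keeps the critical section short, so a concurrent stream reconfiguration is not blocked for the length of the sleep.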

Will do MP3 support in a subsequent PR.

Blocked on SDK release

Requires viam-cpp-sdk v0.37.0 (PR viamrobotics/viam-cpp-sdk#637). Once cut,
bump conanfile.py from 0.21.0 to 0.37.0; no other changes needed.

Notes for reviewers

  • Producers must feed at ≤ real-time. The ring buffer is non-blocking on
    write, so overruns silently drop the oldest samples. Adding server-side
    backpressure is a possible follow-up.
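
To make the overrun behavior concrete, here is a single-threaded sketch of a ring buffer whose write never blocks. It is not the module's buffer: the RingBuffer class, read_sample, and the way the reader skips ahead after being lapped are all assumptions for illustration. Only the name write_sample appears in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical ring buffer with a non-blocking write.
class RingBuffer {
public:
    explicit RingBuffer(size_t capacity) : buf_(capacity) {
        assert(capacity > 0);
    }

    // Never waits for the reader; once the buffer is full, the oldest
    // unread sample is overwritten.
    void write_sample(int16_t s) {
        buf_[write_pos_ % buf_.size()] = s;
        ++write_pos_;
    }

    // Reads the oldest surviving sample; returns false when nothing is unread.
    bool read_sample(int16_t& out) {
        // If the writer lapped the reader, the overwritten samples are gone,
        // so skip ahead to the oldest sample still in the buffer.
        if (write_pos_ - read_pos_ > buf_.size()) {
            read_pos_ = write_pos_ - buf_.size();
        }
        if (read_pos_ == write_pos_) {
            return false;
        }
        out = buf_[read_pos_ % buf_.size()];
        ++read_pos_;
        return true;
    }

private:
    std::vector<int16_t> buf_;
    uint64_t write_pos_ = 0;  // total samples ever written
    uint64_t read_pos_ = 0;   // total samples ever consumed or skipped
};
```

Writing six samples (0 through 5) into a buffer of capacity 4 and then reading yields 2, 3, 4, 5: the first two samples are lost without any error, which is the silent drop a faster-than-real-time producer would hit.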

Test plan

  • New PlayStream_* unit tests cover MP3 rejection, empty source,
    multi-chunk PCM_16, PCM_32, PCM_32_FLOAT, channel conversion, resampling,
    mid-stream stop, and stop-flag reset on entry.
  • Manual end-to-end against xarm: PCM16/22kHz-mono/PCM32/PCM32F all play,
    mid-stream stop + client context cancel exit cleanly, back-to-back
    PlayStream calls serialize via playback_mu_.
  • CI green after the SDK bump.

@oliviamiller oliviamiller requested a review from seanavery May 26, 2026 16:16
Comment thread src/speaker.cpp Outdated
@oliviamiller oliviamiller requested a review from seanavery May 27, 2026 18:40

@seanavery seanavery left a comment

> Producers must feed at ≤ real-time.

Trying to understand the significance of this. It seems like the use cases that motivate PlayStream (TTS replies, file playback, anything cached in memory) would produce faster than real-time.

Would it be easy to implement backpressure handling in process_and_write_pcm?

@oliviamiller

Collaborator Author

> Producers must feed at ≤ real-time.
>
> Trying to understand the significance of this. It seems like the use cases that motivate PlayStream (TTS replies, file playback, anything cached in memory) would produce faster than real-time.
>
> Would it be easy to implement backpressure handling in process_and_write_pcm?

Good point. I added a check: if the buffer is too full, no more samples are written until the playback callback drains it. Clients can now feed at any rate.
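
A minimal sketch of this kind of writer-side backpressure, assuming a single writer thread and a playback callback that advances playback_position. The PlaybackContext struct and the write_with_backpressure function are hypothetical stand-ins; the loop shape, the 10 ms sleep, max_ahead, and the stop check follow the snippets quoted in the review threads of this PR.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for the playback context.
struct PlaybackContext {
    std::vector<int16_t> buffer;
    std::atomic<uint64_t> write_position{0};
    std::atomic<uint64_t> playback_position{0};  // advanced by the audio callback

    explicit PlaybackContext(size_t capacity) : buffer(capacity) {
        assert(capacity > 0);
    }

    uint64_t get_write_position() const { return write_position.load(); }

    // Single-writer assumption: store the sample, then publish the new position.
    void write_sample(int16_t s) {
        const uint64_t w = write_position.load(std::memory_order_relaxed);
        buffer[w % buffer.size()] = s;
        write_position.store(w + 1, std::memory_order_release);
    }
};

// Writes samples, pausing whenever the writer is max_ahead or more samples
// ahead of playback. Always returns the number of samples actually written.
size_t write_with_backpressure(PlaybackContext& ctx, const int16_t* samples,
                               size_t num_samples, uint64_t max_ahead,
                               const std::atomic<bool>& stop_requested) {
    for (size_t i = 0; i < num_samples; ++i) {
        while (ctx.get_write_position() - ctx.playback_position.load() >= max_ahead) {
            if (stop_requested.load()) {
                return i;  // stopped while waiting for the buffer to drain
            }
            std::this_thread::sleep_for(std::chrono::milliseconds(10));
        }
        ctx.write_sample(samples[i]);
    }
    return num_samples;
}
```

Choosing max_ahead no larger than the buffer capacity keeps the writer from overwriting unplayed samples, so a producer that runs faster than real-time is simply slowed to the playback rate.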

@seanavery seanavery left a comment

Looking good! Thanks for adding the backpressure handling.

Would it be possible to add a test case to verify backpressure is working as intended?

Comment thread src/speaker.cpp
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
    playback_context->write_sample(samples[i]);

OK to call write_sample outside of the stream_mu_ lock?

Collaborator Author

yes, write_sample is atomic

Comment thread src/speaker.cpp
for (size_t i = 0; i < num_samples; ++i) {
    while (playback_context->get_write_position() - playback_context->playback_position.load() >= max_ahead) {
        if (stop_requested_.load()) {
            return i;

[possible nit] Does this complicate the assumption that 0 return means context swap? Maybe need to update comments

Collaborator Author

Yeah, the 0 = swap convention is kind of awkward, so I refactored it to always return the number of samples written.

@oliviamiller

Collaborator Author

> Looking good! Thanks for adding the backpressure handling.
>
> Would it be possible to add a test case to verify backpressure is working as intended?

Added a test.

@oliviamiller oliviamiller requested a review from seanavery June 2, 2026 21:39

@seanavery seanavery left a comment

LGTM, other than the one nit here

Comment thread src/speaker.cpp
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
    playback_context->write_sample(samples[i]);

[possible nit] Do we also need to check for stop at the top of the for loop here so we do not inadvertently write_sample here after stop?

Collaborator Author

It's fine if we write a sample after stop; we clear all samples shortly after.

@oliviamiller oliviamiller merged commit 3dc2973 into viam-modules:main Jun 8, 2026
5 checks passed