#53 streaming playback PR2: cpal audio + MJPEG transport (off-by-default) #170
Merged
added 13 commits
June 29, 2026 00:49
PR2 (part 1) of the Rust streaming playback engine. Wires PR1's headless render engine to the WebView over a loopback MJPEG stream + Tauri commands, behind the off-by-default `playback-engine` feature.

- transport.rs: loopback axum MJPEG server (127.0.0.1:0, multipart/x-mixed-replace, broadcast(2) drop-when-full) fed by the render thread via MjpegSink (FrameSink), guarded by an Origin/loopback check; TauriPlayheadEmitter emits a `playback_frame` event per rendered frame so the front end can move its playhead while the pixels arrive over the stream.
- commands.rs: PlaybackState + playback_start/pause/stop/seek + get_preview_endpoint. start snapshots the session, spawns the engine, and keeps it (plus the audio handle) so the other commands can drive/stop it.
- audio.rs: build_clock decides the master clock: wall-clock InstantClock now; the cpal audio master clock (#63) is the documented follow-up. AudioPlayback owns the audio-device lifetime so wiring it later needs no refactor.
- lib.rs: feature-gated setup (start PreviewServer on the Tauri async runtime, manage PlaybackState) + the 5 commands registered with per-entry #[cfg].
- Cargo.toml: axum/tokio/bytes/futures (workspace-resolved versions) + the image `jpeg` feature, all under `playback-engine`; no new third-party crates.

Video now plays through the Rust streaming compositor (multi-track / ProRes / effects visible during PLAY), wall-clock timed. Remaining for PR2: the cpal audio master clock for A/V sync (#63). Front-end PLAY-switch + Lottie (#65) land in PR3. CSP stays null (already permits the loopback <img>); CSP hardening is a follow-up needing real-machine verification.

23 unit tests pass (drain/clock/projection + origin-guard/jpeg/render-size); `clippy -D warnings` clean with and without the feature.
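The Origin/loopback check mentioned for transport.rs could look roughly like the sketch below. The function name `origin_allowed`, the allow-list of hosts, and the choice to accept a request with no Origin header are all assumptions made for illustration; the commit does not spell out the real rules, and a Tauri WebView origin may need its own entry.

```rust
/// Hypothetical Origin guard (not the code from transport.rs).
/// Returns true when a request carrying this Origin header may be served.
fn origin_allowed(origin: Option<&str>) -> bool {
    // Assumption: a request without an Origin header is treated as allowed.
    let Some(origin) = origin else {
        return true;
    };
    // Split off the scheme ("http://", "https://", ...); reject anything
    // that is not scheme://authority, such as the literal "null".
    let Some((_, rest)) = origin.split_once("://") else {
        return false;
    };
    // Keep only the authority part (host plus optional port).
    let authority = rest.split('/').next().unwrap_or("");
    // Strip the port, keeping a bracketed IPv6 literal intact.
    let host = if let Some(end) = authority.find(']') {
        &authority[..=end]
    } else {
        authority.split(':').next().unwrap_or("")
    };
    // Assumption: only exact loopback hosts pass; suffix tricks such as
    // "localhost.evil.example" fail because the match is exact.
    matches!(host, "127.0.0.1" | "localhost" | "[::1]")
}
```

A handler would call this with the request's Origin header value and answer 403 when it returns false.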
Completes the PR2 backend: a timeline with sound now drives playback from the audio device, so video follows audio (the #53 A/V-sync acceptance).

- audio.rs: pre-mix the whole timeline to one mono buffer at the cpal device sample rate, reusing the proven export mixdown (extract_pcm + mix_clips, parameterised by rate); play it on a dedicated cpal output thread whose callback copies buffer[pos..] to every output channel and advances an AtomicU64 master clock (lock-free; no decode in the real-time callback). AudioClock exposes pos as the PlaybackClock; a silent timeline (or no/failed audio device) falls back to the wall-clock InstantClock so video still plays.
- build_clock now returns the real AudioClock + a live AudioPlayback (its Drop stops the cpal stream + joins the thread); the command layer already owns it.
- Cargo.toml: cpal 0.15 under the `playback-engine` feature (CoreAudio on macOS, ALSA on Linux).
- ci.yml: install libasound2-dev for the playback-engine job so cpal's ALSA backend builds on the Linux runner.

Mono preview audio + an up-front mix decode are intentional for this cut (stereo / chunked-streaming / background decode are follow-ups).

28 unit tests pass (adds audio clock + mix-window math); clippy -D warnings clean with and without the feature. Real-machine audio + A/V-sync confirmation (tauri build + listen) is the remaining acceptance step.
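The callback described above (copy the pre-mixed mono buffer to every output channel, then advance an atomic position) can be sketched without cpal as a plain function over the output slice. `fill_output` and its parameters are hypothetical names, and writing silence while still advancing the position past the end of the mix is an assumption, not something the commit states.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical body of a real-time output callback.
/// `out` is the interleaved device buffer, `mix` the pre-mixed mono timeline,
/// `pos` the shared master-clock position counted in audio frames.
fn fill_output(out: &mut [f32], channels: usize, mix: &[f32], pos: &AtomicU64) {
    if channels == 0 {
        return;
    }
    let start = pos.load(Ordering::Acquire) as usize;
    let frames = out.len() / channels;
    for (i, frame) in out.chunks_exact_mut(channels).enumerate() {
        // Assumption: past the end of the mix the callback emits silence.
        let sample = mix.get(start + i).copied().unwrap_or(0.0);
        // Same mono sample on every output channel.
        frame.fill(sample);
    }
    // No locks, allocation, or decoding here: only a copy and one atomic add.
    pos.fetch_add(frames as u64, Ordering::AcqRel);
}
```

With cpal, a function like this would be called from the closure passed to the output stream, and a clock type would read `pos` from the render thread.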
PR3 of the streaming playback engine: PLAY routes to the Rust engine (continuous decode → wgpu composite → MJPEG + cpal master clock) when enabled, while scrub/pause stay on the legacy <video> path; the pause-freeze (74c4c82) and resume-without-force-seek (5fa3f6f) fixes are left untouched.

- previewEngine.ts: a guarded branch at the top of the play/scrub effect: rustEngineEnabled() && isTauri && isPlaying && !isScrubbing → pauseAll() the <video> followers, playbackStart(activeFrame), and drive the playhead from `playback_frame` events (stop at the last frame). Its cleanup stops the engine AND seeks the <video> followers to the final frame, so the OTHER effect's pause-snap freezes on the correct frame, not a stale one. Everything else is the legacy path, unchanged.
- Preview.tsx: TimelineRustOverlay paints the loopback MJPEG <img> over <TimelinePlayback> during Rust PLAY (fills the aspect-fit canvas); it unmounts on pause/scrub so the legacy surface returns.
- api.ts: playbackStart/pause/stop/seek, getPreviewEndpoint, onPlaybackFrame, mirroring the existing invoke/listen patterns; no-ops outside Tauri.
- rustEngine.ts: rustEngineEnabled() runtime flag (localStorage 'opentake.rustEngine'='1'), OFF by default; flip in devtools to A/B the two paths on a real machine; default-on is a follow-up after confirmation.

tsc + vite build pass; 174 vitest tests pass (incl. the previewEngine pause/resume regression tests). The Rust PLAY path runs only with the flag on under Tauri, so its real-machine acceptance (smooth multi-track / ProRes playback + A/V sync) is the remaining verification step.

Lottie (#65) is filed separately: opentake-motion has no Lottie rasterizer, so baking is a standalone feature, not a small addition.
…ustness (#53)

Adversarial-review fixes across PR2/PR3 of the streaming playback engine.

Rust:
- commands.rs: playback_start is now async and decodes/mixes audio off the IPC thread via spawn_blocking, so starting playback on a long project no longer freezes the UI. install() stops the previous session AFTER releasing the lock, so a slow render-thread join can't block the other playback commands.
- audio.rs: cover every cpal sample format (I8/I16/I32/I64/U8/U16/U32/U64/F32/F64); a non-F32 default device (common on Linux/Windows) now gets audio instead of silently falling back to the wall clock. AudioClock::seek uses the same float path as frame() so a seek round-trips exactly at non-divisible rate/fps.
- transport.rs: the MJPEG relay uses a BOUNDED channel and packs header+body into one atomic multipart part, so a slow client drops frames instead of growing memory without bound, and a dropped half can't corrupt the stream.

Front end:
- previewEngine.ts: the pause-snap no longer depends on React effect ordering: in Rust-engine mode it trusts the authoritative activeFrame instead of reading a stale <video> currentTime, removing the 74c4c82 regression risk. The playback_frame handler bails if disposed and re-reads the timeline end so a mid-play edit can't stop early; playbackStart/Stop errors are logged, not swallowed (the project's "no silent IPC failure" rule).
- Preview.tsx: guard the MJPEG endpoint against a non-string value.

69 Rust unit tests + 174 vitest tests pass; clippy -D warnings clean with and without the feature; web build green.
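The transport.rs fix (one atomic multipart part, sent over a bounded channel that drops when full) can be illustrated with the standard library alone. `pack_part` and `offer` are hypothetical names, and `std::sync::mpsc::sync_channel` stands in for whatever bounded channel the relay really uses; the part layout follows the usual multipart/x-mixed-replace framing.

```rust
use std::sync::mpsc::{SyncSender, TrySendError};

/// Builds one complete multipart part (boundary line, headers, blank line,
/// JPEG bytes, trailing CRLF) as a single buffer, so a frame is either sent
/// whole or not at all.
fn pack_part(boundary: &str, jpeg: &[u8]) -> Vec<u8> {
    let header = format!(
        "--{}\r\nContent-Type: image/jpeg\r\nContent-Length: {}\r\n\r\n",
        boundary,
        jpeg.len()
    );
    let mut part = Vec::with_capacity(header.len() + jpeg.len() + 2);
    part.extend_from_slice(header.as_bytes());
    part.extend_from_slice(jpeg);
    part.extend_from_slice(b"\r\n");
    part
}

/// Offers a part to a bounded channel without blocking.
/// Returns false when the frame was dropped (channel full or client gone).
fn offer(tx: &SyncSender<Vec<u8>>, part: Vec<u8>) -> bool {
    match tx.try_send(part) {
        Ok(()) => true,
        Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => false,
    }
}
```

Because header and body travel as one message, a dropped message removes a whole frame; it cannot leave a header without its body in the stream.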
…#162)

During Rust streaming PLAY, a keyboard step or transport-bar click jumped activeFrame away from the engine's per-frame updates, but the switch effect (deps [isPlaying, isScrubbing]) didn't react, so the seek was ignored. Add a dedicated watcher effect that distinguishes an external jump from the engine's own per-frame advance via the pure isExternalSeekWhilePlaying helper, and forwards it with playback_seek. Scrub still relinquishes to the legacy <video> path as before.

178 vitest tests pass (adds 4 for the detection helper); web build green.
Replace the mono preview mixdown with interleaved stereo, so video preview plays the project's left/right channels instead of a mono fold.

- opentake-media/decode/audio_stream.rs (new): decode_pcm_interleaved decodes a clip's audio window to interleaved f32 at a target rate WITHOUT folding to mono (deliberately not reusing pcm.rs::raw_to_mono_f32). 5 unit tests on the ffmpeg args + the byte→f32 reinterpret.
- src-tauri/playback/audio.rs: pre-mix the timeline to one interleaved stereo buffer (per-channel sum + per-frame gain + hard-limit), and the cpal callback maps it to the device channel count (mono = average, stereo = L/R, >2 = L/R then silence) via a pure, unit-tested write_frame. AudioClock's position is now in output audio frames (formulas unchanged).

Still preload (the chunked / background-filled streaming half of #160 remains).

72 src-tauri unit tests + 5 media tests pass; clippy -D warnings clean with and without the feature. Final stereo sound is verified on a real machine (flag on).
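The channel mapping named in the commit (mono = average, stereo = L/R, more than two channels = L/R then silence) fits in a few lines. The signature below is a guess; only the name write_frame and the mapping rule come from the commit message.

```rust
/// Writes one stereo source frame into one device frame.
/// `out` holds exactly one frame, i.e. one sample per device channel.
/// The signature is assumed for illustration, not taken from audio.rs.
fn write_frame(out: &mut [f32], left: f32, right: f32) {
    match out.len() {
        0 => {}
        // Mono device: average the two channels.
        1 => out[0] = 0.5 * (left + right),
        // Stereo or wider: L/R in the first two slots, silence in the rest.
        _ => {
            out[0] = left;
            out[1] = right;
            out[2..].fill(0.0);
        }
    }
}
```

A callback would apply this to each `chunks_exact_mut(channels)` slice of the device buffer, reading `left` and `right` from the interleaved stereo mix.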
The transport had unit coverage (origin guard, JPEG encode, multipart framing) but nothing exercised the live axum server. Add a feature-gated integration test that starts PreviewServer through the Tauri async runtime and, with a blocking std TCP client + a raw HTTP/1.1 request, asserts: (1) GET /stream → 200 with `multipart/x-mixed-replace` + the boundary; (2) a cross-origin Origin header → 403. No HTTP-client dependency; stable across repeated runs.
…53)

Pull the clock-frame clamp + end-of-timeline decision out of the render thread into a pure loop_step(clock_frame, total) -> (target, done) and unit-test the boundary (last frame, past-end clamp, negative clamp, single-frame timeline). Behavior unchanged; the termination logic is now verified independently of the GPU loop.

73 src-tauri unit tests pass; clippy clean.
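A pure step function of the shape loop_step(clock_frame, total) -> (target, done) might look like the sketch below. The integer types and the rule that playback is done once the clock reaches the last frame are assumptions; the commit only names the signature and the boundary cases it tests.

```rust
/// Clamps the clock's frame into the timeline and reports whether the
/// render loop should stop. Types and the `done` rule are assumed here.
fn loop_step(clock_frame: i64, total: i64) -> (i64, bool) {
    // Index of the last valid frame; an empty timeline is treated as frame 0.
    let last = (total - 1).max(0);
    // A negative clock clamps to frame 0, a past-end clock to the last frame.
    let target = clock_frame.clamp(0, last);
    // Assumption: done as soon as the clock reaches or passes the last frame.
    (target, clock_frame >= last)
}
```

Keeping this free of GPU and thread state is what lets the boundary cases be checked with plain assertions.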
…ing engine (#53)

A turnkey handoff doc: the decode→resolver→compositor→MJPEG + cpal-master-clock architecture, the file map, the compile/runtime feature flags, the exact real-machine acceptance checklist (enable via localStorage, tauri build, verify multi-track/ProRes/effects/A-V-sync, then default-on), and the documented tradeoffs / follow-ups (#160 chunked audio, #161 CSP, #65 Lottie).
…c ordering (#53 #160 #162)

Second adversarial-review pass over tonight's new code (stereo audio + the #162 seek watcher). The reviewers found NO correctness bug in the stereo channel mapping / interleaving / real-time callback or the watcher's feedback-loop guard; these are the polish items:

- audio.rs: AudioClock::seek rounds (not truncates) so a seek round-trips back to the same frame at a non-divisible device rate (e.g. 44100 Hz @ 24 fps); truncation landed a half-sample short and frame() reported frame-1. New regression test at 44100/24. The cpal position store/fetch_add use Release/AcqRel (defensive on ARM / Apple Silicon).
- previewEngine.ts: document the invariant that the engine frame is recorded BEFORE setActiveFrame (the seek watcher depends on the order; reordering would cause a feedback loop).
- timelinePlayback.ts: document that a sub-epsilon external nudge during PLAY is intentionally not forwarded (playback supersedes it) + an epsilon-boundary test.

74 src-tauri unit tests + 179 vitest pass; clippy -D warnings clean.
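The 44100 Hz @ 24 fps case is easy to reproduce: one video frame spans 44100 / 24 = 1837.5 audio frames, so seeking to video frame 1 by truncation lands on position 1837, which still belongs to video frame 0. The free functions below are a hypothetical stand-in for AudioClock::frame and AudioClock::seek; the exact formulas in audio.rs are assumed, not quoted.

```rust
/// Video frame shown at audio position `pos` (floor of pos * fps / rate).
fn frame_at(pos: u64, rate: f64, fps: f64) -> u64 {
    (pos as f64 * fps / rate).floor() as u64
}

/// Seek target computed by truncation (the behavior before the fix).
fn seek_pos_trunc(frame: u64, rate: f64, fps: f64) -> u64 {
    (frame as f64 * rate / fps) as u64
}

/// Seek target computed by rounding (the behavior after the fix).
fn seek_pos_round(frame: u64, rate: f64, fps: f64) -> u64 {
    (frame as f64 * rate / fps).round() as u64
}
```

With rate 44100.0 and fps 24.0, `seek_pos_trunc(1, ..)` yields 1837 and `frame_at` maps it back to frame 0, while `seek_pos_round(1, ..)` yields 1838, which maps back to frame 1.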
…53)

The "route PLAY to the Rust engine?" condition (flag on && Tauri && playing && !scrubbing) was copy-pasted across the switch effect, the external-seek watcher, and the MJPEG overlay. Extract it into a pure, tested shouldUseRustEngine(...) so the regression-prone gate lives (and is covered) in one place, matching the project's extract-pure-helper hook-testing convention. 5 new vitest cases (flag off / non-Tauri / scrubbing / paused / engaged). Behavior unchanged; 184 vitest pass, web build green.
The integration tests covered RenderLoop directly but not the threaded PlaybackEngine + the clock/sink/emitter wiring. Add a GPU+ffmpeg-gated test that spawns the engine over a real source with an InstantClock and in-memory sink/emitter, lets the wall clock advance the playhead, and asserts frames reach the sink and the playhead reaches the emitter before stop joins the render thread. Passes locally (GPU+ffmpeg); auto-skips on a GPU-less CI runner.
A later cargo fmt --all wrapped a long format! line after the test was first committed; commit the formatting so the branch HEAD passes cargo fmt --check.
#53 streaming playback engine, PR2 (#63 cpal + #64 MJPEG), stacked on PR1 (#165, merged). Adds a cpal audio master clock + real-time multi-track mixdown (mirrors the merged export.rs mixdown) + loopback MJPEG transport (multipart/x-mixed-replace, Origin-guarded, bounded channels that drop rather than back-pressure), all behind the same off-by-default `playback-engine` feature. Default build unchanged (verified: default cargo check pulls no cpal/axum). 82 feature-gated tests pass locally on GPU+ffmpeg. Frontend stays flag+isTauri+play gated.

Independently reviewed: faithful AVPlayer master-clock port; audio mixdown identical to export (consistent internal invariant). Follow-ups tracked separately: destructive pause, audio speed-resample + overlap-skip (shared with export), #65 Lottie (PR3).
Generated with Claude Code