
#53 streaming playback PR2: cpal audio + MJPEG transport (off-by-default) #170

Merged
appergb merged 13 commits into main from feat/53-cpal-mjpeg
Jun 28, 2026

Conversation

@appergb appergb commented Jun 28, 2026

Owner

#53 streaming playback engine — PR2 (#63 cpal + #64 MJPEG), stacked on PR1 (#165, merged). Adds a cpal audio master clock + real-time multi-track mixdown (mirrors the merged export.rs mixdown) + loopback MJPEG transport (multipart/x-mixed-replace, Origin-guarded, bounded channels that drop rather than back-pressure), all behind the same off-by-default playback-engine feature. Default build unchanged (verified: default cargo check pulls no cpal/axum). 82 feature-gated tests pass locally on GPU+ffmpeg. Frontend stays flag+isTauri+play gated.

Independently reviewed: faithful AVPlayer master-clock port; audio mixdown identical to export (consistent internal invariant). Follow-ups tracked separately: destructive pause, audio speed-resample + overlap-skip (shared with export), #65 Lottie (PR3).

Generated with Claude Code

baiqing added 13 commits June 29, 2026 00:49
PR2 (part 1) of the Rust streaming playback engine. Wires PR1's headless render
engine to the WebView over a loopback MJPEG stream + Tauri commands, behind the
off-by-default `playback-engine` feature.

- transport.rs: loopback axum MJPEG server (127.0.0.1:0, multipart/x-mixed-
  replace, broadcast(2) drop-when-full) fed by the render thread via MjpegSink
  (FrameSink), guarded by an Origin/loopback check; TauriPlayheadEmitter emits a
  `playback_frame` event per rendered frame so the front end can move its
  playhead while the pixels arrive over the stream.
- commands.rs: PlaybackState + playback_start/pause/stop/seek +
  get_preview_endpoint. start snapshots the session, spawns the engine, and
  keeps it (plus the audio handle) so the other commands can drive/stop it.
- audio.rs: build_clock decides the master clock — wall-clock InstantClock now;
  the cpal audio master clock (#63) is the documented follow-up. AudioPlayback
  owns the audio-device lifetime so wiring it later needs no refactor.
- lib.rs: feature-gated setup (start PreviewServer on the Tauri async runtime,
  manage PlaybackState) + the 5 commands registered with per-entry #[cfg].
- Cargo.toml: axum/tokio/bytes/futures (workspace-resolved versions) + the
  image `jpeg` feature, all under `playback-engine`; no new third-party crates.
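
The Origin/loopback check described for transport.rs could look roughly like the sketch below. The function name `origin_allowed` and the exact allow-list (loopback hosts plus the Tauri webview origins, and allowing requests with no Origin header) are assumptions for illustration; the real policy in transport.rs is not shown in this PR text.

```rust
/// Hypothetical guard: decides whether a request may read the preview stream.
/// Assumed policy: a missing Origin header (non-browser loopback client) is
/// allowed; otherwise the origin's host must be loopback or the Tauri webview.
fn origin_allowed(origin: Option<&str>) -> bool {
    let Some(origin) = origin else { return true };
    // Split "scheme://host[:port]" into its parts; anything else is rejected.
    let Some((scheme, rest)) = origin.split_once("://") else { return false };
    let authority = rest.split('/').next().unwrap_or("");
    let host = if authority.starts_with('[') {
        // IPv6 literal such as "[::1]:1420": keep what is inside the brackets.
        authority.split(']').next().map(|h| &h[1..]).unwrap_or("")
    } else {
        // Drop an optional ":port" suffix.
        authority.split(':').next().unwrap_or("")
    };
    match scheme {
        "tauri" => host == "localhost",
        "http" | "https" => {
            matches!(host, "127.0.0.1" | "localhost" | "::1" | "tauri.localhost")
        }
        _ => false,
    }
}
```

Comparing the parsed host exactly, rather than prefix-matching the whole string, is what keeps an origin like `http://localhost.evil.example` from passing.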

Video now plays through the Rust streaming compositor (multi-track / ProRes /
effects visible during PLAY), wall-clock timed. Remaining for PR2: the cpal
audio master clock for A/V sync (#63). Front-end PLAY-switch + Lottie (#65) land
in PR3. CSP stays null (already permits the loopback <img>); CSP hardening is a
follow-up needing real-machine verification.

23 unit tests pass (drain/clock/projection + origin-guard/jpeg/render-size);
`clippy -D warnings` clean with and without the feature.

Completes the PR2 backend: a timeline with sound now drives playback from the
audio device, so video follows audio (the #53 A/V-sync acceptance).

- audio.rs: pre-mix the whole timeline to one mono buffer at the cpal device
  sample rate (reusing the proven export mixdown — extract_pcm + mix_clips,
  parameterised by rate), play it on a dedicated cpal output thread whose
  callback copies buffer[pos..] to every output channel and advances an
  AtomicU64 master clock (lock-free; no decode in the real-time callback).
  AudioClock exposes pos as the PlaybackClock; a silent timeline (or no/failed
  audio device) falls back to the wall-clock InstantClock so video still plays.
- build_clock now returns the real AudioClock + a live AudioPlayback (its Drop
  stops the cpal stream + joins the thread); the command layer already owns it.
- Cargo.toml: cpal 0.15 under the `playback-engine` feature (CoreAudio on
  macOS, ALSA on Linux).
- ci.yml: install libasound2-dev for the playback-engine job so cpal's ALSA
  backend builds on the Linux runner.
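
As a rough illustration of the lock-free callback described above, the sketch below separates the buffer copy from cpal itself so it can be tested without an audio device. `fill_output` and `clock_frame` are hypothetical names, and writing silence past the end of the mix is an assumption; in the real code this body would run inside the cpal output callback.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Copies the next mono samples from the pre-mixed buffer into an interleaved
/// device buffer, writing each sample to every output channel, then advances
/// the shared position. No allocation, locking, or decoding happens here.
/// Assumes `channels >= 1`; samples past the end of `mix` become silence.
fn fill_output(mix: &[f32], pos: &AtomicU64, out: &mut [f32], channels: usize) {
    let start = pos.load(Ordering::Acquire) as usize;
    let frames = out.len() / channels;
    for (i, frame) in out.chunks_exact_mut(channels).enumerate() {
        let sample = mix.get(start + i).copied().unwrap_or(0.0);
        frame.fill(sample);
    }
    pos.fetch_add(frames as u64, Ordering::AcqRel);
}

/// Converts the audio position (in samples) into a video frame index, which is
/// what lets the audio device act as the master clock for the render thread.
fn clock_frame(pos: &AtomicU64, sample_rate: u32, fps: u32) -> u64 {
    let samples = pos.load(Ordering::Acquire) as f64;
    (samples * fps as f64 / sample_rate as f64).floor() as u64
}
```

Because the render thread only reads the atomic position, video follows whatever the device has actually consumed rather than a separate wall-clock estimate.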

Mono preview audio + an up-front mix decode are intentional for this cut
(stereo / chunked-streaming / background decode are follow-ups). 28 unit tests
pass (adds audio clock + mix-window math); clippy -D warnings clean with and
without the feature. Real-machine audio + A/V-sync confirmation (tauri build +
listen) is the remaining acceptance step.

PR3 of the streaming playback engine: PLAY routes to the Rust engine
(continuous decode → wgpu composite → MJPEG + cpal master clock) when enabled,
while scrub/pause stay on the legacy <video> path — the pause-freeze (74c4c82)
and resume-without-force-seek (5fa3f6f) fixes are left untouched.

- previewEngine.ts: a guarded branch at the top of the play/scrub effect —
  rustEngineEnabled() && isTauri && isPlaying && !isScrubbing → pauseAll() the
  <video> followers, playbackStart(activeFrame), and drive the playhead from
  `playback_frame` events (stop at the last frame). Its cleanup stops the engine
  AND seeks the <video> followers to the final frame, so the OTHER effect's
  pause-snap freezes on the correct frame, not a stale one. Everything else is
  the legacy path, unchanged.
- Preview.tsx: TimelineRustOverlay paints the loopback MJPEG <img> over
  <TimelinePlayback> during Rust PLAY (fills the aspect-fit canvas); it unmounts
  on pause/scrub so the legacy surface returns.
- api.ts: playbackStart/pause/stop/seek, getPreviewEndpoint, onPlaybackFrame,
  mirroring the existing invoke/listen patterns; no-ops outside Tauri.
- rustEngine.ts: rustEngineEnabled() runtime flag (localStorage
  'opentake.rustEngine'='1'), OFF by default — flip in devtools to A/B the two
  paths on a real machine; default-on is a follow-up after confirmation.

tsc + vite build pass; 174 vitest tests pass (incl. the previewEngine
pause/resume regression tests). The Rust PLAY path runs only with the flag on
under Tauri, so its real-machine acceptance (smooth multi-track / ProRes
playback + A/V sync) is the remaining verification step. Lottie (#65) is filed
separately — opentake-motion has no Lottie rasterizer, so baking is a standalone
feature, not a small addition.
…ustness (#53)

Adversarial-review fixes across PR2/PR3 of the streaming playback engine.

Rust:
- commands.rs: playback_start is now async and decodes/mixes audio off the IPC
  thread via spawn_blocking, so starting playback on a long project no longer
  freezes the UI. install() stops the previous session AFTER releasing the lock,
  so a slow render-thread join can't block the other playback commands.
- audio.rs: cover every cpal sample format (I8/I16/I32/I64/U8/U16/U32/U64/F32/
  F64) — a non-F32 default device (common on Linux/Windows) now gets audio
  instead of silently falling back to the wall clock. AudioClock::seek uses the
  same float path as frame() so a seek round-trips exactly at non-divisible
  rate/fps.
- transport.rs: the MJPEG relay uses a BOUNDED channel and packs header+body
  into one atomic multipart part — a slow client drops frames instead of growing
  memory unbounded, and a dropped half can't corrupt the stream.
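
A minimal sketch of the two transport ideas above, under stated assumptions: `pack_part`, `offer`, and the `frame` boundary string are hypothetical, and a std `sync_channel` stands in for whatever bounded async channel transport.rs actually uses.

```rust
use std::sync::mpsc::{SyncSender, TrySendError};

/// Hypothetical boundary string; the real one is not shown in this PR text.
const BOUNDARY: &str = "frame";

/// Packs the part header and the JPEG body into ONE buffer, so a frame is
/// either queued whole or dropped whole and never split across sends.
fn pack_part(jpeg: &[u8]) -> Vec<u8> {
    let header = format!(
        "--{}\r\nContent-Type: image/jpeg\r\nContent-Length: {}\r\n\r\n",
        BOUNDARY,
        jpeg.len()
    );
    let mut part = Vec::with_capacity(header.len() + jpeg.len() + 2);
    part.extend_from_slice(header.as_bytes());
    part.extend_from_slice(jpeg);
    part.extend_from_slice(b"\r\n");
    part
}

/// Offers a frame to a bounded channel without blocking the producer.
/// Returns false when the frame was dropped (queue full or client gone).
fn offer(tx: &SyncSender<Vec<u8>>, jpeg: &[u8]) -> bool {
    match tx.try_send(pack_part(jpeg)) {
        Ok(()) => true,
        Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => false,
    }
}
```

With a capacity of 2, a stalled client costs at most two queued frames of memory, and the render thread never waits on the network.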

Front end:
- previewEngine.ts: the pause-snap no longer depends on React effect ordering —
  in Rust-engine mode it trusts the authoritative activeFrame instead of reading
  a stale <video> currentTime, removing the 74c4c82 regression risk. The
  playback_frame handler bails if disposed and re-reads the timeline end so a
  mid-play edit can't stop early; playbackStart/Stop errors are logged, not
  swallowed (the project's "no silent IPC failure" rule).
- Preview.tsx: guard the MJPEG endpoint against a non-string value.

69 Rust unit tests + 174 vitest tests pass; clippy -D warnings clean with and
without the feature; web build green.
…#162)

During Rust streaming PLAY, a keyboard step or transport-bar click jumped
activeFrame away from the engine's per-frame updates, but the switch effect
(deps [isPlaying, isScrubbing]) didn't react, so the seek was ignored. Add a
dedicated watcher effect that distinguishes an external jump from the engine's
own per-frame advance via the pure isExternalSeekWhilePlaying helper, and
forwards it with playback_seek. Scrub still relinquishes to the legacy <video>
path as before.

178 vitest tests pass (adds 4 for the detection helper); web build green.

Replace the mono preview mixdown with interleaved stereo, so video preview plays
the project's left/right channels instead of a mono fold.

- opentake-media/decode/audio_stream.rs (new): decode_pcm_interleaved decodes a
  clip's audio window to interleaved f32 at a target rate WITHOUT folding to mono
  (deliberately not reusing pcm.rs::raw_to_mono_f32). 5 unit tests on the ffmpeg
  args + the byte→f32 reinterpret.
- src-tauri/playback/audio.rs: pre-mix the timeline to one interleaved stereo
  buffer (per-channel sum + per-frame gain + hard-limit), and the cpal callback
  maps it to the device channel count (mono = average, stereo = L/R, >2 = L/R
  then silence) via a pure, unit-tested write_frame. AudioClock's position is now
  in output audio frames (formulas unchanged).
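
The channel mapping described above (mono = average, stereo = L/R, more than two channels = L/R then silence) can be sketched as a pure function. The signature here is an assumption; only the name `write_frame` and the mapping rules come from the commit message, and `hard_limit` is a hypothetical helper illustrating the hard-limit step.

```rust
/// Writes one stereo frame into a device frame of arbitrary channel count.
/// `out` holds exactly one device frame (its length is the channel count).
fn write_frame(left: f32, right: f32, out: &mut [f32]) {
    match out.len() {
        0 => {}
        // Mono device: average the two channels.
        1 => out[0] = 0.5 * (left + right),
        // Stereo or wider: L/R first, any extra channels stay silent.
        _ => {
            out[0] = left;
            out[1] = right;
            out[2..].fill(0.0);
        }
    }
}

/// Hard-limits a summed sample to the valid [-1.0, 1.0] range.
fn hard_limit(sample: f32) -> f32 {
    sample.clamp(-1.0, 1.0)
}
```

Keeping the mapping free of cpal types is what makes it unit-testable without an audio device.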

Still preload (the chunked / background-filled streaming half of #160 remains).
72 src-tauri unit tests + 5 media tests pass; clippy -D warnings clean with and
without the feature. Final stereo sound is verified on a real machine (flag on).

The transport had unit coverage (origin guard, JPEG encode, multipart framing)
but nothing exercised the live axum server. Add a feature-gated integration test
that starts PreviewServer through the Tauri async runtime and, with a blocking
std TCP client + a raw HTTP/1.1 request, asserts: (1) GET /stream → 200 with
`multipart/x-mixed-replace` + the boundary; (2) a cross-origin Origin header →
403. No HTTP-client dependency; stable across repeated runs.
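
The request and assertion side of such a test needs very little code once the socket I/O is set aside. The helpers below are a hypothetical sketch (`build_request` and `parse_head` are not names from the PR): they build the raw HTTP/1.1 request and pull the status code and Content-Type out of the response head, which is enough to check both the 200 and the 403 case.

```rust
/// Builds a raw HTTP/1.1 GET for /stream, optionally with an Origin header.
fn build_request(host: &str, origin: Option<&str>) -> String {
    let mut req = format!("GET /stream HTTP/1.1\r\nHost: {}\r\nConnection: close\r\n", host);
    if let Some(origin) = origin {
        req.push_str(&format!("Origin: {}\r\n", origin));
    }
    req.push_str("\r\n"); // blank line terminates the header block
    req
}

/// Extracts the status code and the Content-Type value from a response head.
/// Header names are matched case-insensitively, as HTTP requires.
fn parse_head(head: &str) -> (Option<u16>, Option<String>) {
    let mut lines = head.split("\r\n");
    let status = lines
        .next()
        .and_then(|line| line.split_whitespace().nth(1))
        .and_then(|code| code.parse().ok());
    let content_type = lines
        .take_while(|line| !line.is_empty())
        .filter_map(|line| line.split_once(':'))
        .find(|(name, _)| name.eq_ignore_ascii_case("content-type"))
        .map(|(_, value)| value.trim().to_string());
    (status, content_type)
}
```

In a live test the request string would be written to a `std::net::TcpStream` connected to the server's loopback port, and the bytes read back up to the first blank line would be passed to `parse_head`.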
…53)

Pull the clock-frame clamp + end-of-timeline decision out of the render thread
into a pure loop_step(clock_frame, total) -> (target, done) and unit-test the
boundary (last frame, past-end clamp, negative clamp, single-frame timeline).
Behavior unchanged; the termination logic is now verified independently of the
GPU loop. 73 src-tauri unit tests pass; clippy clean.
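
A sketch of what a pure `loop_step` might look like. The name and the `(target, done)` shape come from the commit message; the integer types, the empty-timeline handling, and treating "clock at or past the last frame" as done are assumptions.

```rust
/// Clamps the clock's frame into the timeline and reports whether playback
/// has reached the end. `clock_frame` is signed so a clock that reports a
/// negative position can be clamped to frame 0.
fn loop_step(clock_frame: i64, total: u64) -> (u64, bool) {
    if total == 0 {
        // Assumed: an empty timeline has nothing to render and ends at once.
        return (0, true);
    }
    let last = total - 1;
    let target = (clock_frame.max(0) as u64).min(last);
    let done = clock_frame >= 0 && clock_frame as u64 >= last;
    (target, done)
}
```

For a 10-frame timeline, a clock frame of -3 clamps to `(0, false)`, 9 yields `(9, true)`, and 25 also yields `(9, true)`.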
…ing engine (#53)

A turnkey handoff doc: the decode→resolver→compositor→MJPEG + cpal-master-clock
architecture, the file map, the compile/runtime feature flags, the exact
real-machine acceptance checklist (enable via localStorage, tauri build, verify
multi-track/ProRes/effects/A-V-sync, then default-on), and the documented
tradeoffs / follow-ups (#160 chunked audio, #161 CSP, #65 Lottie).
…c ordering (#53 #160 #162)

Second adversarial-review pass over tonight's new code (stereo audio + the #162
seek watcher). The reviewers found NO correctness bug in the stereo channel
mapping / interleaving / real-time callback or the watcher's feedback-loop guard;
these are the polish items:

- audio.rs: AudioClock::seek rounds (not truncates) so a seek round-trips back to
  the same frame at a non-divisible device rate (e.g. 44100 Hz @ 24 fps) —
  truncation landed a half-sample short and frame() reported frame-1. New
  regression test at 44100/24. The cpal position store/fetch_add use
  Release/AcqRel (defensive on ARM / Apple Silicon).
- previewEngine.ts: document the invariant that the engine frame is recorded
  BEFORE setActiveFrame (the seek watcher depends on the order — reordering would
  cause a feedback loop).
- timelinePlayback.ts: document that a sub-epsilon external nudge during PLAY is
  intentionally not forwarded (playback supersedes it) + an epsilon-boundary test.
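
The seek rounding fix is easy to reproduce with plain arithmetic. At 44100 Hz and 24 fps a frame spans 1837.5 samples, so frame 1 starts at sample 1837.5: truncating gives 1837, which maps back to frame 0, while rounding gives 1838, which maps back to frame 1. The function names below are hypothetical; only the formulas follow the description.

```rust
/// Video frame for an audio position, using the same float path as the seek.
fn frame_at(pos: u64, rate: u32, fps: u32) -> u64 {
    (pos as f64 * fps as f64 / rate as f64).floor() as u64
}

/// Old behaviour: truncation can land half a sample short of the frame start.
fn seek_truncating(frame: u64, rate: u32, fps: u32) -> u64 {
    (frame as f64 * rate as f64 / fps as f64) as u64
}

/// Fixed behaviour: rounding lands on or just after the frame start, so
/// `frame_at` reports the frame that was asked for.
fn seek_rounding(frame: u64, rate: u32, fps: u32) -> u64 {
    (frame as f64 * rate as f64 / fps as f64).round() as u64
}
```

At rates where the division is exact (for example 48000 Hz at 24 fps) both variants agree, which is why the bug only showed up at non-divisible rate/fps pairs.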

74 src-tauri unit tests + 179 vitest pass; clippy -D warnings clean.
…53)

The "route PLAY to the Rust engine?" condition (flag on && Tauri && playing &&
!scrubbing) was copy-pasted across the switch effect, the external-seek watcher,
and the MJPEG overlay. Extract it into a pure, tested shouldUseRustEngine(...) so
the regression-prone gate lives (and is covered) in one place — matching the
project's extract-pure-helper hook-testing convention. 5 new vitest cases (flag
off / non-Tauri / scrubbing / paused / engaged). Behavior unchanged; 184 vitest
pass, web build green.

The integration tests covered RenderLoop directly but not the threaded
PlaybackEngine + the clock/sink/emitter wiring. Add a GPU+ffmpeg-gated test that
spawns the engine over a real source with an InstantClock and in-memory
sink/emitter, lets the wall clock advance the playhead, and asserts frames reach
the sink and the playhead reaches the emitter before stop joins the render
thread. Passes locally (GPU+ffmpeg); auto-skips on a GPU-less CI runner.

A later cargo fmt --all wrapped a long format! line after the test was first
committed; commit the formatting so the branch HEAD passes cargo fmt --check.
@appergb appergb merged commit abda61a into main Jun 28, 2026
2 checks passed