[loop cycle 1] perf(dnd): stream dropped files to cut peak memory on large videos #9

Merged: karem505 merged 1 commit into master from whatrust-loop/cycle-1-stream-drop-base64 on Jun 28, 2026

@karem505

Owner

What

Fixes the memory blow-up when dropping large videos. build_drop_payload (the Rust side of drag-and-drop) used to hold ~4 simultaneous full copies of the dropped bytes:

std::fs::read (raw Vec<u8>) → base64_encode (full base64 String) → serde_json::json! (copied into a Value) → serde_json::to_string (copied again into the output)

So a 100 MB video drop peaked near half a gigabyte and could OOM or stall the window thread.
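The "near half a gigabyte" figure can be reproduced with a little arithmetic. The sketch below is only an estimate: it assumes the raw bytes and the three base64-sized copies are all alive at once and ignores allocator overhead and spare `String` capacity. The function names are hypothetical and not taken from the codebase.

```rust
/// Length of the padded base64 encoding of `n` raw bytes: 4 chars per 3-byte group.
fn base64_len(n: u64) -> u64 {
    4 * ((n + 2) / 3)
}

/// Rough peak for the old path (hypothetical helper): the raw Vec<u8>, the base64
/// String, the copy inside the serde_json Value, and the copy in the output String.
fn old_path_peak(file_len: u64) -> u64 {
    file_len + 3 * base64_len(file_len)
}

/// Rough peak for a streaming path handling one file (hypothetical helper):
/// the base64 written once into the output, plus a fixed 48 KiB read buffer.
fn streaming_peak(file_len: u64) -> u64 {
    base64_len(file_len) + 48 * 1024
}
```

For a 100,000,000-byte file this gives about 500 MB for the old path and about 133 MB for the streaming one.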

How

  • Read each file in 48 KiB chunks and write its base64 once, directly into the output JSON buffer.
  • Manual JSON assembly for the array/object structure; serde_json::to_string still escapes the name/type strings (base64's alphabet needs no JSON escaping).
  • A 0–2 byte carry keeps base64 emission aligned to whole 3-byte groups across reads (only the final group is padded).
  • A mark/truncate rollback keeps the output buffer (out) valid JSON if a file read fails partway through streaming its base64.
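The carry idea in the bullets above can be sketched as follows. This is a minimal illustration using only the standard library, not the code from this PR: `push_group` and `stream_base64` are hypothetical names, the encoder is hand-rolled so the example needs no crates, and the chunk size is a parameter so the boundary cases are easy to exercise.

```rust
use std::io::{self, Read};

const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Encode one group of 1 to 3 bytes; '=' padding appears only for a short group.
fn push_group(out: &mut String, group: &[u8]) {
    let b0 = group[0] as u32;
    let b1 = *group.get(1).unwrap_or(&0) as u32;
    let b2 = *group.get(2).unwrap_or(&0) as u32;
    let n = (b0 << 16) | (b1 << 8) | b2;
    out.push(ALPHABET[(n >> 18) as usize & 63] as char);
    out.push(ALPHABET[(n >> 12) as usize & 63] as char);
    out.push(if group.len() > 1 { ALPHABET[(n >> 6) as usize & 63] as char } else { '=' });
    out.push(if group.len() > 2 { ALPHABET[n as usize & 63] as char } else { '=' });
}

/// Stream `reader` into `out` as base64, reading at most `chunk_size` bytes per call.
/// A 0 to 2 byte carry holds the tail of a read that did not end on a 3-byte boundary.
fn stream_base64<R: Read>(mut reader: R, out: &mut String, chunk_size: usize) -> io::Result<()> {
    let mut buf = vec![0u8; chunk_size.max(1)];
    let mut carry = [0u8; 3];
    let mut carry_len = 0usize;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        let mut data = &buf[..n];
        // Top up a pending partial group with bytes from this read.
        while carry_len > 0 && carry_len < 3 && !data.is_empty() {
            carry[carry_len] = data[0];
            carry_len += 1;
            data = &data[1..];
        }
        if carry_len == 3 {
            push_group(out, &carry);
            carry_len = 0;
        }
        // If the carry is still partial here, `data` is empty and nothing below runs.
        let whole = data.len() - data.len() % 3;
        for group in data[..whole].chunks_exact(3) {
            push_group(out, group);
        }
        let rest = &data[whole..];
        carry[carry_len..carry_len + rest.len()].copy_from_slice(rest);
        carry_len += rest.len();
    }
    // Only the final group may be short, so only it is ever padded.
    if carry_len > 0 {
        push_group(out, &carry[..carry_len]);
    }
    Ok(())
}
```

Calling `stream_base64(&b"hello world"[..], &mut out, 4)` should produce the same text as a one-shot encoder regardless of the chunk size. Note that 48 KiB is itself a multiple of 3, but a carry is still needed because `read` may return fewer bytes than requested.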

Peak extra allocation drops from ~4× the file size to roughly the base64 size (~1.33× the raw bytes) of the single largest file plus a 48 KiB read buffer. Output shape is byte-for-byte the same [{name,type,b64}] the page injector already expects.

Verification

Gate 1 — deterministic (verify.sh = cargo build --locked + cargo test): PASS — 55 tests (4 new):

  • streaming_base64_matches_oneshot_across_chunk_boundary — parity vs the one-shot encoder for a file just past 48 KiB, for each length-mod-3 case (0/1/2)
  • empty_file_streams_to_empty_base64
  • build_drop_payload_roundtrips_name_type_b64 — parses the JSON, checks name/type/b64 for a png + a 48 KiB+5 mp4
  • build_drop_payload_empty_for_no_files: "[]" parity with the old path

Gate 2 — generation-blind code review (separate feature-dev:code-reviewer, told it didn't write the code, default-to-doubt): APPROVE, severity none, no must-fix. Verdict after an exhaustive trace:

  • carry/flush logic provably correct for all read-size sequences; carry[carry_len] can never go out of bounds; base64 never mid-stream-padded
  • mark/truncate always restores valid JSON; no trailing comma across skips/rollbacks
  • behavior parity confirmed (empty → "[]", identical object shape)
  • no resource leak / panic / injection (filename is serde-escaped)
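The mark/truncate property the reviewer checked can be illustrated with a small standalone sketch. Everything here is an assumption made for illustration: `Entry`, `write_body`, and `build_payload` are hypothetical names, the read failure is simulated by `fail_after`, and `push_json_string` is a hand-written stand-in for the `serde_json::to_string` escaping that the PR actually uses.

```rust
use std::io;

/// Minimal JSON string escaper (stand-in for serde_json::to_string, to avoid crates).
fn push_json_string(out: &mut String, s: &str) {
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
}

/// Hypothetical dropped file: `b64` is the text a streaming encoder would emit.
struct Entry<'a> {
    name: &'a str,
    mime: &'a str,
    b64: &'a str,
    /// Simulate a read error after this many base64 characters were written.
    fail_after: Option<usize>,
}

/// Writes the base64 body into `out`, possibly failing after a partial write.
fn write_body(e: &Entry, out: &mut String) -> io::Result<()> {
    match e.fail_after {
        Some(n) => {
            out.push_str(&e.b64[..n.min(e.b64.len())]);
            Err(io::Error::new(io::ErrorKind::Other, "simulated read failure"))
        }
        None => {
            out.push_str(e.b64);
            Ok(())
        }
    }
}

fn build_payload(entries: &[Entry]) -> String {
    let mut out = String::from("[");
    let mut wrote_any = false;
    for e in entries {
        // Mark the buffer length before touching it, separator included.
        let mark = out.len();
        if wrote_any {
            out.push(',');
        }
        out.push_str("{\"name\":");
        push_json_string(&mut out, e.name);
        out.push_str(",\"type\":");
        push_json_string(&mut out, e.mime);
        out.push_str(",\"b64\":\"");
        match write_body(e, &mut out) {
            Ok(()) => {
                out.push_str("\"}");
                wrote_any = true;
            }
            // Roll back the partial object and its comma, so `out` stays valid JSON.
            Err(_) => out.truncate(mark),
        }
    }
    out.push(']');
    out
}
```

Because the mark is taken before the separator is written, a skipped entry leaves neither a dangling comma nor a half-written object, and an empty input still yields `[]`.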

Its one non-blocking note (a redundant BufReader around the already-large read buffer) was applied before this PR — the file is now read directly.

Follow-up (not in this PR)

The base64 still crosses into the webview via eval (one more ~1.33× copy on the JS side). Eliminating that transport entirely, by serving the dropped bytes over a custom URI scheme and doing fetch → blob → File in bridge.js, is tracked as backlog item A1 for a later cycle.


🤖 PR-ONLY — do not auto-merge. Releasing whatRust is manual via a v* tag; this loop never merges, bumps the version, or tags a release. Opened by the whatrust-fix-loop (cycle 1/6).

… videos

build_drop_payload previously read the whole file into a Vec<u8>, base64-encoded it
into a String, copied that into a serde_json::Value, then serialized the Value to the
output String — ~4 simultaneous copies, so a 100 MB video drop peaked near half a
gigabyte and could OOM or stall the window thread.

Now each file is read in 48 KiB chunks and its base64 is written once, directly into
the output JSON buffer (manual JSON assembly; serde_json still escapes the name/type
strings). A 0-2 byte carry keeps base64 emission aligned to whole 3-byte groups across
reads (only the final group is padded), and a mark/truncate rollback keeps `out` valid
JSON if a read fails mid-stream. Peak extra allocation drops from ~4x the file size to
~1.33x the base64 of the single largest file plus a fixed 48 KiB read buffer.

Adds tests: streaming-vs-one-shot parity across the read-chunk boundary for each
length-mod-3 case, empty file, full payload round-trip (name/type/b64), and empty input.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016o9cWBaPy4zU4BAurUVoTp
karem505 added a commit that referenced this pull request Jun 28, 2026
Bundles six loop-shipped fixes:
- perf(dnd): stream dropped files as base64 to cut peak memory on large videos (#9)
- fix(notifications): forward service-worker showNotification to the native toast (#10)
- feat(calls): expose a minimal window.chrome so WhatsApp enables call buttons (#11)
- fix(notifications): de-duplicate burst-repeated native toasts (#12)
- fix(dnd): route AVIF/HEIF photos as photos; broaden MIME labels (#13)
- fix(calls/capability): return a complete Chrome high-entropy client-hints set (#14)
Plus integration: SW notifications share the dedup window; base64_encode is test-only now.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016o9cWBaPy4zU4BAurUVoTp
@karem505 karem505 merged commit 709e645 into master Jun 28, 2026
6 checks passed
@karem505 karem505 deleted the whatrust-loop/cycle-1-stream-drop-base64 branch June 28, 2026 01:38