fix(voice): make calls actually deliver media — five root-cause fixes #673

Merged
intendednull merged 1 commit into main from fix/voice-video-media
Jun 16, 2026

@intendednull

Problem

Joining a voice/video call worked (roster visible), but no audio/video ever reached the remote peer — mic inaudible, camera/screen-share invisible remotely while the local preview worked.

Investigation

Multi-agent root-cause review + reference research, written up in docs/reports/2026-06-07-voice-media-connectivity-investigation.md (includes how Jami/SimpleX/Matrix/Tox/iroh-roq solve media transport, and the verification trail). Five independent, compounding bugs:

| # | Bug | Effect |
|---|---|---|
| RC0 | Voice wire messages addressed by channel name; SEC-V-03 gates validate UUID keys | All voice presence + signaling silently dropped |
| RC2 | 4 KB SIGNALING_CAP on VoiceSignal | Video SDP (5–15 KB) silently dropped → no answer ever |
| RC3 | Early remote ICE candidates rejected (remoteDescription null) + error swallowed | Missing candidate pairs → no media even when SDP exchanged |
| RC4 | RefCell<VoiceManager> borrow held across await | Double-borrow panic mid-negotiation under trickle ICE |
| PN | Collision check missing signalingState != stable | Screen-share renegotiation breaks under glare |

Fixes

  • RC0: wire carries canonical channel_id (UUID) — senders resolve via channel_id_for_voice; listener gates resolve id→name; voice state/UI stay name-keyed. Security tests updated to the UUID-on-wire contract.
  • RC2: dedicated 64 KB SDP_CAP for VoiceSignal (oversize still rejected).
  • RC3: per-peer PendingIceCandidates queue, flushed after setRemoteDescription; rejections logged.
  • RC4: VoiceManager rewritten to interior mutability with &self methods; handle is Rc<VoiceManager> (no outer RefCell); #[allow(clippy::await_holding_refcell_ref)] removed so the lint enforces the invariant.
  • PN: canonical should_ignore_offer(polite, making_offer, stable); self-offer filter; ICE/connection-state logging; TURN credential plumbing (window.__WILLOW_ICE_SERVERS with {urls, username, credential} — previously TURN could not be configured at all).

Not in scope (follow-up, needs infra + spec): cross-NAT traversal. Default stays privacy-first empty iceServers (#179); recommendation is self-hosted coturn beside the relay — see report §5b. The old "iroh relay path for ICE" TODO was researched and is infeasible for browsers.

Tests

  • Native: ~10 KB video SDP survives pack_wire/unpack_wire; oversize rejected; all listener security tests green on the new contract.
  • wasm-pack: ICE buffer queue/drain, collision rule, TURN credential propagation, ICE-config knobs.
  • New e2e/voice-video.spec.ts under a voice-chrome Playwright project (fake media devices): two real browsers negotiate over loopback — asserts remote audio both directions and screen-share via renegotiation. Both pass (56.9s). Failed before the fixes (caught RC0).
  • just check green (fmt, clippy -D warnings, all tests, wasm).

🤖 Generated with Claude Code

Voice/video calls connected (roster visible) but no audio/video ever
reached the remote peer. Root-cause investigation
(docs/reports/2026-06-07-voice-media-connectivity-investigation.md)
found five independent, compounding bugs; all but cross-NAT traversal
are fixed here and proven by a new 2-peer Playwright media test.

RC0 — voice wire addressed by channel name, gates validate UUIDs.
VoiceJoin/VoiceLeave/VoiceSignal carried the UI's channel *name*, but
the SEC-V-03 existence gates check ServerState.channels (keyed by
UUID), so every voice message was dropped — presence and signaling
never crossed at all. The wire now carries the canonical channel_id:
senders resolve name->id (channel_id_for_voice), the listener gates
resolve id->name and keep voice state / ClientEvents name-keyed to
match the UI. Security tests updated to the UUID-on-wire contract.
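
For reviewers, the RC0 contract can be sketched roughly as below. This is an illustration only, not the code in this PR: the `ServerState` shape, the `gate_voice_channel` helper, and the signature of `channel_id_for_voice` are assumptions, and channel names are assumed unique within a server.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the real server state:
/// channels keyed by canonical id (a UUID string), value = display name.
pub struct ServerState {
    pub channels: HashMap<String, String>,
}

impl ServerState {
    /// Sender side: resolve the UI's channel name to the id that goes on the wire.
    /// Signature is assumed; only the function name comes from the PR.
    pub fn channel_id_for_voice(&self, name: &str) -> Option<&str> {
        self.channels
            .iter()
            .find(|(_, n)| n.as_str() == name)
            .map(|(id, _)| id.as_str())
    }

    /// Listener side (hypothetical helper): existence gate on the wire id.
    /// Unknown ids yield None (message dropped); known ids resolve back to
    /// the name that voice state and the UI are keyed on.
    pub fn gate_voice_channel(&self, channel_id: &str) -> Option<&str> {
        self.channels.get(channel_id).map(String::as_str)
    }
}
```

With this shape, a message that still carries a channel name fails the gate (the pre-fix behavior), while one carrying the id passes and is mapped back to the name.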

RC2 — 4 KB SIGNALING_CAP silently dropped video SDP. Video offers
run 5-15 KB; unpack_wire discarded them post-decode so the offerer
never got an answer. VoiceSignal now has a dedicated 64 KB SDP_CAP
(still inside the 256 KB transport cap); oversize is still rejected.
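
A minimal sketch of the per-kind cap, assuming a simple size check after decode. The constant values are the ones stated above; the `WireKind` enum, `cap_for`, `check_size`, and the assumption that VoiceJoin/VoiceLeave stay under the 4 KB cap are illustrative, not the actual `unpack_wire` code.

```rust
pub const SIGNALING_CAP: usize = 4 * 1024; // small control messages
pub const SDP_CAP: usize = 64 * 1024; // VoiceSignal (SDP offers/answers)
pub const TRANSPORT_CAP: usize = 256 * 1024; // outer transport limit

/// Hypothetical message-kind tag used only for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WireKind {
    VoiceJoin,
    VoiceLeave,
    VoiceSignal,
}

/// Pick the size cap for a message kind.
pub fn cap_for(kind: WireKind) -> usize {
    match kind {
        WireKind::VoiceSignal => SDP_CAP,
        // Assumption: presence messages keep the small cap.
        WireKind::VoiceJoin | WireKind::VoiceLeave => SIGNALING_CAP,
    }
}

/// Reject oversize payloads with an explicit error instead of dropping silently.
pub fn check_size(kind: WireKind, payload: &[u8]) -> Result<(), String> {
    let cap = cap_for(kind);
    if payload.len() > cap {
        Err(format!("{:?} payload of {} bytes exceeds cap of {} bytes", kind, payload.len(), cap))
    } else {
        Ok(())
    }
}
```

Under these assumptions a ~10 KB video SDP passes as a VoiceSignal, and anything above 64 KB is still rejected.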

RC3 — early remote ICE candidates were lost. addIceCandidate rejects
while remoteDescription is null (browsers don't buffer remote
candidates) and the rejection was swallowed with `let _`. Candidates
arriving before the offer/answer (gossip has no ordering; ICE handling
was synchronous while offer/answer were spawned) are now queued in a
per-peer PendingIceCandidates buffer and flushed after
setRemoteDescription resolves; rejections are logged.
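
The buffering rule can be sketched as follows. Only the name `PendingIceCandidates` comes from this PR; the fields, method names, and the use of plain strings for peer ids and candidates are assumptions made to keep the example free of browser types.

```rust
use std::collections::{HashMap, HashSet};

/// Per-peer buffer for remote ICE candidates that arrive before the
/// remote description is set. Internals are hypothetical.
#[derive(Default)]
pub struct PendingIceCandidates {
    queued: HashMap<String, Vec<String>>,
    has_remote_description: HashSet<String>,
}

impl PendingIceCandidates {
    /// Returns Some(candidate) if it can be applied right away,
    /// or None if it was queued because no remote description exists yet.
    pub fn on_candidate(&mut self, peer: &str, candidate: String) -> Option<String> {
        if self.has_remote_description.contains(peer) {
            Some(candidate)
        } else {
            self.queued.entry(peer.to_owned()).or_default().push(candidate);
            None
        }
    }

    /// Call after setRemoteDescription resolves for `peer`.
    /// Returns the queued candidates in arrival order so the caller can
    /// apply them, logging any rejection rather than discarding it.
    pub fn on_remote_description_set(&mut self, peer: &str) -> Vec<String> {
        self.has_remote_description.insert(peer.to_owned());
        self.queued.remove(peer).unwrap_or_default()
    }
}
```

Keeping the queue as a pure data structure is one way to make the drain order testable without a real RTCPeerConnection.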

RC4 — RefCell<VoiceManager> borrow held across await. The "safe on
single-threaded WASM, no preemption" comment was wrong: await is a
yield point, and a concurrent VoiceSignal task could double-borrow and
panic mid-negotiation. VoiceManager now owns its state behind interior
mutability with &self methods; the handle is Rc<VoiceManager> (no
outer RefCell), and the clippy::await_holding_refcell_ref allows are
gone, so the lint enforces the invariant from now on.

Perfect negotiation — the collision check tested only making_offer,
not signalingState != stable (the canonical MDN condition), which
breaks renegotiation (screen share added mid-call). Extracted as the
pure should_ignore_offer(polite, making_offer, stable) and fixed.
Also: self-offer filter (don't offer to our own echoed VoiceJoin),
ICE/connection state-change logging (failures were invisible), and
TURN credential plumbing — build_rtc_config only ever set urls, so it
could not drive TURN even if configured; a new
window.__WILLOW_ICE_SERVERS knob carries {urls, username, credential}
(legacy __WILLOW_STUN_URLS still honored).
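
For reference, the collision rule from MDN's perfect-negotiation pattern can be written as a pure function. The signature matches the one named above; the body is a sketch based on the MDN condition, and `old_check` is a reconstruction of the previous behavior from the description, not the removed code.

```rust
/// An incoming offer collides if we are mid-offer ourselves or the
/// connection is not in the stable signaling state. Only the impolite
/// peer ignores a colliding offer; the polite peer rolls back and accepts.
pub fn should_ignore_offer(polite: bool, making_offer: bool, stable: bool) -> bool {
    let offer_collision = making_offer || !stable;
    !polite && offer_collision
}

/// Reconstruction of the earlier check, which looked at making_offer only.
pub fn old_check(polite: bool, making_offer: bool) -> bool {
    !polite && making_offer
}
```

The two differ for an impolite peer that has finished creating its offer but is not yet back in the stable state: `should_ignore_offer(false, false, false)` is true, while `old_check(false, false)` is false, which is the window the renegotiation bug fell into.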

NOT fixed here (follow-up, needs infra + spec): cross-NAT traversal.
iceServers stays empty by default (privacy-first, issue #179); the
plan is self-hosted coturn beside the relay — see the report's
recommendation. Same-host/LAN calls work without it; the "iroh relay
path for ICE" idea was researched and is infeasible for browsers
(relay-only WebSocket, not TURN/ICE-compatible).

Tests: ~10 KB SDP wire round-trip (native); ICE buffer, collision
rule, TURN credential propagation, ICE-server config knobs
(wasm-pack); new e2e/voice-video.spec.ts under a voice-chrome
Playwright project with fake media devices proving remote audio both
ways + screen-share renegotiation over a real RTCPeerConnection.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@intendednull intendednull merged commit 0dccc82 into main Jun 16, 2026
8 of 9 checks passed
@intendednull intendednull deleted the fix/voice-video-media branch June 16, 2026 02:58