fix(sender): scope Thunderbolt dials to the bridge interface and surface real connect errors #130
Open
jasontitus wants to merge 4 commits into
…ion parsing

Adds a TBDisplaySenderTests unit bundle (hosted by the app for @testable access) with 49 tests over the pure logic that has no hardware dependency:
- TBMonitorProtocol: BE32 primitives, packet layout, drainPacket handling of fragmented/contiguous/split feeds, JSON payload round-trips, and parity of the hand-rolled input-event encoder with JSONDecoder, guarding the invariant documented on makeInputEventPacket (PR swellweb#123).
- TBDiscoveredReceiver: per-transport ip(for:) selection (which IP the sender dials for Thunderbolt vs Network Link), id, shortHostName, displayText.
- TBSenderAutomation: parseTransport/parseMode/parsePreset aliases, receiver matching, and the resolveSessionIndex tri-state.

The automation parsing helpers move from private to internal so the test bundle can exercise them; behavior is unchanged. project.pbxproj and the shared TBDisplaySender scheme are regenerated via xcodegen so a fresh checkout can run `xcodebuild test` directly.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- Executes the new TBDisplaySenderTests suite after the sender build.
- Triggers CI on 3.1-maint and 3.2-dev in addition to main, so the branches that actually receive PRs get build/test coverage.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ket types

Two wire-protocol drain hardenings in TBMonitorProtocol.drainPacket, matching the receiver parser's behavior (net.c caps packets at 64 MiB and treats a bad length as fatal):
- A corrupt length prefix (zero, or above the new 64 MiB maxPacketLength) now throws instead of returning "need more data". Previously a corrupted length such as 0xFFFFFFFF made the sender buffer inbound bytes forever, waiting for a packet that could never complete: unbounded memory growth on a corrupt stream. The drain loop in TBDisplaySenderService now closes the connection and surfaces the reason in the session status.
- A packet with an unrecognized type byte (e.g. from a newer receiver) is now skipped and draining continues. Previously it consumed the packet but returned nil, which the caller treated as "buffer empty", stalling every valid packet queued behind the unknown one until the next network read.

Covered by 7 new unit tests (zero/oversized/all-ones lengths, cap boundary, corruption behind a valid packet, unknown-type skip and lone unknown-type).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
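
For reviewers who want the two drain rules in one place, here is a minimal sketch of the behavior described above. It is not the code in this PR. The framing (a 4-byte big-endian length followed by a body whose first byte is the packet type), the `Packet` and `DrainError` types, and the `knownTypes` set are all assumptions made for illustration; only the 64 MiB cap, the "throw on a bad length" rule, and the "skip unknown types and keep draining" rule come from the commit message.

```swift
import Foundation

// Hypothetical error and packet types, not the ones in TBMonitorProtocol.
enum DrainError: Error, Equatable {
    case corruptLength(UInt32)
}

struct Packet: Equatable {
    let type: UInt8
    let payload: [UInt8]
}

let maxPacketLength: UInt32 = 64 * 1024 * 1024  // 64 MiB cap from the commit message
let knownTypes: Set<UInt8> = [1, 2, 3]          // assumed type bytes, for illustration only

/// Returns the next packet with a known type, or nil when more bytes are needed.
/// Throws when the length prefix can never describe a valid packet.
/// Assumed framing: 4-byte big-endian length, then `length` bytes (type byte first).
func drainPacket(_ buffer: inout [UInt8]) throws -> Packet? {
    while true {
        guard buffer.count >= 4 else { return nil }
        // Decode the big-endian 32-bit length prefix.
        let length = buffer[0..<4].reduce(UInt32(0)) { ($0 << 8) | UInt32($1) }
        // Zero or above the cap is corruption, not "need more data".
        guard length >= 1, length <= maxPacketLength else {
            throw DrainError.corruptLength(length)
        }
        let total = 4 + Int(length)
        guard buffer.count >= total else { return nil }
        let type = buffer[4]
        let payload = Array(buffer[5..<total])
        buffer.removeFirst(total)
        if knownTypes.contains(type) {
            return Packet(type: type, payload: payload)
        }
        // Unknown type: the packet is consumed; loop on to the next one
        // instead of returning nil, which callers read as "buffer empty".
    }
}
```

The loop is the point of the second fix: returning nil is reserved for "not enough bytes yet", so an unknown packet can no longer stall the valid packets queued behind it.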
…ace real connect errors
Root cause, observed in the field: connect() pins only requiredLocalEndpoint
(the source ADDRESS), never the egress interface. macOS keeps a single
routing-table entry for 169.254.0.0/16 pointing at the primary interface
(usually Wi-Fi), so a dial to a Thunderbolt Bridge peer's self-assigned
link-local address leaves via the wrong link and black-holes — with both
Macs configured correctly and the TB link healthy. The connection then sits
in .waiting(...) carrying the true reason ("No route to host", "Network is
down"), which the state handler silently dropped, and the 5s watchdog
reported a bare fixed-string "Connection timed out".
Fixes, kept surgical:
- Link-local receiver addresses are now dialed scoped ("169.254.x.y%bridge0")
to the interface that owns the session's local IP, so routing happens on
the Thunderbolt link regardless of the routing table. Non-link-local dials
are unchanged. Falls back to the unscoped host if the scoped form does not
parse. This makes plain DHCP/link-local Thunderbolt Bridge setups (the
macOS default) work without manual static-IP workarounds.
- The .waiting(error) state is now handled: logged, and remembered so the
watchdog/failure paths can report it.
- Timeout and failure statuses now append where we dialed, from which local
IP/interface, the transport, and the last network state, via the pure
TBConnectionDiagnostics.failureDetail composer.
- New os.Logger subsystem com.targetbridge.sender (category "connection")
traces dial/ready/waiting/failed/timeout, so field issues can be triaged
with `log stream --predicate 'subsystem == "com.targetbridge.sender"'`.
Covered by 9 new unit tests for the scoping and detail-composition logic.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
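
The scoping rule in the first fix can be sketched as a pure string helper. This is an illustration, not the PR's implementation: `isIPv4LinkLocal` and `scopedDialHost` are hypothetical names, and the sketch only builds the `host%interface` string. Per the description above, the sender hands the scoped form to Network.framework and falls back to the unscoped host if it does not parse; that step is not shown here.

```swift
import Foundation

/// True for dotted-quad IPv4 addresses in 169.254.0.0/16 (link-local).
func isIPv4LinkLocal(_ host: String) -> Bool {
    let parts = host.split(separator: ".", omittingEmptySubsequences: false)
    guard parts.count == 4 else { return false }
    let octets = parts.compactMap { UInt8($0) }
    guard octets.count == 4 else { return false }
    return octets[0] == 169 && octets[1] == 254
}

/// Hypothetical helper: appends "%<interface>" to link-local hosts only.
/// Any other host, or a missing interface name, is returned unchanged.
func scopedDialHost(_ host: String, interface: String?) -> String {
    guard isIPv4LinkLocal(host),
          let name = interface, !name.isEmpty else {
        return host
    }
    return "\(host)%\(name)"
}

// Usage: a bridge peer gets scoped, a routable LAN peer does not.
print(scopedDialHost("169.254.89.80", interface: "bridge0"))
print(scopedDialHost("192.168.1.10", interface: "bridge0"))
```

Keeping the decision in a pure function is what makes it unit-testable without hardware, which matches how the commit describes its test coverage.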
Motivation: diagnosed on real hardware

Field setup: M2 MacBook Pro → Apple TB3→TB2 adapter → 2015 iMac 5K receiver. Thunderbolt link healthy, both bridges up with self-assigned link-local IPs (the macOS default: DHCP on Thunderbolt Bridge finds no server and self-assigns 169.254.x), receiver advertising over Bonjour. Every GUI connect: "Connection failed: Connection timed out" with no further information.

Root cause chain, verified with `route -n get` + packet-level probes:
1. `connect()` pins only `requiredLocalEndpoint`: the source address, not the egress interface.
2. macOS keeps a single routing-table entry for `169.254.0.0/16` pointing at the primary interface (usually Wi-Fi). The dial to the receiver's bridge-only address left via Wi-Fi and black-holed.
3. The connection then sat in `.waiting("No route to host")`, the one state the handler didn't handle, so the true reason was discarded and the fixed-string 5s watchdog reported a bare timeout.

So the default Thunderbolt Bridge configuration fails while everything is configured correctly, and the error hides the cause. (Manual workaround until now: static IPs on both bridges.)

What's in it
- Link-local receiver addresses are now dialed scoped, e.g. `"169.254.x.y%bridge0"` (`IPv4Address` carries the interface, so Network.framework routes on the Thunderbolt link regardless of the routing table). Non-link-local dials unchanged; falls back to the unscoped host if the scoped form doesn't parse.
- `.waiting(error)` is handled: logged and remembered, so timeouts/failures report it.
- Timeout and failure statuses now append `dialed <ip>:<port> from <localIP> (<interface>) [<transport>] — last network state: …` via a pure, unit-tested composer.
- New `os.Logger` subsystem `com.targetbridge.sender` (category `connection`) traces dial/ready/waiting/failed/timeout, so field triage becomes `log stream --predicate 'subsystem == "com.targetbridge.sender"'`.

How tested
- `IPv4Address("169.254.89.80%bridge0")` parses with the interface attached, and a bogus interface name safely yields nil (→ unscoped fallback).
- `.waiting` carries the actionable reason ("No route to host" / "Network is down") that this PR stops discarding.

🤖 Generated with Claude Code
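
As a rough illustration of the failure-detail format quoted under "What's in it", a composer along these lines would produce it. This is a sketch under assumptions: the `DialContext` struct, its field names, and the free function `failureDetail` are hypothetical and are not the signature of `TBConnectionDiagnostics.failureDetail`. Only the output shape is taken from the description.

```swift
import Foundation

// Hypothetical input type; the real composer's parameters are not shown in this PR text.
struct DialContext {
    let remoteIP: String
    let port: UInt16
    let localIP: String
    let interface: String
    let transport: String
    let lastNetworkState: String?  // e.g. the reason remembered from .waiting(error)
}

/// Builds "dialed <ip>:<port> from <localIP> (<interface>) [<transport>]",
/// and appends the last network state when one was recorded.
func failureDetail(_ c: DialContext) -> String {
    var detail = "dialed \(c.remoteIP):\(c.port) from \(c.localIP) (\(c.interface)) [\(c.transport)]"
    if let state = c.lastNetworkState, !state.isEmpty {
        detail += " — last network state: \(state)"
    }
    return detail
}

// Usage with made-up values in the shape of the field setup above.
let context = DialContext(remoteIP: "169.254.89.80", port: 5000,
                          localIP: "169.254.10.20", interface: "bridge0",
                          transport: "Thunderbolt",
                          lastNetworkState: "No route to host")
print(failureDetail(context))
```

The port, local IP, and transport label in the usage lines are invented for the example. Omitting the suffix when no state was recorded is a design choice of this sketch, not something the PR text specifies.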