Redesign NUClearNet and rename to nuclearnet#190
Open
TrentHouliston wants to merge 18 commits into
Open
Conversation
… and Fragmentation Add an optional 'now' parameter (defaulting to steady_clock::now()) to all time-dependent methods in Discovery, Reliability, and Fragmentation. This allows tests to advance time deterministically without sleeping, making them faster, more reliable, and immune to CI timing variability. Tests converted from time-based (sleep_for) to event-based: - Discovery: check_timeouts and touch_peer tests - Reliability: all retransmission/timeout tests (7 sleeps removed) - Fragmentation: cleanup_expired test The only remaining sleep_for is in Integration.cpp's polling loop for genuine async UDP networking, which is inherently time-dependent.
The has_ipv4_multicast() and has_ipv6_multicast() functions previously only checked if network interfaces reported the IFF_MULTICAST flag. On macOS GitHub Actions runners (virtualized ARM64 VMs), interfaces report multicast capability but the hypervisor doesn't actually deliver multicast packets. Now performs an actual multicast send/receive round-trip test with a 200ms timeout. This correctly detects broken multicast environments and causes those tests to be skipped rather than hanging.
There was a problem hiding this comment.
Pull request overview
This PR replaces the legacy monolithic NUClearNetwork implementation under src/extension/network/ with a new modular, standalone src/nuclearnet/ library, and updates the reactor-facing NetworkController to use the new API. It also adds a comprehensive Catch2 unit/integration test suite for the new components and improves multicast capability detection for test gating.
Changes:
- Introduces the new
src/nuclearnet/standalone library (Discovery, Fragmentation, Reliability, RTT estimation, routing, deduplication, wire protocol, RAII FD wrapper). - Migrates
src/extension/NetworkControllerfromNUClearNetworkto the newNUClearNetAPI, including subscription propagation. - Adds/updates tests and test utilities (including multicast round-trip detection) to cover the new networking components.
Reviewed changes
Copilot reviewed 33 out of 33 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
src/nuclearnet/CMakeLists.txt |
Adds standalone CMake build for the new nuclearnet library. |
src/nuclearnet/wire_protocol.hpp |
Defines new on-wire structs and header validation helper. |
src/nuclearnet/RTTEstimator.hpp |
Declares RTT estimator API. |
src/nuclearnet/RTTEstimator.cpp |
Implements Jacobson/Karels-style RTT estimation. |
src/nuclearnet/Routing.hpp |
Declares peer/local subscription tracking and filtering API. |
src/nuclearnet/Routing.cpp |
Implements subscription-based routing decisions. |
src/nuclearnet/Reliability.hpp |
Declares ACK/NACK tracking and retransmission API. |
src/nuclearnet/Reliability.cpp |
Implements retransmission tracking and ACK/NACK processing. |
src/nuclearnet/PacketDeduplicator.hpp |
Declares sliding-window packet deduplication. |
src/nuclearnet/PacketDeduplicator.cpp |
Implements wraparound-safe sliding-window deduplication. |
src/nuclearnet/Fragmentation.hpp |
Declares fragmentation and reassembly API. |
src/nuclearnet/Fragmentation.cpp |
Implements MTU fragmentation, reassembly, and expiry cleanup. |
src/nuclearnet/FileDescriptor.hpp |
Adds RAII wrapper for sockets/file descriptors. |
src/nuclearnet/Discovery.hpp |
Declares announce/leave processing and peer tracking API. |
src/nuclearnet/Discovery.cpp |
Implements peer discovery, timeout handling, and callbacks. |
src/nuclearnet/NUClearNet.hpp |
Declares the public standalone networking façade and callbacks. |
src/nuclearnet/NUClearNet.cpp |
Implements socket setup, polling loop, packet IO, and module integration. |
src/util/network/sock_t.hpp |
Extends sock_t with comparison and stream operators for use in maps/logging. |
src/extension/NetworkController.hpp |
Switches controller to use network::NUClearNet. |
src/extension/NetworkController.cpp |
Adapts controller wiring to the new API and subscriptions model. |
src/CMakeLists.txt |
Uses CONFIGURE_DEPENDS for recursive globbing. |
tests/test_util/has_multicast.cpp |
Adds multicast round-trip probing for more reliable test gating. |
tests/tests/nuclearnet/Discovery.cpp |
Adds unit tests for discovery behaviors. |
tests/tests/nuclearnet/Fragmentation.cpp |
Adds unit tests for fragmentation/reassembly behaviors. |
tests/tests/nuclearnet/Integration.cpp |
Adds integration tests for two peers discovering/exchanging data. |
tests/tests/nuclearnet/PacketDeduplicator.cpp |
Adds unit tests for deduplication window/wraparound. |
tests/tests/nuclearnet/Reliability.cpp |
Adds unit tests for ACK/NACK and retransmission behavior. |
tests/tests/nuclearnet/Routing.cpp |
Adds unit tests for routing/subscription filtering. |
tests/tests/nuclearnet/RTTEstimator.cpp |
Adds unit tests for RTT estimator behavior. |
tests/tests/nuclearnet/wire_protocol.cpp |
Adds tests for packed sizes/layout and header validation. |
src/extension/network/NUClearNetwork.hpp |
Removes legacy network header. |
src/extension/network/NUClearNetwork.cpp |
Removes legacy network implementation. |
src/extension/network/wire_protocol.hpp |
Removes legacy wire protocol header. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- FileDescriptor.hpp: Add const to local variable (misc-const-correctness) - NUClearNet.hpp: Value-initialize iovec, add NOLINT for necessary const_cast, remove redundant member initializer - NetworkController.cpp: Add missing direct includes for string, set, Discovery.hpp, and NUClearNet.hpp (misc-include-cleaner) - TestRunner.cmake: Add --skip-returncode 0 so Catch2 returns success when all tests are skipped (fixes macOS CI where multicast is unavailable)
v5 is deprecated and has a known security vulnerability. The scanner was failing with HTTP 403 when querying JRE metadata, likely due to the old action version being unsupported by SonarCloud's API.
Build fixes: - Remove --skip-returncode (not supported in Catch2 v3.6.0) - sock_t.hpp: Add #include <string> for std::to_string/stoi (Windows) - Discovery.cpp: Fix include-cleaner errors and const-correctness Review feedback fixes: - NUClearNet.cpp: Fix unreachable new-peer branch by checking before process_announce adds the peer - Reliability.cpp: Validate ACK packet_count matches tracked packet - Discovery.cpp: Copy PeerInfo before invoking callbacks to avoid holding peers_mutex during user callbacks (deadlock risk) - wire_protocol.hpp: Fix ACK comment (10+ bytes, not 11+) and remove incorrect 'network byte order' claim from announce doc - has_multicast.cpp: Verify received payload matches sent message to avoid false positives from unrelated multicast traffic
Update all networking documentation to accurately describe the v2 implementation: - Protocol version 0x03 (was incorrectly documented as 0x02) - Modular architecture: Discovery, Fragmentation, Reliability, Routing, PacketDeduplicator, RTTEstimator - Jacobson/Karels RTT estimation (RFC 6298), not Kalman filter - Subscription-based routing (announce packets include type hashes) - Sliding-window packet deduplication (256 IDs per peer) - NAT-friendly port learning from UDP source address - Assembly size limits to prevent memory bombs - Configurable peer timeout and max retransmission attempts
Remove max_retransmits limit from Reliability module. Reliable messages now retransmit indefinitely based on RTT-estimated timeouts until either: - All fragments are ACKed (success) - The peer is removed due to timeout or graceful leave (connection lost) This provides true reliable delivery semantics — if the connection is alive, delivery is guaranteed. Also tie assembly timeout to peer_timeout (default 2s) instead of a fixed 10 seconds. The assembly timeout is now the natural bound: if no fragments arrive within the peer timeout period, the peer is either dead or the sender has moved on. For reliable messages, retransmissions keep the assembly alive as long as the sender is connected.
…e-back Since protocol v0.03 is already a breaking change, simplify the wire protocol by removing the DATA_RETRANSMISSION type. The receiver now checks the packet deduplicator for ALL incoming DATA packets. If the packet group was already fully processed, an ACK is sent and the fragment is discarded. This provides the same behavior with one fewer packet type. Packet type numbering is now: ANNOUNCE=1, LEAVE=2, DATA=3, ACK=4, NACK=5 Also document the announce-back behavior: when a node hears an announce from an unknown peer, it immediately sends its own announce via unicast to that peer. This gives instant bidirectional connection without waiting for the next announce cycle.
| /// This node's name on the network | ||
| std::string name; | ||
| /// The multicast/broadcast/unicast address to announce on | ||
| std::string announce_address = "239.226.152.162"; |
NACK was never sent by any code path — build_nack_packet existed but was never called in production. The bitset ACK already implicitly communicates which fragments are missing (bit=0), making an explicit NACK redundant. Packet types are now: ANNOUNCE=1, LEAVE=2, DATA=3, ACK=4
Connection establishment now requires two independent conditions: - announce_heard: peer's announce received on multicast channel - handshake == CONFIRMED: 3-way handshake over data ports This confirms all four communication paths (announce and data in both directions) before declaring a peer connected. Changes: - wire_protocol.hpp: Add CONNECT packet type (type=5) with SYN/ACK flags - Discovery.hpp/cpp: Replace single ConnectionState enum with two-flag model (announce_heard bool + HandshakeState enum) - NUClearNet.cpp: Force re-announce on new peer (multicast, not unicast), send CONNECT(SYN) to data port, gate DATA/ACK on is_connected() - Routing: Add is_locally_subscribed() for receiver-side filtering - NUClearNet send(): Unreliable broadcast sends go to multicast group instead of unicasting to each peer individually - Tests: Update Discovery tests for new model, add test for late announce scenario (data handshake completes before announce heard) - Docs: Rewrite connection establishment docs with sequence diagrams showing two-flag model, late announce, and multicast broadcast delivery
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Replaces the monolithic
NUClearNetworkimplementation insrc/extension/network/with a redesigned, modularsrc/nuclearnet/library. The new implementation is built as a standalone library that can be used independently of the reactor framework.Improvements over the old NUClearNetwork
nuclearnetcan now be built and linked independently of the NUClear reactor framework, enabling reuse in other projects.Architecture
The new NUClearNet is decomposed into focused components:
NUClearNetties these together to provide the public API (join, leave, send, process).Connection establishment
Peers must satisfy two independent conditions before a connection is considered "up":
announce_heard) — received the peer's announce on the multicast/broadcast channel, proving their data port can reach our announce address.handshake == CONFIRMED) — a 3-way SYN/SYN+ACK/ACK handshake over the data ports proves bidirectional unicast connectivity.The packet type encodes which path was used — ANNOUNCE packets always go to the multicast group, CONNECT packets always go to the peer's data port. This confirms all four communication paths without needing socket tracking.
When a new peer's announce is heard:
Multicast broadcast delivery
Unreliable broadcast sends (empty target, non-reliable) are sent once to the multicast/broadcast group rather than unicasting to each peer individually. Receivers filter by local subscription before fragmentation reassembly.
Reliable sends and targeted sends remain unicast for per-peer ACK tracking.
What Changed
src/extension/network/NUClearNetwork.{cpp,hpp}andsrc/extension/network/wire_protocol.hpp(old monolithic implementation)src/nuclearnet/— all new source and headers with its ownCMakeLists.txtsrc/util/network/sock_t.hpp— cross-platform socket type aliassrc/extension/NetworkController.{cpp,hpp}— updated to use the newnuclearnetlibrary APItests/tests/nuclearnet/— Catch2 BDD-style unit tests for each component (Discovery, Fragmentation, Integration, PacketDeduplicator, RTTEstimator, Reliability, Routing, wire_protocol)Build
nuclearnetlibrary links against the platform socket library and can be consumed standalone via CMake