feat(onion): control-plane onion routing (decision 20) #8
Merged
Decision 20's cryptographic core, pure + test-only-deps-free at the edges:

- Setup: Build nests a layer per hop, each sealed to the relay's X25519 onion key via a fresh ephemeral ECDH (K = HKDF(ECDH); ChaCha20-Poly1305, single-use zero nonce). Peel opens one layer with a relay's onion key, revealing only its next hop + its session key, never coxswain or the rest of the path.
- Data phase: Layer/Stack apply a per-hop ChaCha20 keystream per direction. XOR is length-preserving, so hops peel/add layers at aligned stream positions with no framing; integrity is end-to-end via the carried gRPC-mTLS/SSH.

Tests cover path unwinding, wrong-key rejection, forward+return round-trips, a full Build→Peel→data circuit, and chunked keystream alignment.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
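The data-phase property described above (length-preserving XOR layers that stay aligned regardless of how the stream is chunked) can be illustrated with a minimal sketch. This is not the package's code: AES-CTR from the standard library stands in for ChaCha20, and `newKeystream`, `hopKey`, and `roundTrip` are hypothetical names chosen for illustration. Real session keys would come from the setup phase rather than a fixed pattern.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"fmt"
)

// newKeystream returns a stream cipher for one hop and one direction.
// Assumption: AES-CTR stands in for ChaCha20 so the sketch is stdlib-only.
func newKeystream(key [32]byte) cipher.Stream {
	block, err := aes.NewCipher(key[:])
	if err != nil {
		panic(err) // unreachable: 32 bytes is a valid AES-256 key length
	}
	iv := make([]byte, aes.BlockSize) // zero IV, tolerable only for single-use keys
	return cipher.NewCTR(block, iv)
}

// hopKey derives a distinct demo key per hop (illustrative only).
func hopKey(hop byte) [32]byte {
	var k [32]byte
	for i := range k {
		k[i] = hop*31 + byte(i)
	}
	return k
}

// roundTrip adds one layer per hop on the sender side, chunk by chunk,
// then lets each relay peel its own layer over the whole stream at once.
func roundTrip(msg []byte, hops, chunk int) (wire, out []byte) {
	if chunk < 1 {
		chunk = 1
	}
	sender := make([]cipher.Stream, hops)
	relays := make([]cipher.Stream, hops)
	for i := range sender {
		sender[i] = newKeystream(hopKey(byte(i)))
		relays[i] = newKeystream(hopKey(byte(i)))
	}
	wire = make([]byte, len(msg))
	copy(wire, msg)
	for off := 0; off < len(wire); off += chunk {
		end := off + chunk
		if end > len(wire) {
			end = len(wire)
		}
		for _, s := range sender {
			s.XORKeyStream(wire[off:end], wire[off:end])
		}
	}
	out = append([]byte(nil), wire...)
	for _, r := range relays {
		r.XORKeyStream(out, out) // different chunking, same stream position
	}
	return wire, out
}

func main() {
	msg := []byte("control-plane bytes")
	wire, out := roundTrip(msg, 3, 4)
	fmt.Printf("plaintext %d bytes, wire %d bytes, recovered %q\n", len(msg), len(wire), out)
}
```

Because each keystream only advances by the number of bytes processed, the sender can write in 4-byte chunks while each relay processes the stream in one call, and the layers still cancel.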
The relay's X25519 onion key (decision 20): onion.LoadOrCreateKey mints and persists it 0600 on first use; PublicKeyBase64/ParsePublicKey carry it over the SSH-enrolment channel. `beacon onion-key` prints the public key on stdout (diagnostics on stderr) for coxswain to capture and record, mirroring gen-csr. The private key never leaves the host. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
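A rough sketch of the mint-and-persist behaviour described in this commit, using the standard library's `crypto/ecdh`. The function names mirror the ones in the commit message but are hypothetical reimplementations; the on-disk format (raw 32-byte private key) and the use of standard padded base64 are assumptions, not confirmed details of the onion package.

```go
package main

import (
	"crypto/ecdh"
	"crypto/rand"
	"encoding/base64"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// loadOrCreateKey loads an X25519 private key from path, or mints one and
// persists it with mode 0600 on first use. Assumption: raw key bytes on disk.
func loadOrCreateKey(path string) (*ecdh.PrivateKey, error) {
	raw, err := os.ReadFile(path)
	if err == nil {
		return ecdh.X25519().NewPrivateKey(raw)
	}
	if !errors.Is(err, fs.ErrNotExist) {
		return nil, err
	}
	key, err := ecdh.X25519().GenerateKey(rand.Reader)
	if err != nil {
		return nil, err
	}
	if err := os.WriteFile(path, key.Bytes(), 0o600); err != nil {
		return nil, err
	}
	return key, nil
}

// publicKeyBase64 encodes only the public half; the private key stays on the host.
func publicKeyBase64(key *ecdh.PrivateKey) string {
	return base64.StdEncoding.EncodeToString(key.PublicKey().Bytes())
}

// parsePublicKey is the receiving side: decode and validate the captured key.
func parsePublicKey(s string) (*ecdh.PublicKey, error) {
	raw, err := base64.StdEncoding.DecodeString(strings.TrimSpace(s))
	if err != nil {
		return nil, err
	}
	return ecdh.X25519().NewPublicKey(raw)
}

// demoRoundTrip calls loadOrCreateKey twice on the same path in a temp dir
// and returns the encoded public key seen on each call.
func demoRoundTrip() (first, second string, err error) {
	dir, err := os.MkdirTemp("", "onion-key-demo")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "onion.key")
	k1, err := loadOrCreateKey(path)
	if err != nil {
		return "", "", err
	}
	k2, err := loadOrCreateKey(path)
	if err != nil {
		return "", "", err
	}
	return publicKeyBase64(k1), publicKeyBase64(k2), nil
}

func main() {
	first, second, err := demoRoundTrip()
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err) // diagnostics on stderr
		os.Exit(1)
	}
	fmt.Println(first) // public key on stdout, for the enrolling side to capture
	fmt.Fprintln(os.Stderr, "stable across loads:", first == second)
}
```

Keeping the key material on stdout and everything else on stderr is what lets an enrolling tool capture the output of a command like `beacon onion-key` without parsing.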
…onion`

The on-wire onion data path. A connection is `[u16 setup-len][setup][layered data stream]`.

Serve (relay) reads the framed setup, Peels its layer, dials the next hop, forwards the inner setup to a middle hop, then pumps the stream both ways through its layer (forward peels, return adds). Open (coxswain) builds the circuit, sends the setup to the first hop, and returns a net.Conn that wraps writes / unwraps reads in all layers. Hop links are plain TCP: the onion supplies confidentiality, and a setup not sealed to a relay's key is dropped.

`beacon onion` runs a hop: a plain-TCP listener calling Serve with onion.key.

Tested: a 3-hop circuit round-trips bytes through an echo backend; a foreign setup is rejected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
bodaay added a commit to PharosVPN/coxswain that referenced this pull request on Jun 2, 2026
beacon's onion package is on main now (PharosVPN/relay#8), so point the beacon dependency at main (it carries both the egress and onion packages). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
bodaay added a commit to PharosVPN/coxswain that referenced this pull request on Jun 2, 2026
…hem (#33)

* feat(egress): enrol egress relays + route the control plane through them

  Wires decision 19's mechanism into the cox CLI as a DB-enrolled relay:

  - relays gain an egress_endpoint column (migration 00015) + fleet model field.
  - `cox relays add --egress [--egress-port 8456]` signs the relay as before and additionally stages a beacon-egress.service running `beacon egress`, recording the egress endpoint coxswain dials (hostname:port, using the signed hostname so the tunnel TLS verifies).
  - The control-plane dialers self-configure from the DB: when an active remote relay carries an egress endpoint, newControlDialer routes gRPC through it (control.WithContextDialer) and dialNode/dialNew route SSH through it; no egress relay means direct dial (the default). coxswain presents its controller cert and verifies the relay against the root; the relay only needs a valid Fleet-CA client cert on this leg (otherwise protocol-blind).

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(egress): N-relay chain so no single relay sees both ends

  Ladder step 2 of decision 19. egress.NewChain composes per-hop tunnels: each hop's mutual-TLS connection is dialed THROUGH the previous hop's tunnel (nested), so the chain runs coxswain → relay0 → … → relayN → node. The first relay sees coxswain's address but only learns the next hop; the last relay reaches the node but its TCP peer is the previous relay, not coxswain. Hence no single relay sees both coxswain and the node.

  newEgressTunnel now builds the chain from every active remote relay carrying an egress endpoint, in enrollment order (hop 0 closest to coxswain). One relay is a single hop (unchanged); two or more chain automatically.

  Proven live (NYC relay1 → FRA relay2 → SFO node): node :8444 saw only relay2; relay2's beacon-egress saw relay1, not coxswain; relay1 saw coxswain. A 2-hop chain over mutual TLS at each hop is covered by an offline test too.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(egress): explicit, operator-controlled chain hop ordering

  The egress chain order was derived from relay enrollment timestamps, which is fragile (re-enrolling reshuffles it) and implicit. Add an `egress_hop` column (migration 00016): a relay's 1-based position in the chain (hop 1 closest to coxswain). `cox relays add --egress` auto-assigns the next hop, or `--egress-hop N` pins one; newEgressTunnel builds the chain sorted by hop (ties by creation order). `cox relays list` now shows each relay's egress endpoint + hop.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(egress): cox relays set-egress - reorder/drop a relay in the chain

  Completes operator-controlled hop ordering: `cox relays set-egress <id> --hop N` moves an enrolled egress relay to a chain position, and --disable drops it from the chain (coxswain stops routing through it on the next dial; the beacon-egress service keeps running until stopped). No more editing the DB by hand to reorder.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: control-plane egress relaying operator guide (RELAYING.md)

  The practical complement to DESIGN decision 19: threat model recap, how the protocol-blind chain works, the cox relays CLI (--egress / --egress-hop / set-egress / list), a validation runbook (the live IP checks, with the tcpdump-on-`any` SYN-filter gotcha noted), scope, and the test coverage. Mirrors CASCADE.md.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(egress): coxswain onion routing - circuit dialer + relay onion enrolment

  Decision 20 on the coxswain side, atop the egress chain. Relays gain onion_endpoint + onion_pubkey (migration 00017); `cox relays add --egress --onion` runs `beacon onion-key` to capture the relay's X25519 public key, stages a beacon-onion.service, and records both.

  The control-plane dialer (newEgressDialer) builds an onion circuit when every chain relay is onion-capable: coxswain seals a layer to each relay's onion key and routes gRPC + SSH through onion.Open, so each relay decrypts only its own layer. Otherwise it falls back to the nested-TLS chain (decision 19). `cox relays list` shows an ONION column. Bumps the beacon dep to the onion package (PharosVPN/beacon onion branch).

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore(egress): bump beacon to merged onion package

  beacon's onion package is on main now (PharosVPN/relay#8), so point the beacon dependency at main (it carries both the egress and onion packages).

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
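The chain-building rules spread across these commits (sort by hop, break ties by creation order, use the onion circuit only when every chain relay is onion-capable, otherwise fall back to nested TLS) can be summarised in a small sketch. The `relay` struct, its field names, and the helper functions are hypothetical and do not reflect coxswain's actual fleet model; the empty-chain case is assumed to mean direct dial.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// relay is a hypothetical, trimmed-down view of an enrolled egress relay.
type relay struct {
	Name        string
	Hop         int    // 1-based chain position, hop 1 closest to coxswain
	Created     int    // creation order, lower is earlier
	OnionPubKey string // empty when the relay is not onion-capable
}

// orderChain returns a copy sorted by hop, with ties broken by creation order.
func orderChain(relays []relay) []relay {
	chain := append([]relay(nil), relays...)
	sort.Slice(chain, func(i, j int) bool {
		if chain[i].Hop != chain[j].Hop {
			return chain[i].Hop < chain[j].Hop
		}
		return chain[i].Created < chain[j].Created
	})
	return chain
}

// useOnion reports whether every relay in a non-empty chain is onion-capable.
// When it returns false the caller would fall back to the nested-TLS chain.
func useOnion(chain []relay) bool {
	if len(chain) == 0 {
		return false // assumption: no egress relays means direct dial
	}
	for _, r := range chain {
		if r.OnionPubKey == "" {
			return false
		}
	}
	return true
}

// names joins relay names in chain order, for display.
func names(chain []relay) string {
	out := make([]string, len(chain))
	for i, r := range chain {
		out[i] = r.Name
	}
	return strings.Join(out, ",")
}

// demoFleet is made-up data: two relays share hop 2 to exercise the tie-break.
func demoFleet() []relay {
	return []relay{
		{Name: "sfo", Hop: 3, Created: 1, OnionPubKey: "k-sfo"},
		{Name: "ams", Hop: 2, Created: 4, OnionPubKey: "k-ams"},
		{Name: "nyc", Hop: 1, Created: 2, OnionPubKey: "k-nyc"},
		{Name: "fra", Hop: 2, Created: 3, OnionPubKey: "k-fra"},
	}
}

func main() {
	chain := orderChain(demoFleet())
	fmt.Println("chain:", names(chain))
	fmt.Println("onion circuit:", useOnion(chain))
}
```

Sorting on an explicit hop column rather than on timestamps is what makes the order survive re-enrolment, which is the fragility the hop-ordering commit calls out.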
The onion mechanism (DESIGN decision 20) — relay-operator-resistant control-plane relaying.
`onion` pkg: single-pass setup: `Build` nests a layer per hop, each sealed to the relay's X25519 onion key via a fresh ephemeral ECDH; `Peel` opens one layer, revealing only the next hop + a session key. Data phase: `Layer`/`Stack` apply a per-hop ChaCha20 keystream per direction (length-preserving, no framing; integrity is end-to-end via the carried gRPC-mTLS/SSH). `Serve` (relay: read setup, peel, forward, pump) + `Open` (coxswain: build, send setup, layered conn). Hop links are plain TCP: the onion supplies confidentiality; a setup not sealed to a relay is dropped. `beacon onion` runs a hop; `beacon onion-key` mints/prints the relay's onion public key for enrolment.

Tested: setup path unwinding + wrong-key rejection, data-phase round-trips + chunked alignment, a full 3-hop circuit through an echo backend, and foreign-setup rejection. Proven live (NYC→FRA→SFO): each relay's onion port saw only its predecessor; gRPC routed through the circuit.
Consumed by PharosVPN/coxswain#33.
🤖 Generated with Claude Code