feat(onion): control-plane onion routing (decision 20) #8

Merged: bodaay merged 3 commits into main from feat/control-plane-onion on Jun 2, 2026
@bodaay (Contributor) commented Jun 2, 2026

The onion mechanism (DESIGN decision 20) — relay-operator-resistant control-plane relaying.

  • Crypto core (onion pkg): single-pass setup — Build nests a layer per hop, each sealed to the relay's X25519 onion key via a fresh ephemeral ECDH; Peel opens one layer, revealing only the next hop + a session key. Data phase — Layer/Stack apply a per-hop ChaCha20 keystream per direction (length-preserving, no framing; integrity is end-to-end via the carried gRPC-mTLS/SSH).
  • Transport: Serve (relay: read setup, peel, forward, pump) + Open (coxswain: build, send setup, layered conn). Hop links are plain TCP — the onion supplies confidentiality; a setup not sealed to a relay is dropped.
  • CLI: beacon onion runs a hop; beacon onion-key mints/prints the relay's onion public key for enrolment.

Tested: setup path unwinding + wrong-key rejection, data-phase round-trips + chunked alignment, a full 3-hop circuit through an echo backend, and foreign-setup rejection. Proven live (NYC→FRA→SFO): each relay's onion port saw only its predecessor; gRPC routed through the circuit.

Consumed by PharosVPN/coxswain#33.

🤖 Generated with Claude Code

bodaay and others added 3 commits June 2, 2026 18:49
Decision 20's cryptographic core, pure + test-only-deps-free at the edges:

- Setup: Build nests a layer per hop, each sealed to the relay's X25519 onion
  key via a fresh ephemeral ECDH (K = HKDF(ECDH); ChaCha20-Poly1305, single-use
  zero nonce). Peel opens one layer with a relay's onion key, revealing only its
  next hop + its session key — never coxswain or the rest of the path.
- Data phase: Layer/Stack apply a per-hop ChaCha20 keystream per direction.
  XOR is length-preserving, so hops peel/add layers at aligned stream positions
  with no framing; integrity is end-to-end via the carried gRPC-mTLS/SSH.
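The setup bullet above can be sketched with the Go standard library alone. This is a minimal illustration under stated assumptions, not the onion package's code: `sealLayer`, `peelLayer`, `layerAEAD`, and the HKDF info label are hypothetical names, the HKDF salt is assumed to be all zeros, and AES-256-GCM stands in for ChaCha20-Poly1305 because the standard library ships no ChaCha20. The shape is the same: a fresh ephemeral X25519 key per layer, a key derived from the ECDH secret, and a zero nonce that is safe only because each key seals exactly one message.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/ecdh"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"errors"
	"fmt"
)

// layerAEAD derives a single-use AEAD from an X25519 exchange.
// K = HKDF-SHA256(ECDH) with a zero salt (assumed), expanded to one 32-byte block.
func layerAEAD(priv *ecdh.PrivateKey, pub *ecdh.PublicKey) (cipher.AEAD, error) {
	shared, err := priv.ECDH(pub)
	if err != nil {
		return nil, err
	}
	extract := hmac.New(sha256.New, make([]byte, sha256.Size))
	extract.Write(shared)
	expand := hmac.New(sha256.New, extract.Sum(nil))
	expand.Write([]byte("onion-setup-sketch")) // hypothetical info label
	expand.Write([]byte{1})
	block, err := aes.NewCipher(expand.Sum(nil)) // stand-in for ChaCha20-Poly1305
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealLayer encrypts payload to relayPub and returns ephemeralPub || ciphertext.
func sealLayer(relayPub *ecdh.PublicKey, payload []byte) ([]byte, error) {
	eph, err := ecdh.X25519().GenerateKey(rand.Reader)
	if err != nil {
		return nil, err
	}
	aead, err := layerAEAD(eph, relayPub)
	if err != nil {
		return nil, err
	}
	// A zero nonce is acceptable only because this key is never reused.
	nonce := make([]byte, aead.NonceSize())
	return aead.Seal(eph.PublicKey().Bytes(), nonce, payload, nil), nil
}

// peelLayer opens one layer with the relay's private onion key.
func peelLayer(relayKey *ecdh.PrivateKey, layer []byte) ([]byte, error) {
	if len(layer) < 32 {
		return nil, errors.New("layer shorter than an X25519 public key")
	}
	ephPub, err := ecdh.X25519().NewPublicKey(layer[:32])
	if err != nil {
		return nil, err
	}
	aead, err := layerAEAD(relayKey, ephPub)
	if err != nil {
		return nil, err
	}
	return aead.Open(nil, make([]byte, aead.NonceSize()), layer[32:], nil)
}

func must[T any](v T, err error) T {
	if err != nil {
		panic(err)
	}
	return v
}

// demo nests two layers, peels them in hop order, and tries the wrong key.
func demo() (string, bool) {
	r1 := must(ecdh.X25519().GenerateKey(rand.Reader))
	r2 := must(ecdh.X25519().GenerateKey(rand.Reader))
	inner := must(sealLayer(r2.PublicKey(), []byte("session for hop 2")))
	outer := must(sealLayer(r1.PublicKey(), inner))
	_, wrongErr := peelLayer(r2, outer) // hop 2 cannot open hop 1's layer
	peeled := must(peelLayer(r2, must(peelLayer(r1, outer))))
	return string(peeled), wrongErr != nil
}

func main() {
	payload, rejected := demo()
	fmt.Println(payload, rejected)
}
```

In this sketch the outer layer carries only the inner layer; a real setup would also carry the next-hop address and the session key inside each layer.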

Tests cover path unwinding, wrong-key rejection, forward+return round-trips, a
full Build→Peel→data circuit, and chunked keystream alignment.
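The chunked-alignment property can be illustrated with a small stand-alone sketch. It is an illustration only: `hopStream` and `xorChunked` are hypothetical helpers, and AES-256-CTR with a zero IV stands in for the per-hop ChaCha20 keystream, since the standard library has no ChaCha20. Because XOR with a keystream preserves length and the stream object keeps its position across calls, a hop can process arbitrary read sizes and stay aligned with the sender.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/sha256"
	"fmt"
)

// hopStream returns the keystream for one hop and one direction.
// Stand-in: AES-256-CTR keyed by SHA-256(direction || sessionKey), zero IV.
func hopStream(sessionKey []byte, direction string) cipher.Stream {
	key := sha256.Sum256(append([]byte(direction), sessionKey...))
	block, err := aes.NewCipher(key[:])
	if err != nil {
		panic(err) // unreachable: the key is always 32 bytes
	}
	return cipher.NewCTR(block, make([]byte, aes.BlockSize))
}

// xorChunked XORs data with the stream in pieces of at most chunk bytes,
// the way a relay would handle TCP reads of arbitrary size.
func xorChunked(s cipher.Stream, data []byte, chunk int) []byte {
	out := make([]byte, len(data))
	for off := 0; off < len(data); off += chunk {
		end := off + chunk
		if end > len(data) {
			end = len(data)
		}
		s.XORKeyStream(out[off:end], data[off:end])
	}
	return out
}

// demo adds three forward layers, then lets each hop strip its own layer
// using a different chunk size.
func demo() (string, bool) {
	msg := []byte("control-plane bytes through three hops")
	keys := [][]byte{[]byte("hop-1 key"), []byte("hop-2 key"), []byte("hop-3 key")}

	// Sender side: apply every hop's forward layer in one pass each.
	wire := msg
	for _, k := range keys {
		wire = xorChunked(hopStream(k, "fwd"), wire, len(wire))
	}
	// Relay side: each hop removes its layer in 1-, 7-, and 13-byte chunks.
	for i, k := range keys {
		wire = xorChunked(hopStream(k, "fwd"), wire, []int{1, 7, 13}[i])
	}
	return string(wire), len(wire) == len(msg)
}

func main() {
	out, sameLen := demo()
	fmt.Println(out, sameLen)
}
```

The return direction would use a second, independent keystream per hop (here, a different `direction` label), with hops adding their layer instead of removing it.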

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The relay's X25519 onion key (decision 20): onion.LoadOrCreateKey mints and
persists it 0600 on first use; PublicKeyBase64/ParsePublicKey carry it over the
SSH-enrolment channel. `beacon onion-key` prints the public key on stdout
(diagnostics on stderr) for coxswain to capture and record, mirroring gen-csr.
The private key never leaves the host.
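A load-or-create flow of this kind might look like the sketch below. The function names echo the ones in this commit, but their signatures and the on-disk format (raw 32-byte private key, standard base64 for the public key) are assumptions made for illustration, not the package's actual API.

```go
package main

import (
	"crypto/ecdh"
	"crypto/rand"
	"encoding/base64"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// loadOrCreateKey reads the key at path, or mints one and persists it 0600.
// Assumed format: the raw 32-byte X25519 private key.
func loadOrCreateKey(path string) (*ecdh.PrivateKey, error) {
	raw, err := os.ReadFile(path)
	if err == nil {
		return ecdh.X25519().NewPrivateKey(raw)
	}
	if !errors.Is(err, fs.ErrNotExist) {
		return nil, err
	}
	key, err := ecdh.X25519().GenerateKey(rand.Reader)
	if err != nil {
		return nil, err
	}
	if err := os.WriteFile(path, key.Bytes(), 0o600); err != nil {
		return nil, err
	}
	return key, nil
}

// publicKeyBase64 encodes the public half for transport over a text channel.
func publicKeyBase64(k *ecdh.PrivateKey) string {
	return base64.StdEncoding.EncodeToString(k.PublicKey().Bytes())
}

// parsePublicKey decodes what publicKeyBase64 produced.
func parsePublicKey(s string) (*ecdh.PublicKey, error) {
	raw, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return nil, err
	}
	return ecdh.X25519().NewPublicKey(raw)
}

// demo mints a key in a temp dir, reloads it, and round-trips the public key.
func demo() (stable, roundTrip bool, err error) {
	dir, err := os.MkdirTemp("", "onion-key-sketch")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "onion.key")

	first, err := loadOrCreateKey(path) // first use: mints and persists
	if err != nil {
		return false, false, err
	}
	second, err := loadOrCreateKey(path) // second use: loads the same key
	if err != nil {
		return false, false, err
	}
	parsed, err := parsePublicKey(publicKeyBase64(first))
	if err != nil {
		return false, false, err
	}
	stable = publicKeyBase64(first) == publicKeyBase64(second)
	return stable, parsed.Equal(first.PublicKey()), nil
}

func main() {
	stable, roundTrip, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(stable, roundTrip)
}
```

Only the base64 public key would cross the enrolment channel; the file at `path` stays on the relay host.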

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…onion`

The on-wire onion data path. A connection is `[u16 setup-len][setup][layered
data stream]`. Serve (relay) reads the framed setup, Peels its layer, dials the
next hop, forwards the inner setup when that hop is another relay, then pumps
the stream both ways through its layer (forward peels, return adds). Open
(coxswain) builds the circuit, sends the setup to the first hop, and returns a
net.Conn that wraps writes and unwraps reads in all layers. Hop links are plain
TCP: the onion supplies confidentiality, and a relay drops any setup not sealed
to its key.
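The `[u16 setup-len][setup][layered data stream]` framing can be sketched as follows. `writeSetup` and `readSetup` are hypothetical helpers, and big-endian byte order for the length prefix is an assumption; the commit does not state the byte order.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// writeSetup emits the 2-byte length prefix followed by the setup blob.
// Big-endian is assumed here.
func writeSetup(w io.Writer, setup []byte) error {
	if len(setup) > 0xFFFF {
		return errors.New("setup does not fit a u16 length prefix")
	}
	var hdr [2]byte
	binary.BigEndian.PutUint16(hdr[:], uint16(len(setup)))
	if _, err := w.Write(hdr[:]); err != nil {
		return err
	}
	_, err := w.Write(setup)
	return err
}

// readSetup consumes exactly one framed setup and leaves the reader
// positioned at the first byte of the layered data stream.
func readSetup(r io.Reader) ([]byte, error) {
	var hdr [2]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return nil, err
	}
	setup := make([]byte, binary.BigEndian.Uint16(hdr[:]))
	if _, err := io.ReadFull(r, setup); err != nil {
		return nil, err
	}
	return setup, nil
}

// demo frames a setup, appends stream bytes, and splits them apart again.
func demo() (string, string) {
	var conn bytes.Buffer // stands in for the TCP connection
	if err := writeSetup(&conn, []byte("sealed-setup")); err != nil {
		panic(err)
	}
	conn.WriteString("layered data")

	setup, err := readSetup(&conn)
	if err != nil {
		panic(err)
	}
	rest, err := io.ReadAll(&conn)
	if err != nil {
		panic(err)
	}
	return string(setup), string(rest)
}

func main() {
	setup, rest := demo()
	fmt.Printf("setup=%q stream=%q\n", setup, rest)
}
```

Only the setup is framed; everything after it is an unframed byte stream, which is why the data phase relies on a length-preserving cipher.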

`beacon onion` runs a hop: a plain-TCP listener calling Serve with onion.key.

Tested: a 3-hop circuit round-trips bytes through an echo backend; a foreign
setup is rejected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@bodaay merged commit 2b403df into main Jun 2, 2026
1 of 2 checks passed
@bodaay deleted the feat/control-plane-onion branch June 2, 2026 16:16
bodaay added a commit to PharosVPN/coxswain that referenced this pull request Jun 2, 2026
beacon's onion package is on main now (PharosVPN/relay#8), so point the beacon
dependency at main (it carries both the egress and onion packages).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
bodaay added a commit to PharosVPN/coxswain that referenced this pull request Jun 2, 2026
…hem (#33)

* feat(egress): enrol egress relays + route the control plane through them

Wires decision 19's mechanism into the cox CLI as a DB-enrolled relay:

- relays gain an egress_endpoint column (migration 00015) + fleet model field.
- `cox relays add --egress [--egress-port 8456]` signs the relay as before and
  additionally stages a beacon-egress.service running `beacon egress`, recording
  the egress endpoint (hostname:port — the signed hostname, so the tunnel TLS
  verifies) coxswain dials.
- The control-plane dialers self-configure from the DB: when an active remote
  relay carries an egress endpoint, newControlDialer routes gRPC through it
  (control.WithContextDialer) and dialNode/dialNew route SSH through it; no
  egress relay means direct dial (the default). coxswain presents its controller
  cert and verifies the relay against the root — the relay only needs a valid
  Fleet-CA client cert on this leg (otherwise protocol-blind).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(egress): N-relay chain so no single relay sees both ends

Ladder step 2 of decision 19. egress.NewChain composes per-hop tunnels:
each hop's mutual-TLS connection is dialed THROUGH the previous hop's tunnel
(nested), so the chain runs coxswain → relay0 → … → relayN → node. The first
relay sees coxswain's address but only learns the next hop; the last relay
reaches the node but its TCP peer is the previous relay, not coxswain — hence
no single relay sees both coxswain and the node.

newEgressTunnel now builds the chain from every active remote relay carrying an
egress endpoint, in enrollment order (hop 0 closest to coxswain). One relay is
a single hop (unchanged); two or more chain automatically.

Proven live (NYC relay1 → FRA relay2 → SFO node): node :8444 saw only relay2;
relay2's beacon-egress saw relay1, not coxswain; relay1 saw coxswain. A 2-hop
chain over mutual TLS at each hop is covered by an offline test too.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(egress): explicit, operator-controlled chain hop ordering

The egress chain order was derived from relay enrollment timestamps — fragile
(re-enrolling reshuffles it) and implicit. Add an `egress_hop` column
(migration 00016): a relay's 1-based position in the chain (hop 1 closest to
coxswain). `cox relays add --egress` auto-assigns the next hop, or `--egress-hop
N` pins one; newEgressTunnel builds the chain sorted by hop (ties by creation
order). `cox relays list` now shows each relay's egress endpoint + hop.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
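
The ordering rule in the commit above (sort by hop, ties broken by creation order) can be sketched like this. The `relay` struct and `chainOrder` are hypothetical stand-ins for the fleet model and newEgressTunnel's sorting step, with `Created` reduced to an integer for brevity.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// relay is a hypothetical, trimmed-down view of an enrolled egress relay.
type relay struct {
	Name      string
	EgressHop int   // 1-based chain position; hop 1 is closest to coxswain
	Created   int64 // creation order, e.g. a Unix timestamp
}

// chainOrder returns relay names sorted by hop, ties broken by creation order.
func chainOrder(relays []relay) string {
	chain := append([]relay(nil), relays...) // sort a copy, not the caller's slice
	sort.Slice(chain, func(i, j int) bool {
		if chain[i].EgressHop != chain[j].EgressHop {
			return chain[i].EgressHop < chain[j].EgressHop
		}
		return chain[i].Created < chain[j].Created
	})
	names := make([]string, len(chain))
	for i, r := range chain {
		names[i] = r.Name
	}
	return strings.Join(names, " -> ")
}

func demo() string {
	return chainOrder([]relay{
		{"sfo", 3, 100}, {"fra", 2, 300}, {"nyc", 1, 200}, {"ams", 2, 250},
	})
}

func main() {
	fmt.Println(demo()) // prints "nyc -> ams -> fra -> sfo"
}
```

Here `ams` and `fra` share hop 2, so the earlier-created `ams` comes first. Re-enrolling a relay changes its creation time but not its pinned hop, which is the stability the explicit column is meant to provide.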

* feat(egress): cox relays set-egress — reorder/drop a relay in the chain

Completes operator-controlled hop ordering: `cox relays set-egress <id>
--hop N` moves an enrolled egress relay to a chain position, and --disable
drops it from the chain (coxswain stops routing through it on the next dial;
the beacon-egress service keeps running until stopped). No more editing the
DB by hand to reorder.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: control-plane egress relaying operator guide (RELAYING.md)

The practical complement to DESIGN decision 19 — threat model recap, how the
protocol-blind chain works, the cox relays CLI (--egress / --egress-hop /
set-egress / list), a validation runbook (the live IP checks, with the
tcpdump-on-`any` SYN-filter gotcha noted), scope, and the test coverage.
Mirrors CASCADE.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(egress): coxswain onion routing — circuit dialer + relay onion enrolment

Decision 20 on the coxswain side, atop the egress chain. Relays gain
onion_endpoint + onion_pubkey (migration 00017); `cox relays add --egress
--onion` runs `beacon onion-key` to capture the relay's X25519 public key,
stages a beacon-onion.service, and records both. The control-plane dialer
(newEgressDialer) builds an onion circuit when every chain relay is
onion-capable — coxswain seals a layer to each relay's onion key and routes
gRPC + SSH through onion.Open, so each relay decrypts only its own layer —
otherwise it falls back to the nested-TLS chain (decision 19). `cox relays
list` shows an ONION column.

Bumps the beacon dep to the onion package (PharosVPN/beacon onion branch).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
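
The dialer's selection rule described above can be sketched as a small decision function. The struct fields mirror the column names in these commits, but `relay`, `onionCapable`, and `chooseTransport` are hypothetical names; the real newEgressDialer is not shown in this PR.

```go
package main

import "fmt"

// relay is a hypothetical view of the columns the dialer consults.
type relay struct {
	Name           string
	EgressEndpoint string
	OnionEndpoint  string
	OnionPubKey    string
}

// onionCapable reports whether a relay has both onion columns populated.
func onionCapable(r relay) bool {
	return r.OnionEndpoint != "" && r.OnionPubKey != ""
}

// chooseTransport picks the control-plane path for a given egress chain:
// no relays means a direct dial, an all-onion chain means an onion circuit,
// and anything else falls back to the nested-TLS chain (decision 19).
func chooseTransport(chain []relay) string {
	if len(chain) == 0 {
		return "direct"
	}
	for _, r := range chain {
		if !onionCapable(r) {
			return "nested-tls"
		}
	}
	return "onion"
}

func demo() [3]string {
	nyc := relay{"nyc", "nyc.example:8456", "nyc.example:8457", "base64-pubkey"}
	fra := relay{"fra", "fra.example:8456", "fra.example:8457", "base64-pubkey"}
	legacy := relay{Name: "sfo", EgressEndpoint: "sfo.example:8456"}
	return [3]string{
		chooseTransport(nil),
		chooseTransport([]relay{nyc, legacy}),
		chooseTransport([]relay{nyc, fra}),
	}
}

func main() {
	fmt.Println(demo()) // prints "[direct nested-tls onion]"
}
```

The endpoints and key strings are placeholders. The all-or-nothing check reflects the commit's wording that the circuit is built only "when every chain relay is onion-capable".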

* chore(egress): bump beacon to merged onion package

beacon's onion package is on main now (PharosVPN/relay#8), so point the beacon
dependency at main (it carries both the egress and onion packages).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>