feat(network): programmable egress middleware with TLS MITM, streaming, and hardened agent auth#23
Open
ch99q wants to merge 17 commits into
…g, and hardened agent auth

Adds a network middleware layer so users can intercept, mutate, and stub sandbox HTTP traffic from the SDK. Credentials injected via rules live on the daemon; they never enter the sandbox process memory.

Wire protocol additions:
- OpNetEgress / OpNetEgressStream: agent forwards each sandbox HTTP request to the daemon over the existing agent connection via a new bidirectional RPC client (agent/phonehome)
- OpNetRulesSet: SDK installs declarative rules via session hook
- OpNetCertLeaf / OpNetCAInstall: per-sandbox ECDSA P-256 CA; agent fetches short-lived (24h TTL, LRU-cached) leaf certs for HTTPS MITM
- OpNetDefer: daemon -> SDK round-trip for programmatic handlers
- OpAuthBootstrap: single-use nonce exchange for long-lived auth token
- FrameStreamData / FrameStreamEnd: multiplexed body streaming so SSE and chunked responses flow back to the sandbox token-by-token

Security model:
- Transport-layer peer auth: vsock listener accepts only VMADDR_CID_HOST, TCP listener rejects loopback (blocks in-sandbox processes from reaching the agent socket)
- Long-lived auth token never appears on kernel cmdline or container env; only a single-use bootstrap nonce is delivered to the guest, consumed on the first daemon connection
- TLS MITM uses per-sandbox CA; CA private key stays on daemon, only short-lived per-host leaf keys ever reach agent memory
- SSRF-hardened egress dialer: resolves hostname itself, refuses any IP in private/loopback/link-local/CGNAT (incl. 169.254.169.254 IMDS), dials by IP literal to defeat DNS rebinding; default-deny private, per-rule opt-in via context value
- HTTPS intercepted transparently via mounted CA bundle (NODE_EXTRA_CA_CERTS, REQUESTS_CA_BUNDLE, CURL_CA_BUNDLE, etc.)

SDKs:
- TypeScript: sbx.network.intercept/allow/deny/inject, defer handlers with full request/response programmability
- Go: sbx.Net().Intercept/Allow/Deny/Inject mirrors the TS API
- Both honor an opt-out (network.enabled=false / NetworkConfig.Disabled) so sandboxes that don't need the proxy skip the env injection entirely

Other changes:
- agent/net.Fetch removed; SDK net.fetch always routes through the daemon's egress handler (one private-IP guard, not two)
- Per-sandbox state cleanup via Sandbox.OnDestroy callback list, so new subsystems can register their teardown without touching destroy paths
- Frame Request/Response probe extracted to protocol package, used by both agent and daemon dispatchers
- Container env builder is now a typed value (containerEnv) shared by Docker and Firecracker runtimes

Tests added covering: suffix-trie rule matcher, CA gen + leaf signing, nonce-bootstrap auth (incl. replay/race), peer-auth listener filters, frame probe, SSRF dialer (DNS resolution + private IP rejection), streaming chunked upstream, token-leak anti-regression, defer wire format, defer handler dispatch by rule ID.

Docs: README, SECURITY.md, and sdk/typescript/README updated with the network-middleware section, threat model, and agent-auth section.
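The address classes the SSRF-hardened dialer refuses can be illustrated with a small predicate over an already-resolved IP. This is a minimal sketch, not the PR's code: `isBlockedAddr` and `isBlockedLiteral` are hypothetical names, and the real dialer additionally resolves the hostname itself and honors a per-rule opt-in.

```go
package main

import (
	"fmt"
	"net/netip"
)

// cgnat is the RFC 6598 shared address space; netip's IsPrivate does not cover it.
var cgnat = netip.MustParsePrefix("100.64.0.0/10")

// isBlockedAddr reports whether a resolved IP should be refused by default.
// Hypothetical helper for illustration only.
func isBlockedAddr(addr netip.Addr) bool {
	addr = addr.Unmap() // treat ::ffff:10.0.0.1 like 10.0.0.1
	return addr.IsLoopback() ||
		addr.IsPrivate() ||
		addr.IsLinkLocalUnicast() || // covers 169.254.169.254 (IMDS)
		addr.IsLinkLocalMulticast() ||
		addr.IsUnspecified() ||
		cgnat.Contains(addr)
}

// isBlockedLiteral parses an IP literal and fails closed on anything unparsable.
func isBlockedLiteral(ip string) bool {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return true
	}
	return isBlockedAddr(addr)
}

func main() {
	for _, ip := range []string{"169.254.169.254", "100.64.0.1", "10.1.2.3", "8.8.8.8"} {
		fmt.Printf("%-16s blocked=%v\n", ip, isBlockedLiteral(ip))
	}
}
```

Checking the resolved address and then dialing that same IP literal is what closes the DNS-rebinding window: the name is never resolved a second time between the check and the connect.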
…eStreaming Both entry points duplicated ~120 lines of URL parsing, rule lookup, deny/defer dispatch, header processing, inject mutations, private-host fast-path, and http.Request construction. Extract that into prepareUpstream() returning either a ready-to-dial *http.Request or a synthetic short-circuit response. HandleStreaming converts a buffered synthetic response into the streaming shape (head + one-chunk copier) via the new syntheticAsStream helper. Net: 188 lines removed, 112 added. Behavior unchanged — all existing tests (incl. defer mutation, defer short-circuit, deny, inject, streaming chunked upstream, SSRF block on resolved hostname) still pass.
…tream teardown)

#1 Warm-pool containers started without any auth credential. internal/pool/pool.go now generates the bootstrap nonce in addition to the long-lived token and passes both to rt.Create, so the env builder emits SANDBOX_AUTH_BOOTSTRAP and the readiness loop's bootstrap call succeeds.

#2 Warm-pool sandboxes bypassed egress rules. The pool now creates warm containers with EgressProxy: true so HTTP_PROXY env vars are present from the start (can't be added to a running container). daemon.CreateSandbox skips the pool when the sandbox opted into NetworkMode=off, forcing a cold start in that case rather than handing back a misconfigured warm container.

#3 Bootstrap retries couldn't recover after the nonce was consumed. Docker and Firecracker readiness loops now track a `bootstrapped` bool across iterations; after the first successful Bootstrap, any retry (e.g. transient CA-install or ping failure) switches to auth.token against the now-installed long-lived token rather than re-attempting bootstrap with a spent nonce.

#4 streams.closeAll could leave HTTP proxy requests hanging. The old non-blocking send for the terminal frame silently dropped it when the channel buffer was full and never closed the channel, so the consumer's `for chunk := range ch` loop hung forever. closeAll now block-sends the terminal (with a 100ms deadline so buffered chunks have a chance to drain) and then closes the channel, guaranteeing the consumer always either sees the error marker or exits cleanly on EOF.

Regression test added for #4 (TestCloseAllUnblocksConsumerEvenWithFullBuffer).
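The teardown in #4 can be sketched as a bounded blocking send followed by an unconditional close. The names below (`chunk`, `errConnLost`, `terminalGrace`, `drain`) are assumptions for illustration, and the sketch assumes a single producer per channel, since closing a channel that another goroutine still sends on would panic.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// chunk is a hypothetical stream element; a non-nil err marks the terminal frame.
type chunk struct {
	data []byte
	err  error
}

var errConnLost = errors.New("daemon connection lost")

// terminalGrace mirrors the 100ms deadline described in the commit message.
const terminalGrace = 100 * time.Millisecond

// closeWithTerminal tries to deliver a terminal frame for up to terminalGrace,
// then closes the channel regardless so a range loop always exits.
func closeWithTerminal(ch chan chunk) bool {
	timer := time.NewTimer(terminalGrace)
	defer timer.Stop()
	delivered := false
	select {
	case ch <- chunk{err: errConnLost}:
		delivered = true
	case <-timer.C: // buffer stayed full; fall through to close
	}
	close(ch)
	return delivered
}

// drain mimics the consumer: count data chunks, stop at a terminal frame or on close.
func drain(ch <-chan chunk) (int, error) {
	n := 0
	for c := range ch {
		if c.err != nil {
			return n, c.err
		}
		n++
	}
	return n, nil
}

func main() {
	full := make(chan chunk, 1)
	full <- chunk{data: []byte("partial")}
	delivered := closeWithTerminal(full) // nobody is reading, so the send times out
	n, err := drain(full)
	fmt.Println(delivered, n, err)
}
```

When the send times out, the consumer still terminates, but it sees a plain close rather than an error marker, which is why the consumer side has to distinguish the two cases.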
…n tests that need daemon-reachable upstream

Two fixes:

1) Silent truncation when closeWithTerminal times out. Reviewer finding: closeWithTerminal's 100ms blocking-send has a fallback that closes the channel WITHOUT delivering a terminal frame. streamBodyTo treated channel-close as a clean EOF and returned nil, so the HTTP proxy handler wrote the chunked terminator and the sandbox client saw the truncated body as a successful response. streamBodyTo now returns errStreamClosedWithoutTerminal in that case. handlePlain panics with http.ErrAbortHandler so net/http closes the TCP conn without writing a clean terminator; the sandbox client sees an abnormal EOF. writeStreamTLSResponse already short-circuits on error (skips cw.Close()), so the TLS path was correct once streamBodyTo started signaling truncation.

Tests: TestStreamBodyToCleanEOF, TestStreamBodyToCarriesError, TestStreamBodyToTruncationOnCloseWithoutTerminal, TestStreamBodyToWriteFailureStops.

2) CI integration test reachability. TestNetworkInjectHeader and the TS inject/defer tests use httptest.NewServer (which binds host 127.0.0.1) as a controlled upstream and have the sandbox dial back to it through the daemon. In containerized CI the daemon runs in Docker, so its 127.0.0.1 is its own loopback, not the test process, and the dial fails with 502. These tests are now gated behind SANDBOXD_INTEGRATION_NETWORK=1 with a clear skip message. Set the env var when running locally with a daemon that shares a network namespace with the test process. CI cleanly skips rather than failing. The deny test in both suites isn't gated: it short-circuits before any dial and works fine in CI.

README: minor accuracy fix listing all the trust-bundle env vars the daemon actually exports (SSL_CERT_FILE was missing).
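The truncation fix in 1) comes down to treating "channel closed" and "terminal frame received" as different outcomes. A minimal sketch, assuming a hypothetical `frame` shape and a `replay` helper that exist only for this example:

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

var errStreamClosedWithoutTerminal = errors.New("stream closed without terminal frame")

// frame is a hypothetical stream element: data, a clean end marker, or an error.
type frame struct {
	data []byte
	end  bool
	err  error
}

// streamBodyTo copies frames to w. Only an explicit end frame counts as a
// clean EOF; a bare channel close is reported as truncation.
func streamBodyTo(w io.Writer, ch <-chan frame) error {
	for f := range ch {
		switch {
		case f.err != nil:
			return f.err
		case f.end:
			return nil
		}
		if _, err := w.Write(f.data); err != nil {
			return err
		}
	}
	return errStreamClosedWithoutTerminal
}

// replay feeds frames through a closed channel and returns what was written.
func replay(frames ...frame) (string, error) {
	ch := make(chan frame, len(frames))
	for _, f := range frames {
		ch <- f
	}
	close(ch)
	var sb strings.Builder
	err := streamBodyTo(&sb, ch)
	return sb.String(), err
}

func main() {
	body, err := replay(frame{data: []byte("hel")}, frame{data: []byte("lo")}, frame{end: true})
	fmt.Printf("%q %v\n", body, err)
	body, err = replay(frame{data: []byte("hel")}) // no terminal frame
	fmt.Printf("%q %v\n", body, err)
}
```

A caller that gets the truncation error should abort the response instead of writing a chunked terminator, so the client does not mistake a partial body for a complete one.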
…nject collision, opt-out, redirect, streaming timeout, hung deliver)

#1 Network opt-out still bound the agent egress proxy. startEgressProxy now returns without binding when neither SANDBOX_EGRESS_ADDR nor SANDBOX_EGRESS_PORT is set. network_mode=off is a true opt-out: the agent no longer claims 127.0.0.1:8118.

#2 Streaming responses were killed by the 60s http.Client.Timeout. Added a second http.Client (streamClient) that shares the transport pool but has no overall Timeout. HandleStreaming uses it; SSE and long downloads now cancel only via request context.

#3 Deferred handler headers failed JSON round-trip. SDK exposes headers as map[string]string; wire format is map[string][]string. Daemon's json.Unmarshal of the SDK's reply into protocol.DeferResponse failed on any header → 502. Both SDKs now collapse incoming wire headers into single-value for the user's handler and expand back to multi-value before returning. Updated TestDispatchDeferRoutesByRuleID for the new shape.

#4 Inject could be overridden by sandbox case collisions. SetHeaders wrote the inject key without removing existing case-variants. After http.CanonicalHeaderKey collapsed both at write time, randomized map iteration order picked the winner: sandbox-supplied "authorization" beat daemon-injected "Authorization" about half the time. inject loop now canonicalizes and removes any colliding case-variant before setting. Same fix applied to RemoveHeaders. Regression test runs 200 iterations to defeat the random order: daemon value MUST win every time.

#5 Concurrent writes on agent connRW could corrupt the protocol. phonehome client, dispatcher, and auth handlers all wrote to the same connRW concurrently with no serialization. With non-atomic TCP writes under pressure, frame bytes could interleave. Added a write mutex on connRW so every codec.WriteFrame reaches the wire as one contiguous byte sequence.

#6 deliver could hang the agent read loop after release. deliver did a blocking send AFTER releasing the registry lock; if the HTTP client disconnected and release fired while the send was blocked on a full buffer, the goroutine hung forever and stalled the daemon connection. Added a per-stream done channel; deliver selects on it so release immediately unblocks any pending send. doneOnce guards close(done) since both release and closeAll race to fire it.

#7 net.fetch no longer followed redirects. serveSDKFetch routed through the egress handler whose CheckRedirect returns http.ErrUseLastResponse (egress proxy needs raw 3xx). Added a per-context allow-follow flag (netegress.WithFollowRedirects) that serveSDKFetch sets; CheckRedirect honors it (up to 10 hops). Egress proxy path still sees raw redirects.

Tests added/updated: TestInjectWinsAgainstSandboxCaseCollision (200 iterations against random map order), TestDispatchDeferRoutesByRuleID (new wire shape), TestCloseAllUnblocksConsumerEvenWithFullBuffer (updated for done channel).
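The case-collision fix in #4 can be sketched as "delete every case-variant, then set the canonical key". `setInjected` is a hypothetical name, and the header map is modeled as a plain `map[string][]string`, matching the wire shape mentioned in #3.

```go
package main

import (
	"fmt"
	"net/http"
)

// setInjected writes a daemon-injected header so that it wins over any
// case-variant of the same name supplied by the sandbox.
func setInjected(h map[string][]string, key, value string) {
	canon := http.CanonicalHeaderKey(key)
	for k := range h {
		if http.CanonicalHeaderKey(k) == canon {
			delete(h, k) // deleting while ranging over a map is safe in Go
		}
	}
	h[canon] = []string{value}
}

func main() {
	h := map[string][]string{
		"authorization": {"Bearer sandbox-supplied"},
		"AUTHORIZATION": {"Bearer also-sandbox"},
		"Accept":        {"*/*"},
	}
	setInjected(h, "Authorization", "Bearer daemon-secret")
	fmt.Println(len(h), h["Authorization"])
}
```

Because no colliding key survives the loop, the outcome no longer depends on map iteration order, which is the property the 200-iteration regression test checks.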
…ress enforcement

#1 Encrypted sandboxes silently dropped network rules. net.rules.set and net.fetch are claimed on the daemon side, but E2E-encrypted SDK transports wrap params as {"_encrypted": "..."}. The daemon doesn't have the session key (that's the point of E2E), so unmarshalling into the expected param shape silently produced zero-valued fields: net.rules.set cleared every rule for the sandbox, net.fetch dialed an empty URL. serveRulesSet and serveSDKFetch now call rejectEncryptedParams which probes for the _encrypted wrapper and returns a clear error naming the method. The SDKs additionally refuse the combination at createSandbox time (network + encrypted=true) so the user gets a fast-fail rather than a 502 round-trip.

Regression test: TestServeRulesSetRefusesEncrypted asserts the error path; TestRejectEncryptedParams covers the helper directly.

#2 HTTP_PROXY enforcement is cooperative; strengthen the docs. Rules apply to clients that honor HTTP_PROXY / HTTPS_PROXY, which covers most real-world libraries (Node fetch, Python requests, Go net/http, curl). Code that opens raw sockets or ignores proxy env bypasses the middleware entirely; the sandbox is on a normal Docker bridge with outbound internet, so direct dials succeed. SECURITY.md, README.md, and sdk/typescript/README.md now state this explicitly with the recommended mitigation (egress-restricted netns / iptables REDIRECT). No code change; network-level enforcement is tracked as a follow-up.
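A probe like the one described in #1 might look as follows. The function name comes from the commit message, but the body, the `[]byte` parameter type, and the error text are assumptions made for this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rejectEncryptedParams returns an error naming the method when params carry
// the E2E envelope, instead of letting them decode into zero-valued fields.
func rejectEncryptedParams(method string, params []byte) error {
	var probe map[string]json.RawMessage
	if err := json.Unmarshal(params, &probe); err != nil {
		return nil // not a JSON object; leave that error to the real decoder
	}
	if _, ok := probe["_encrypted"]; ok {
		return fmt.Errorf("%s: params are end-to-end encrypted and cannot be handled by the daemon", method)
	}
	return nil
}

func main() {
	fmt.Println(rejectEncryptedParams("net.rules.set", []byte(`{"_encrypted":"b64..."}`)))
	fmt.Println(rejectEncryptedParams("net.rules.set", []byte(`{"rules":[]}`)))
}
```

Probing for the wrapper key before decoding matters because Go's `json.Unmarshal` ignores unknown fields by default, so an envelope decodes "successfully" into an empty struct.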
#1 Encrypted net.fetch always returned 502. MethodSet now receives (method, params); clientLocalMethodsFor declines to claim net.fetch when params carry the _encrypted envelope, so the frame forwards to the agent's E2E-decrypted handler. Restored agent/net/net.go (uses protocol.IsPrivateHost so it doesn't drift from the daemon's SSRF guard). Rule application doesn't happen for encrypted sessions; the SDKs throw at create time when the user requests both.

#2 Failed network install left an orphan sandbox on the daemon. Both SDKs now wrap the post-create network install in cleanup: close the transport so the daemon notices, DELETE /sandboxes/{id} so the sandbox doesn't linger. Cleanup errors are swallowed.

Test: TestClientLocalMethodsForwardsEncryptedFetch asserts plain fetch is claimed and encrypted fetch is forwarded.
…tream teardown)

#1 CA + private key leaked on cold-start failure. EnsureCA ran BEFORE rt.Create. If rt.Create failed, the sandbox was never registered, OnDestroy never fired, and caBySbx held the CA forever. daemon.CreateSandbox now ClearRules(sbxID) on both rt.Create failure AND Registry.Add failure (the latter also destroys the runtime instance to keep things symmetric).

#2 Opt-out + rules silently still installed rules. network.enabled=false / Network.Disabled=true with egress rules produced an incoherent state: proxy/CA wiring off, but rules still pushed to the daemon and applied to net.fetch. Both SDKs now throw at createSandbox time when both are set; caller must remove one or the other.

#3 destroy tombstones grew without bound. ClearRules added each sandbox ID to destroyed{} permanently; a long-running daemon accumulated one entry per historical sandbox. Tombstones only need to outlast any in-flight CONNECT or stream request; a generous 30s grace (DefaultTombstoneTTL) is plenty, after which time.AfterFunc removes the entry. Sandbox IDs are random 8-byte hex so collision within the grace window is negligible. Tests can override the duration via SetTombstoneTTL.

Test: TestTombstoneExpiresAfterTTL exercises the new sweep with a 50ms TTL.
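The tombstone expiry in #3 can be sketched with a mutex-guarded set and `time.AfterFunc`. The type and method names here are hypothetical, and the TTL is shortened so the demo finishes quickly.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// tombstones remembers recently destroyed sandbox IDs for a bounded time.
type tombstones struct {
	mu        sync.Mutex
	ttl       time.Duration
	destroyed map[string]struct{}
}

func newTombstones(ttl time.Duration) *tombstones {
	return &tombstones{ttl: ttl, destroyed: make(map[string]struct{})}
}

// mark records id and schedules its removal once the grace window has passed.
func (t *tombstones) mark(id string) {
	t.mu.Lock()
	t.destroyed[id] = struct{}{}
	t.mu.Unlock()
	time.AfterFunc(t.ttl, func() {
		t.mu.Lock()
		delete(t.destroyed, id)
		t.mu.Unlock()
	})
}

func (t *tombstones) isDestroyed(id string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	_, ok := t.destroyed[id]
	return ok
}

// demo reports the tombstone state right after marking and again after expiry.
func demo() (before, after bool) {
	t := newTombstones(20 * time.Millisecond)
	t.mark("a1b2c3d4e5f60718")
	before = t.isDestroyed("a1b2c3d4e5f60718")
	time.Sleep(200 * time.Millisecond)
	after = t.isDestroyed("a1b2c3d4e5f60718")
	return before, after
}

func main() {
	fmt.Println(demo())
}
```

One timer per destroyed sandbox keeps the map bounded by the destroy rate times the TTL, without needing a background sweeper goroutine.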
…tream teardown)

#1 CRITICAL: bad bootstrap nonce disabled all subsequent auth. handleBootstrap consumed the nonce BEFORE constant-time validation. Any caller racing the daemon with a wrong guess permanently cleared the nonce; authToken was never set; authConfigured() returned false; subsequent connections were accepted UNAUTHENTICATED. Split into peekBootstrap (read without clearing) and consumeBootstrap (clear, called ONLY after successful validation). Real daemon retry now succeeds after a bad guess.

Test: TestAuthBootstrapBadGuessDoesNotDisableAuth races a bad bootstrap then the real one; asserts auth stays required and the correct token is installed.

#2 Reconnect race wiped the new connection's egress client. handleConn's defer unconditionally SetClient(nil) and CloseStreams. On session-replace, the new handleConn's SetClient(phoneClient) ran before the old defer; the old defer then stomped the new client, leaving the proxy with "no active daemon connection" until another reconnect. New ClearClientIf(expected) uses atomic CompareAndSwap so the defer only tears down state if it still owned the client.

#3 Cleanup DELETE used stale POST headers under per-request signing. When the post-create network install failed, cleanup sent DELETE /sandboxes/{id} with headers resolved for POST /sandboxes. With per-request signing those headers don't validate for the DELETE path; cleanup silently failed and the sandbox was orphaned. Both SDKs now re-resolve auth for the DELETE method+path before sending.

Plus: trailing blank line at EOF of internal/netegress/egress.go.
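The ordering fix in #1 (validate first, consume only on success) can be sketched as below. This simplifies the PR's peekBootstrap/consumeBootstrap split into a single critical section; `authState`, its fields, and the idea that the caller supplies the new token are assumptions for illustration.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"sync"
)

// authState holds a single-use bootstrap nonce and the long-lived token.
type authState struct {
	mu        sync.Mutex
	bootstrap []byte // nil once consumed
	token     []byte // set by a successful bootstrap
}

// handleBootstrap compares in constant time and clears the nonce only after a
// successful match, so a wrong guess cannot burn the real nonce.
func (a *authState) handleBootstrap(presented, newToken string) bool {
	a.mu.Lock()
	defer a.mu.Unlock()
	if a.bootstrap == nil {
		return false // already consumed, or never configured
	}
	if subtle.ConstantTimeCompare([]byte(presented), a.bootstrap) != 1 {
		return false // peek only: the nonce stays armed
	}
	a.bootstrap = nil // consume
	a.token = []byte(newToken)
	return true
}

// authConfigured reports whether connections must still authenticate.
func (a *authState) authConfigured() bool {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.bootstrap != nil || a.token != nil
}

func main() {
	a := &authState{bootstrap: []byte("real-nonce")}
	fmt.Println(a.handleBootstrap("wrong-guess", "attacker-token")) // rejected
	fmt.Println(a.authConfigured())                                 // auth still required
	fmt.Println(a.handleBootstrap("real-nonce", "long-lived-token"))
	fmt.Println(a.handleBootstrap("real-nonce", "replayed"))
}
```

Holding the lock across both the comparison and the clear also rules out two concurrent callers both passing validation with the same nonce.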
- sandboxca: drop existing cache entry before PushFront so concurrent
misses for the same host don't orphan an LRU node and slowly leak.
- proxy: honor caller ctx cancellation in Session.CallClient select.
- ts sdk: validate rules in intercept() before mutating local mirror so
a bad call doesn't poison subsequent allow/deny/inject/defer calls.
- ts sdk: use monotonic deferSeq for auto-generated defer ids; the
prior `defer-${current.length}` could collide with user-supplied ids.
- egressproxy: per-connection stream teardown (CloseStreamsForClient)
so a stale handleConn defer doesn't kill a fresh reconnect's streams.
- netegress: re-evaluate egress rules on every redirect hop and don't
carry allow-private context into an unmatched redirect target.
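The sandboxca fix in the first bullet above can be sketched with `container/list`: remove any existing element for the host before `PushFront`, so the map and the list never disagree. `leafCache` and `entry` are hypothetical stand-ins for the real cache types.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

type entry struct {
	host string
	cert []byte
}

// leafCache is a small LRU keyed by host; front of the list is most recent.
type leafCache struct {
	mu     sync.Mutex
	max    int
	order  *list.List
	byHost map[string]*list.Element
}

func newLeafCache(max int) *leafCache {
	return &leafCache{max: max, order: list.New(), byHost: make(map[string]*list.Element)}
}

// put inserts or replaces the entry for host.
func (c *leafCache) put(host string, cert []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if old, ok := c.byHost[host]; ok {
		c.order.Remove(old) // without this, the old node is orphaned in the list
	}
	c.byHost[host] = c.order.PushFront(&entry{host: host, cert: cert})
	for c.order.Len() > c.max {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.byHost, oldest.Value.(*entry).host)
	}
}

// size returns the list length and the map length; they should always match.
func (c *leafCache) size() (nodes, keys int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.order.Len(), len(c.byHost)
}

func main() {
	c := newLeafCache(2)
	c.put("api.example.com", []byte("leaf-1"))
	c.put("api.example.com", []byte("leaf-2")) // two misses for the same host
	fmt.Println(c.size())
}
```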
- ts sdk: treat null/undefined/empty defer-handler return as "continue with original request" (matches Go SDK); previously crashed with a TypeError on undefined property access.
- ts sdk: snapshot+restore current on push() failure in deny/allow/inject/defer so a thrown daemon RPC doesn't leave the local mirror diverged from what's actually installed.
- netrules: detect /regex/ form on the raw host string before lowercasing; otherwise negated escape classes (\D, \S, \W) get flipped and [A-Z] folds to [a-z]. Wrap with (?i) since Match() lowercases the host before comparing.
- netegress: re-validate scheme after a defer handler rewrites the request URL; without this, a handler can return file:// or any other scheme and bypass the initial http/https guard.
- phonehome: re-check closed after pending.Store so a Close() that raced past the initial check doesn't leave a stranded entry that blocks until timeout (up to 60s, leaking goroutine + map entry).
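The netrules ordering issue is easy to reproduce: lowercasing a pattern turns `\D` (non-digit) into `\d` (digit). A minimal sketch of detecting the /regex/ form on the raw string first, with `matchHost` as a hypothetical stand-in for the real matcher:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// matchHost compares host against pattern. A pattern wrapped in slashes is
// compiled as a regex from the RAW string (never lowercased) with (?i)
// prepended; anything else is a case-insensitive exact match.
func matchHost(pattern, host string) bool {
	host = strings.ToLower(host)
	if len(pattern) >= 2 && strings.HasPrefix(pattern, "/") && strings.HasSuffix(pattern, "/") {
		re, err := regexp.Compile("(?i)" + pattern[1:len(pattern)-1])
		if err != nil {
			return false // invalid regex never matches in this sketch
		}
		return re.MatchString(host)
	}
	return strings.ToLower(pattern) == host
}

func main() {
	pat := `/^api-\D+\.example\.com$/`
	fmt.Println(matchHost(pat, "API-Prod.Example.com")) // \D+ matches "prod"
	fmt.Println(matchHost(pat, "api-123.example.com"))  // digits are not \D
}
```

Had the pattern been lowercased before compilation, the two results would be swapped, which is the kind of silent rule inversion the fix prevents.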
- cabundle: seek+truncate the merged bundle on partial io.Copy failure so a half-written PEM isn't concatenated with the next candidate or the sandbox CA.
- egressproxy: per-iteration read deadline on the TLS CONNECT keep-alive loop. The outer ReadHeaderTimeout only covers the initial plain CONNECT; without this, a stalled sandbox client would hold the handler goroutine indefinitely.
- netegress: look up the Host header via httpReq.Header.Get (canonical key) instead of literal map["Host"], matching how the inject loop above populates the header map.
- peer_auth: drop the unused errRejected sentinel (and its errors import); it was never returned anywhere.
- egressproxy: bound the post-CONNECT TLS handshake with a deadline on the underlying conn. Hijack drops the server's ctx-driven cancel, so a stalled sandbox client could otherwise pin the goroutine/FD.
- agent server: route pong writes through the connRW wrapper so they serialize with phonehome / streaming / JSON-RPC frames instead of interleaving on the raw conn.
- daemon: use a detached, time-bounded context for rt.Destroy on the Registry.Add and warm-pool rollback paths. The caller's ctx is exactly what would be cancelled when these rollbacks fire, and a cancelled Destroy leaks container + agent + ports.
- daemon: gate serveLeafCert on sandboxEgressEnabled so network_mode=off sandboxes can't lazily provision per-sandbox CAs via OpNetCertLeaf.
- daemon: fail closed in sandboxEgressEnabled when the sandbox isn't in the registry. Returning true let unknown-ID RPCs proceed as if egress were on.
- netegress: re-match rules after a defer handler rewrites the URL to a different host. The original rule's allow-private grant (and inject, if the rule were an inject) must not carry to the new host; a new defer match is treated as no-rule to avoid handler recursion.
- ws_transport (go sdk): when the registered handler returns (nil, nil) emit a -32601 method-not-found error instead of a response with neither result nor error (invalid JSON-RPC 2.0).
- ts sdk: intercept() now snapshots and rolls back via pushOrRollback on push() failure, matching deny/allow/inject/defer.
- proxy: stop the per-CallClient timer when the select picks another case. time.After leaked the timer until expiry (up to 60s per defer RPC).
- sdk go: snapshot rules+handlers before mutation in Intercept and appendRule; restore on push failure so a daemon-rejected RPC doesn't leave the local mirror diverged from the installed state (mirrors the TS pushOrRollback path).
- phonehome: use time.NewTimer + defer timer.Stop in Call instead of time.After so timers don't leak a goroutine + Timer struct per egress dispatch (up to 60s tail). Same pattern as proxy.CallClient.
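The timer pattern named in the phonehome bullet looks roughly like this. `awaitReply`, `tryCall`, and `errTimeout` are hypothetical names; the real Call also manages a pending-request map that is omitted here.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

var errTimeout = errors.New("call timed out")

// awaitReply waits for a reply, context cancellation, or the timeout. The
// deferred Stop releases the timer as soon as any other case wins, instead
// of leaving it pending until it would have fired.
func awaitReply(ctx context.Context, reply <-chan string, timeout time.Duration) (string, error) {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case r := <-reply:
		return r, nil
	case <-ctx.Done():
		return "", ctx.Err()
	case <-timer.C:
		return "", errTimeout
	}
}

// tryCall simulates one round-trip; when ready is false no reply ever arrives.
func tryCall(ready bool) (string, error) {
	reply := make(chan string, 1)
	if ready {
		reply <- "pong"
	}
	return awaitReply(context.Background(), reply, 20*time.Millisecond)
}

func main() {
	fmt.Println(tryCall(true))
	fmt.Println(tryCall(false))
}
```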
- egressproxy: strip Content-Length/Transfer-Encoding in handlePlain (matches the TLS path) so a rule-rewritten or truncated body doesn't force fixed-length mode and hang the sandbox client.
- egressproxy: per-write deadline on the TLS chunked response via a deadlineWriter wrapper, so a stalled sandbox client can't pin tlsConn.Write, which would block the handler, prevent release(), fill the stream channel, and stall the agent's read loop.
- proxy: always invoke afterWrite even on writeLocalResponse failure so the upstream resp.Body still gets closed via the copier's defer.
- netegress: when a redirect crosses host boundaries, strip the previous rule's inject headers from req.Header and re-apply the new rule's inject. Without this, X-API-Key-style creds leaked to the redirected origin.
- sdk go/ts: auto-generated defer IDs now use a __sdk_defer_ sentinel prefix so they can't collide with user-supplied "defer-N" IDs.
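The cross-host redirect fix in the netegress bullet can be sketched as "delete what the previous rule injected, then apply what the new rule injects". The `rule` type and `reapplyInject` are hypothetical, and the sketch assumes a non-nil header map.

```go
package main

import (
	"fmt"
	"net/http"
)

// rule is a hypothetical egress rule carrying headers to inject for a host.
type rule struct {
	host   string
	inject map[string]string
}

// reapplyInject removes headers injected by prev and applies those of next.
// Either rule may be nil (no rule matched on that hop).
func reapplyInject(headers map[string][]string, prev, next *rule) {
	h := http.Header(headers)
	if prev != nil {
		for name := range prev.inject {
			h.Del(name) // Del canonicalizes the name before deleting
		}
	}
	if next != nil {
		for name, value := range next.inject {
			h.Set(name, value)
		}
	}
}

func main() {
	h := map[string][]string{"X-Api-Key": {"secret"}, "Accept": {"*/*"}}
	prev := &rule{host: "api.example.com", inject: map[string]string{"X-API-Key": "secret"}}
	reapplyInject(h, prev, nil) // redirect target matched no rule
	fmt.Println(h)
}
```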
Summary
Adds a network middleware layer so users can intercept, mutate, or stub sandbox HTTP traffic from the SDK (`sbx.network.intercept/allow/deny/inject` with programmatic defer-to-sdk handlers); credentials injected via rules live on the daemon and never enter the sandbox process. Includes per-sandbox ECDSA P-256 CA with on-demand leaf signing for transparent HTTPS interception, streaming response bodies for SSE/LLM use cases, and an SSRF-hardened egress dialer that resolves hostnames itself and refuses private/metadata addresses (defeats DNS rebinding to IMDS).

Hardens agent ↔ daemon auth via transport-layer peer auth (vsock CID check, TCP non-loopback) plus a single-use nonce-bootstrap pattern so the long-lived token never appears on kernel cmdline or container env. Removes the parallel `agent/net.Fetch` path so there's one egress code path with one private-IP guard, and adds `network.enabled = false` as an opt-out for sandboxes that don't need the proxy.

Mirrors the API in both TypeScript and Go SDKs.
Test plan
- (`go test ./...`, `bun run test`): incl. suffix-trie matcher, CA gen + leaf signing, nonce-bootstrap (happy path + bad nonce + post-consumption + token-before-bootstrap), peer-auth listener, SSRF dialer with DNS resolution, streaming chunked upstream, token-leak anti-regression, defer wire format
- (`SANDBOXD_ENDPOINT=...`) for inject/deny/defer in both Go and TS suites
- `internal/netegress/egress.go` and `agent/egressproxy/proxy.go` (the two hottest security-critical files)