
feat(auth): per-sandbox authentication to gateway#1404

Open
TaylorMutch wants to merge 21 commits into main from tmutch/per-supervisor-authn


@TaylorMutch
Collaborator

@TaylorMutch TaylorMutch commented May 15, 2026

Summary

Adds per-sandbox supervisor authentication for gateway RPCs and closes the
cross-sandbox access gap tracked in #1354. Sandbox supervisors now authenticate
as a specific Principal::Sandbox, and gateway handlers authorize access by
comparing that authenticated principal to the sandbox named in each
sandbox-scoped request.

The implementation has two bootstrap paths:

  • Docker, Podman, and VM sandboxes receive gateway-minted JWT bootstrap material
    through driver-managed supervisor secret files or guest secret material.
  • Kubernetes sandboxes exchange a projected, pod-bound ServiceAccount token for
    the same kind of gateway-minted JWT. The gateway validates the projected token
    with Kubernetes TokenReview, requires the configured sandbox ServiceAccount
    in the sandbox namespace, checks the pod name and UID, fetches the live pod,
    and reads the gateway-owned openshell.io/sandbox-id annotation.

After bootstrap, all drivers converge on the same steady state: the supervisor
presents Authorization: Bearer <gateway-jwt>, refreshes that credential in
memory, and is authorized only for its own sandbox.

Related Issue

Closes #1354

Changes

  • Introduces Authenticator/Principal routing for gateway gRPC
    authentication.
  • Adds gateway-minted sandbox JWT signing, validation, and refresh support.
  • Adds Docker, Podman, and VM bootstrap plumbing that delivers supervisor-only
    JWT files without exposing tokens through public APIs or user entrypoint
    environments.
  • Adds Kubernetes ServiceAccount token bootstrap through IssueSandboxToken
    using the Kubernetes TokenReview API.
  • Provisions and configures a Helm-managed sandbox ServiceAccount for sandbox
    pods, with support for using an existing ServiceAccount.
  • Configures the Kubernetes compute driver with the sandbox ServiceAccount name
    and sets it on sandbox pods while keeping automatic ServiceAccount token
    mounting disabled.
  • Restricts Kubernetes bootstrap to the configured sandbox ServiceAccount and
    the configured sandbox namespace.
  • Updates the supervisor gRPC client to acquire a bearer credential at startup
    and inject it on every gateway call.
  • Enforces per-handler sandbox ID equality for sandbox-scoped RPCs.
  • Pins PushSandboxLogs to the first validated sandbox ID in the stream and
    rejects later frames that try to switch sandbox identity.
  • Requires persisted sandbox records before IssueSandboxToken or
    RefreshSandboxToken mint a token.
  • Adds sandbox debug-rpc helpers for end-to-end authentication testing.
  • Mounts sandbox JWT keys in Helm deployments even when local TLS is disabled.
  • Updates helm-dev k3d setup to preload the default community sandbox image to
    speed up Kubernetes e2e smoke tests.
  • Updates docs, Helm chart tests, and debugging guidance for the new
    per-sandbox identity model.

Implementation Details

Problem Context

Before this PR, sandbox-class handlers trusted a sandbox_id or sandbox name
supplied in the request body. The shared mTLS client certificate only proved
that the caller had a gateway client certificate; it did not prove that the
caller was sandbox A rather than sandbox B. Any holder of that shared credential
could therefore ask for another sandbox's policy, drafts, provider environment,
or related sandbox-private state.

This PR moves the identity decision into the gateway authentication layer. The
router authenticates the caller, inserts a Principal into request extensions,
and handlers compare that principal to the requested sandbox before serving
sandbox-private data.

Shared Gateway Auth Model

The gateway now uses a pluggable authenticator chain. Each authenticator can
produce a Principal, decline so the next authenticator can try, or reject the
request fail-closed.
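The three-way contract can be sketched as below. This is a minimal illustration, not the code in this PR: the `AuthOutcome` enum, the `run_chain` helper, the header map, and the two toy authenticators (which recognise made-up `sbx:` and `usr:` bearer prefixes) are all assumptions introduced for the example.

```rust
use std::collections::HashMap;

/// Identity attached to a request once authentication succeeds.
#[derive(Debug, Clone, PartialEq)]
pub enum Principal {
    User { subject: String },
    Sandbox { sandbox_id: String },
    Anonymous,
}

/// Hypothetical three-way result of a single authenticator.
#[derive(Debug, PartialEq)]
pub enum AuthOutcome {
    /// The authenticator recognised the credential and accepted it.
    Authenticated(Principal),
    /// Not this authenticator's credential; let the next one try.
    Decline,
    /// The credential is ours but invalid; stop the chain (fail closed).
    Reject(String),
}

pub trait Authenticator {
    fn authenticate(&self, headers: &HashMap<String, String>) -> AuthOutcome;
}

/// Walk the chain in order. A request that nobody accepts is rejected.
pub fn run_chain(
    chain: &[Box<dyn Authenticator>],
    headers: &HashMap<String, String>,
) -> Result<Principal, String> {
    for authenticator in chain {
        match authenticator.authenticate(headers) {
            AuthOutcome::Authenticated(principal) => return Ok(principal),
            AuthOutcome::Decline => continue,
            AuthOutcome::Reject(reason) => return Err(reason),
        }
    }
    Err("unauthenticated: no authenticator accepted the request".to_string())
}

/// Toy authenticator: accepts "Bearer sbx:<id>" (a made-up format).
pub struct ToySandboxAuthenticator;

impl Authenticator for ToySandboxAuthenticator {
    fn authenticate(&self, headers: &HashMap<String, String>) -> AuthOutcome {
        let Some(value) = headers.get("authorization") else {
            return AuthOutcome::Decline;
        };
        let Some(id) = value.strip_prefix("Bearer sbx:") else {
            return AuthOutcome::Decline;
        };
        if id.is_empty() {
            return AuthOutcome::Reject("empty sandbox id".to_string());
        }
        AuthOutcome::Authenticated(Principal::Sandbox { sandbox_id: id.to_string() })
    }
}

/// Toy authenticator: accepts "Bearer usr:<subject>" (a made-up format).
pub struct ToyUserAuthenticator;

impl Authenticator for ToyUserAuthenticator {
    fn authenticate(&self, headers: &HashMap<String, String>) -> AuthOutcome {
        match headers.get("authorization").and_then(|v| v.strip_prefix("Bearer usr:")) {
            Some(subject) if !subject.is_empty() => {
                AuthOutcome::Authenticated(Principal::User { subject: subject.to_string() })
            }
            _ => AuthOutcome::Decline,
        }
    }
}
```

The point of the separate `Decline` and `Reject` outcomes is that a bearer header meant for another authenticator can fall through, while a malformed credential of a known kind stops the chain immediately.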

The steady-state sandbox credential is a gateway-minted Ed25519 JWT. Validation
checks issuer, audience, key ID, expiry, algorithm, and sandbox identity. The
token is intentionally short lived. Refresh mints a replacement for the same
sandbox principal, and older tokens remain valid only until their own expiry.
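A rough sketch of the listed checks, applied to an already-decoded token, might look like this. Signature verification is deliberately left out and assumed to have happened beforehand; the `DecodedToken`, `Expected`, and `Invalid` types and the `validate_token_fields` function are hypothetical names, not the PR's API. `EdDSA` is the JOSE algorithm name used for Ed25519 signatures.

```rust
/// Header and claim fields of a token whose signature has ALREADY been verified.
/// (Hypothetical shape; the real claim layout may differ.)
#[derive(Debug, Clone)]
pub struct DecodedToken {
    pub alg: String,        // from the JWT header
    pub kid: String,        // from the JWT header
    pub iss: String,
    pub aud: String,
    pub exp: u64,           // seconds since the Unix epoch
    pub sandbox_id: String,
}

/// Values the gateway expects to see (assumed configuration).
#[derive(Debug, Clone)]
pub struct Expected {
    pub issuer: String,
    pub audience: String,
    pub kid: String,
}

#[derive(Debug, PartialEq)]
pub enum Invalid {
    Algorithm,
    KeyId,
    Issuer,
    Audience,
    Expired,
    MissingSandboxId,
}

/// Apply the checks in a fixed order and return the sandbox id on success.
pub fn validate_token_fields(
    token: &DecodedToken,
    want: &Expected,
    now_secs: u64,
) -> Result<String, Invalid> {
    if token.alg != "EdDSA" {
        return Err(Invalid::Algorithm);
    }
    if token.kid != want.kid {
        return Err(Invalid::KeyId);
    }
    if token.iss != want.issuer {
        return Err(Invalid::Issuer);
    }
    if token.aud != want.audience {
        return Err(Invalid::Audience);
    }
    // A token is treated as expired at the instant `exp` is reached.
    if token.exp <= now_secs {
        return Err(Invalid::Expired);
    }
    if token.sandbox_id.is_empty() {
        return Err(Invalid::MissingSandboxId);
    }
    Ok(token.sandbox_id.clone())
}
```

Because expiry is evaluated per token, a refreshed token and its predecessor can both pass until the older one reaches its own `exp`.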

This JWT is supervisor identity material:

  • It is not returned in CreateSandboxResponse.
  • It is not stored in public sandbox metadata.
  • It is not logged.
  • It is kept out of ordinary user entrypoint environments.

Docker, Podman, And VM Bootstrap

Docker, Podman, and VM deployments do not have a platform identity service
equivalent to Kubernetes projected ServiceAccount tokens. For those drivers, the
gateway uses a push-based bootstrap pattern.

At sandbox creation time, the gateway mints a sandbox JWT for the new sandbox
and passes it to the in-process driver boundary as secret material. The driver
writes that token to a supervisor-only file, or VM guest secret material, and
starts the sandbox with OPENSHELL_SANDBOX_TOKEN_FILE pointing at that file.
The supervisor reads the file once at startup and then keeps the active token in
memory.
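The read-once, keep-in-memory behaviour can be sketched as follows. The `read_bootstrap_token` function and `TokenSlot` type are illustrative names only; a caller would pass the path taken from OPENSHELL_SANDBOX_TOKEN_FILE. Trimming whitespace and rejecting an empty file are assumptions about reasonable handling, not confirmed behaviour.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::sync::RwLock;

/// Read the bootstrap token from the supervisor-only file.
/// Assumption: surrounding whitespace is not part of the token.
pub fn read_bootstrap_token(path: &Path) -> io::Result<String> {
    let raw = fs::read_to_string(path)?;
    let token = raw.trim();
    if token.is_empty() {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "token file is empty"));
    }
    Ok(token.to_string())
}

/// In-memory holder for the active bearer token.
pub struct TokenSlot {
    current: RwLock<String>,
}

impl TokenSlot {
    pub fn new(initial: String) -> Self {
        Self { current: RwLock::new(initial) }
    }

    /// Snapshot of the token to place in the Authorization header.
    pub fn get(&self) -> String {
        self.current.read().expect("token lock poisoned").clone()
    }

    /// Replace the active token after a refresh. The bootstrap file is not touched.
    pub fn rotate(&self, fresh: String) {
        *self.current.write().expect("token lock poisoned") = fresh;
    }
}
```

Rotation only swaps the in-memory value, which matches the statement that refresh does not rewrite bootstrap material.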

This path avoids the unsafe parts of the old model:

  • The raw token does not cross the public gRPC API.
  • The token is not placed in the user command environment.
  • The token is scoped to one sandbox ID.
  • Refresh rotates the in-memory bearer token without rewriting bootstrap
    material.

Kubernetes Bootstrap

Kubernetes uses a pull-based bootstrap pattern because kubelet can provide a
short-lived, audience-bound, pod-bound ServiceAccount token to the sandbox pod.

The sandbox pod gets a projected ServiceAccount token mounted at a
supervisor-only path. On startup, the supervisor presents that token to
IssueSandboxToken. The gateway validates the token with Kubernetes
TokenReview, verifies the accepted audience, requires the exact configured
sandbox ServiceAccount username, extracts the bound pod name and UID, fetches
the live pod from the sandbox namespace, checks the UID, and reads the
openshell.io/sandbox-id annotation to derive the sandbox identity.
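The validation sequence above could be expressed as a pure decision function along these lines. This is a sketch under stated assumptions: `ReviewedToken` is a flattened, hypothetical view of a TokenReview result, the pod lookup is injected as a closure instead of a real Kubernetes client call, and all type and function names are invented for illustration.

```rust
use std::collections::HashMap;

pub const SANDBOX_ID_ANNOTATION: &str = "openshell.io/sandbox-id";

/// Flattened, hypothetical view of what TokenReview reported for the token.
pub struct ReviewedToken {
    pub authenticated: bool,
    pub username: String,
    pub audiences: Vec<String>,
    pub pod_name: Option<String>,
    pub pod_uid: Option<String>,
}

/// The parts of the live pod that the check needs.
pub struct LivePod {
    pub uid: String,
    pub annotations: HashMap<String, String>,
}

/// Assumed gateway configuration for Kubernetes bootstrap.
pub struct BootstrapConfig {
    pub audience: String,
    pub namespace: String,
    pub service_account: String,
}

#[derive(Debug, PartialEq)]
pub enum BootstrapError {
    NotAuthenticated,
    WrongAudience,
    WrongServiceAccount,
    MissingPodBinding,
    PodNotFound,
    UidMismatch,
    MissingAnnotation,
}

/// Run the checks in order; `fetch_pod(namespace, name)` stands in for a pod GET.
pub fn derive_sandbox_id(
    review: &ReviewedToken,
    cfg: &BootstrapConfig,
    fetch_pod: impl Fn(&str, &str) -> Option<LivePod>,
) -> Result<String, BootstrapError> {
    if !review.authenticated {
        return Err(BootstrapError::NotAuthenticated);
    }
    if !review.audiences.iter().any(|a| a == &cfg.audience) {
        return Err(BootstrapError::WrongAudience);
    }
    // Exact match on the ServiceAccount username, namespace included.
    let expected_user =
        format!("system:serviceaccount:{}:{}", cfg.namespace, cfg.service_account);
    if review.username != expected_user {
        return Err(BootstrapError::WrongServiceAccount);
    }
    let (Some(name), Some(uid)) = (review.pod_name.as_deref(), review.pod_uid.as_deref())
    else {
        return Err(BootstrapError::MissingPodBinding);
    };
    let pod = fetch_pod(&cfg.namespace, name).ok_or(BootstrapError::PodNotFound)?;
    // A recreated pod can reuse the name but never the UID.
    if pod.uid != uid {
        return Err(BootstrapError::UidMismatch);
    }
    pod.annotations
        .get(SANDBOX_ID_ANNOTATION)
        .cloned()
        .ok_or(BootstrapError::MissingAnnotation)
}
```

Keeping the pod lookup behind a closure makes each rejection path easy to exercise in unit tests without a cluster.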

The Helm chart now creates a dedicated sandbox ServiceAccount by default and
passes its name into the gateway's Kubernetes driver configuration. Operators
can disable creation and provide an existing ServiceAccount name. Sandbox pods
continue to set automountServiceAccountToken: false; the only token made
available to the supervisor is the explicit projected token used for bootstrap.

Handler Authorization

Authentication alone is not enough; handlers still need to authorize access to
the requested sandbox.

Direct sandbox_id handlers compare the authenticated
Principal::Sandbox.sandbox_id to the requested ID. Name-keyed handlers resolve
the sandbox name to the canonical ID and then compare. PushSandboxLogs
authorizes the first non-empty batch, verifies the sandbox still exists, stores
that sandbox ID for the stream, and rejects any later batch that names a
different sandbox.
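The stream pinning for PushSandboxLogs can be illustrated with a small state holder. This is a sketch, not the handler in this PR: `LogStreamScope`, `StreamError`, and the `authorize` callback (standing in for the scope check plus the existence check) are hypothetical, and skipping empty batches without pinning is an assumption about how "first non-empty batch" is handled.

```rust
/// Per-stream state: the sandbox id the stream was validated for, if any.
#[derive(Debug, Default)]
pub struct LogStreamScope {
    pinned: Option<String>,
}

#[derive(Debug, PartialEq)]
pub enum StreamError {
    /// The first non-empty batch failed authorization or the sandbox is gone.
    Denied(String),
    /// A later batch named a different sandbox than the pinned one.
    SandboxSwitched { pinned: String, got: String },
}

impl LogStreamScope {
    /// Check one batch. `authorize` runs only for the first non-empty batch.
    pub fn check_batch(
        &mut self,
        sandbox_id: &str,
        entry_count: usize,
        authorize: impl FnOnce(&str) -> Result<(), String>,
    ) -> Result<(), StreamError> {
        // Assumption: empty batches carry nothing to authorize and do not pin.
        if entry_count == 0 {
            return Ok(());
        }
        if let Some(pinned) = &self.pinned {
            if pinned.as_str() == sandbox_id {
                return Ok(());
            }
            return Err(StreamError::SandboxSwitched {
                pinned: pinned.clone(),
                got: sandbox_id.to_string(),
            });
        }
        authorize(sandbox_id).map_err(StreamError::Denied)?;
        self.pinned = Some(sandbox_id.to_string());
        Ok(())
    }
}
```

Once the id is pinned, later batches cost one string comparison, and a batch that switches identity is rejected without re-running authorization.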

User principals continue through the normal RBAC path. Sandbox principals are
limited to their own sandbox. Anonymous principals are rejected for
sandbox-scoped paths.
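The per-principal rules can be summarised in a guard like the one below. The name `ensure_sandbox_scope` matches the guard mentioned in the commits, but the signature, the error type, and the `ensure_scope_by_name` helper are assumptions made for this sketch. User principals return `Ok` here on the assumption that RBAC has already been applied elsewhere.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Principal {
    User { subject: String },
    Sandbox { sandbox_id: String },
    Anonymous,
}

#[derive(Debug, PartialEq)]
pub enum ScopeError {
    Unauthenticated,
    PermissionDenied,
    NotFound,
}

/// Id-keyed check: a sandbox principal may only touch its own sandbox.
pub fn ensure_sandbox_scope(principal: &Principal, requested_id: &str) -> Result<(), ScopeError> {
    match principal {
        // Assumption: users were already gated by RBAC before reaching the handler.
        Principal::User { .. } => Ok(()),
        Principal::Sandbox { sandbox_id } if sandbox_id.as_str() == requested_id => Ok(()),
        Principal::Sandbox { .. } => Err(ScopeError::PermissionDenied),
        Principal::Anonymous => Err(ScopeError::Unauthenticated),
    }
}

/// Name-keyed check: resolve the name to the canonical id, then compare.
/// `resolve` stands in for a store lookup and is hypothetical.
pub fn ensure_scope_by_name(
    principal: &Principal,
    sandbox_name: &str,
    resolve: impl Fn(&str) -> Option<String>,
) -> Result<String, ScopeError> {
    let canonical_id = resolve(sandbox_name).ok_or(ScopeError::NotFound)?;
    ensure_sandbox_scope(principal, &canonical_id)?;
    Ok(canonical_id)
}
```

Comparing against the canonical id, never the caller-supplied name, keeps a sandbox from reaching another sandbox through an alias.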

Token Lifecycle

IssueSandboxToken is only available to Kubernetes ServiceAccount bootstrap
principals. RefreshSandboxToken is only available to supervisors already
holding a gateway-minted JWT. Both paths require the sandbox record to still
exist before minting a token, so deleted or unknown sandboxes cannot keep
refreshing credentials.
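One way to picture the two minting rules together is a single authorization function, sketched below. The enum and function names are invented for this example, and `sandbox_exists` stands in for the persisted-record lookup.

```rust
/// How a sandbox principal proved its identity (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum IdentitySource {
    K8sServiceAccount,
    GatewayJwt,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Caller {
    Sandbox { sandbox_id: String, source: IdentitySource },
    User,
    Anonymous,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum MintRpc {
    IssueSandboxToken,
    RefreshSandboxToken,
}

#[derive(Debug, PartialEq)]
pub enum MintDenied {
    WrongPrincipal,
    SandboxNotFound,
}

/// Decide whether a token may be minted, returning the sandbox id to mint for.
pub fn authorize_mint(
    rpc: MintRpc,
    caller: &Caller,
    sandbox_exists: impl Fn(&str) -> bool,
) -> Result<String, MintDenied> {
    let Caller::Sandbox { sandbox_id, source } = caller else {
        return Err(MintDenied::WrongPrincipal);
    };
    // Each RPC accepts exactly one kind of sandbox credential.
    let required = match rpc {
        MintRpc::IssueSandboxToken => IdentitySource::K8sServiceAccount,
        MintRpc::RefreshSandboxToken => IdentitySource::GatewayJwt,
    };
    if *source != required {
        return Err(MintDenied::WrongPrincipal);
    }
    // Deleted or unknown sandboxes never receive a new token.
    if !sandbox_exists(sandbox_id) {
        return Err(MintDenied::SandboxNotFound);
    }
    Ok(sandbox_id.clone())
}
```

With the existence check on both paths, deleting the sandbox record bounds how long any outstanding token remains useful to the lifetime of the last token minted.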

Kubernetes supervisors can recover from restart by repeating the ServiceAccount
bootstrap exchange. Docker, Podman, and VM supervisors use their file or guest
secret bootstrap material and then rely on in-memory refresh for steady state.

Signing Key Persistence

The gateway JWT signing key is persisted through the existing local and Helm
PKI paths. Helm mounts the JWT key material into the gateway even when local TLS
is disabled, because per-sandbox authentication is independent from TLS
enablement.

Design Decisions For Reviewers

  • Two bootstrap patterns, one steady-state credential: local and VM drivers push
    supervisor-only bootstrap material; Kubernetes pulls a token through
    ServiceAccount exchange. Both become the same gateway JWT.
  • Kubernetes uses TokenReview instead of in-gateway JWT verification so the
    apiserver remains the source of truth for projected ServiceAccount token
    validity and audience acceptance.
  • The Helm chart provisions a sandbox ServiceAccount by default rather than
    creating per-sandbox ServiceAccounts in this PR.
  • No per-sandbox Kubernetes Secret objects are created for bootstrap.
  • No raw token is exposed through public APIs, CreateSandboxResponse, sandbox
    metadata, ordinary user environments, or logs.
  • mTLS remains transport protection, not sandbox identity.
  • Handler checks are explicit because handlers know which request field
    identifies the target sandbox.

Reviewer Focus Areas

  • Docker, Podman, and VM token file handling: supervisor-only placement, no
    leakage into entrypoint environment, and cleanup behavior.
  • Kubernetes bootstrap validation: TokenReview, accepted audience, configured
    ServiceAccount, pod name/UID binding, annotation handling, and RBAC scope.
  • Handler coverage: every sandbox-private RPC should either call the
    sandbox-scope guard or have a documented reason not to.
  • Streaming RPC behavior: PushSandboxLogs should not allow a stream to change
    sandbox identity after validation.
  • Signing key persistence: local and Helm deployments must preserve the JWT key
    across gateway restarts; multi-replica gateways must share the same key
    material.
  • Token lifecycle edges: unknown or deleted sandbox records must not receive new
    gateway-minted tokens.

Testing

  • mise run pre-commit
  • cargo test -p openshell-server auth_rpc
  • cargo test -p openshell-server auth::k8s_sa
  • cargo test -p openshell-server log_stream_scope
  • cargo test -p openshell-driver-kubernetes
  • mise run helm:docs:check
  • mise run helm:lint
  • mise run helm:test
  • Kubernetes smoke e2e against a fresh k3d cluster via /helm-dev-environment
  • Docker smoke e2e
  • Podman smoke e2e

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated (if applicable)

@copy-pr-bot

copy-pr-bot Bot commented May 15, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@TaylorMutch TaylorMutch force-pushed the tmutch/gateway-config-impl branch 2 times, most recently from 381784e to 9bc2e11 Compare May 15, 2026 19:17
Base automatically changed from tmutch/gateway-config-impl to main May 15, 2026 19:43
@TaylorMutch TaylorMutch force-pushed the tmutch/per-supervisor-authn branch from 834b56e to f4daea6 Compare May 15, 2026 20:41
@TaylorMutch TaylorMutch changed the title feat: per-sandbox authentication feat: per-sandbox authentication to gateway May 15, 2026
@TaylorMutch TaylorMutch changed the title feat: per-sandbox authentication to gateway feat(auth): per-sandbox authentication to gateway May 15, 2026
@TaylorMutch TaylorMutch added the test:e2e Requires end-to-end coverage label May 15, 2026
@TaylorMutch TaylorMutch marked this pull request as ready for review May 15, 2026 22:07
@github-actions

Label test:e2e applied for f4daea6. Open Branch E2E Checks, find the run for commit f4daea6, and click Re-run all jobs to execute with the label set. The E2E Gate check on this PR will flip green automatically once the run finishes.

@TaylorMutch TaylorMutch force-pushed the tmutch/per-supervisor-authn branch 3 times, most recently from 719f5f5 to 0d7df90 Compare May 19, 2026 00:06
Comment thread crates/openshell-server/src/auth/k8s_sa.rs
@TaylorMutch TaylorMutch force-pushed the tmutch/per-supervisor-authn branch 2 times, most recently from 19b8737 to e910610 Compare May 19, 2026 20:41
Replaces the hard-coded sandbox-method / dual-auth / Bearer branches in
AuthGrpcRouter with a pluggable Authenticator chain that produces a
Principal::{User, Sandbox, Anonymous}. The principal is inserted into
request extensions for handler consumption.

PR-1 keeps the legacy metadata marker for sandbox principals so existing
handlers that read x-openshell-auth-source continue to work; the marker
is removed in the PR-3 wire break. The OidcAuthenticator wraps the
existing JwksCache::validate_token for User principals, and the
LegacySandboxMarkerAuthenticator preserves the pre-refactor path-based
behavior pending the gateway-minted JWT flow in PR 2/3.

Part of the per-sandbox identity series that closes #1354.

Signed-off-by: Taylor Mutch <taylormutch@gmail.com>
Adds the gateway-side infrastructure for per-sandbox identity tokens (the
PR-2 step of the series resolving #1354):

- New Ed25519 keypair generated by `certgen` alongside the existing PKI.
  Local mode writes `<dir>/jwt/{signing.pem,public.pem,kid}`; K8s mode
  creates an Opaque `<release>-jwt-keys` Secret.
- `SandboxJwtIssuer` mints tokens with EdDSA-signed claims (SPIFFE-shaped
  `sub`, denormalised `sandbox_id`, 24h default TTL, `jti` for revocation).
- `SandboxJwtAuthenticator` validates tokens through the Authenticator
  chain and yields `Principal::Sandbox(BootstrapJwt {..})`. Tokens with a
  different `kid` fall through so non-matching Bearer headers reach the
  OIDC authenticator unchanged.
- `K8sServiceAccountAuthenticator` is path-scoped to `IssueSandboxToken`;
  consumes a projected SA token and produces a `K8sServiceAccount` sandbox
  principal that the new `IssueSandboxToken` handler exchanges for a fresh
  gateway JWT.
- In-memory `RevocationSet` with TTL pruning, ready for the PR-3
  delete-side hook and PR-5 refresh.
- Helm chart mounts the JWT secret on the gateway pod and wires
  `[openshell.gateway.gateway_jwt]` into the rendered TOML.

PR 2 is additive: no driver yet writes a sandbox token, no supervisor yet
presents a Bearer JWT. PR 3 wires the consumer ends and removes the
legacy path-based sandbox marker.

Signed-off-by: Taylor Mutch <taylormutch@gmail.com>
Switches every sandbox-to-gateway gRPC call from "path-based mTLS-only
trust" to "Authorization: Bearer <gateway-minted-JWT>" presented by the
sandbox supervisor. Closes the trust-boundary half of issue #1354; the
per-handler sandbox_id equality check follows in PR 4.

Sandbox side:
- crates/openshell-sandbox/src/grpc_client.rs gains an AuthInterceptor
  that injects the Bearer header on every outbound RPC. The token is
  resolved at startup from one of three sources, in order:
    1. OPENSHELL_SANDBOX_TOKEN (env, test harnesses)
    2. OPENSHELL_SANDBOX_TOKEN_FILE (Docker/Podman/VM drivers)
    3. OPENSHELL_K8S_SA_TOKEN_FILE (K8s driver — projected SA token
       exchanged for a gateway JWT via IssueSandboxToken)
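The resolution order described in this commit can be sketched as a pure function over an environment lookup. The `TokenSource` enum and `resolve_token_source` function are hypothetical names; only the three variable names and their order come from the commit message.

```rust
use std::path::PathBuf;

/// Where the supervisor should obtain its credential (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum TokenSource {
    /// Token supplied directly in the environment (test harnesses).
    Inline(String),
    /// File holding a gateway-minted JWT (Docker/Podman/VM drivers).
    GatewayTokenFile(PathBuf),
    /// File holding a projected ServiceAccount token to exchange (K8s driver).
    K8sSaTokenFile(PathBuf),
}

/// Try the three sources in order and return the first one that is set.
/// `lookup` abstracts over the process environment so the order is testable.
pub fn resolve_token_source(lookup: impl Fn(&str) -> Option<String>) -> Option<TokenSource> {
    if let Some(token) = lookup("OPENSHELL_SANDBOX_TOKEN") {
        return Some(TokenSource::Inline(token));
    }
    if let Some(path) = lookup("OPENSHELL_SANDBOX_TOKEN_FILE") {
        return Some(TokenSource::GatewayTokenFile(PathBuf::from(path)));
    }
    lookup("OPENSHELL_K8S_SA_TOKEN_FILE").map(|path| TokenSource::K8sSaTokenFile(PathBuf::from(path)))
}
```

In a real process the lookup would be `|name| std::env::var(name).ok()`.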

Gateway side:
- handle_create_sandbox mints a gateway JWT and passes it through the
  compute layer to DriverSandboxSpec.sandbox_token. K8s sandboxes ignore
  the field; Docker and Podman drivers inject it as OPENSHELL_SANDBOX_TOKEN
  in the container env.
- Removes the path-based SANDBOX_METHODS / DUAL_AUTH_METHODS branches and
  the x-openshell-auth-source metadata marker. The AuthGrpcRouter chain
  is now uniform: K8s SA -> SandboxJwt -> OIDC, all extension-based.
- Removes LegacySandboxMarkerAuthenticator and the SandboxIdentitySource::
  LegacyMarker variant. Handlers read Principal::Sandbox directly from
  request extensions.

Kubernetes driver:
- Sandbox pods gain a projected ServiceAccount token volume mounted at
  /var/run/secrets/openshell/token (audience openshell-gateway, 1h TTL,
  kubelet auto-rotates).
- Each pod is annotated with openshell.io/sandbox-id; the gateway resolves
  the SA token claim's pod uid back to a sandbox id via this annotation.
- Helm Role grants the gateway pods:get in the sandbox namespace. No
  ClusterRoleBinding to system:auth-delegator — the gateway validates SA
  tokens against the apiserver's anonymous JWKS endpoint instead of via
  TokenReview, so no cluster-scoped privilege is required.

The full JWKS verifier + pod-annotation lookup lands in the follow-up
that brings the K8s helm-dev demo end-to-end; PR 3 exercises the wire
break with Docker/Podman as the working drivers.

Signed-off-by: Taylor Mutch <taylormutch@gmail.com>
ProcessHandle::spawn_impl previously inherited the supervisor's full
environment when starting the sandbox entrypoint, then drop_privileges()
demoted the child to the sandbox user. The combination meant a later
process running as the sandbox user (e.g. an SSH-spawned shell) could
read /proc/<entrypoint_pid>/environ and recover the gateway-minted JWT.

Explicitly env_remove the three sandbox-token env vars before exec so the
entrypoint child carries none of the supervisor's identity material.
SSH session shells already use env_clear() in apply_child_env, so this
plugs the only remaining inheritance path.
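A minimal sketch of the scrubbing step, using only `std::process::Command`, is shown below. The `entrypoint_command` helper is hypothetical, and the list of three variable names is an assumption: the commit does not enumerate them, so the names used here are the three token sources from the supervisor's acquisition order.

```rust
use std::process::Command;

/// Assumed set of supervisor-only variables (the commit does not list them).
pub const TOKEN_ENV_VARS: [&str; 3] = [
    "OPENSHELL_SANDBOX_TOKEN",
    "OPENSHELL_SANDBOX_TOKEN_FILE",
    "OPENSHELL_K8S_SA_TOKEN_FILE",
];

/// Build the entrypoint command with token-related variables explicitly removed,
/// so the child does not inherit them even though it inherits everything else.
pub fn entrypoint_command(program: &str) -> Command {
    let mut cmd = Command::new(program);
    for name in TOKEN_ENV_VARS {
        cmd.env_remove(name);
    }
    cmd
}
```

`env_remove` records an explicit removal on the command, so the variable is absent from the child's environment even when it is set in the supervisor's own environment at spawn time.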

Related to #1354 (per-sandbox identity series, PR 3 follow-up).

Signed-off-by: Taylor Mutch <taylormutch@gmail.com>
Adds the IDOR guard that closes the second half of the per-sandbox
identity series. Every sandbox-class handler now verifies that the
calling Principal::Sandbox.sandbox_id matches the canonical UUID the
request body operates on. User principals bypass the check because
RBAC was their gate at the router layer; anonymous callers are
rejected outright.

New module crates/openshell-server/src/auth/guard.rs exposes
ensure_sandbox_scope / enforce_sandbox_scope. Applied at the top of:

- handle_get_sandbox_config (id-keyed)
- handle_get_sandbox_provider_environment (id-keyed)
- handle_report_policy_status (id-keyed)
- handle_push_sandbox_logs (id-keyed, first frame only — principal is
  stable across the stream)
- handle_submit_policy_analysis (name-keyed: resolve to id, then check)
- handle_get_draft_policy (name-keyed)
- handle_update_config (dual-auth: enforce only when Principal::Sandbox;
  CLI / TUI user paths are unaffected)
- handle_get_inference_bundle (no sandbox_id in body; accept any
  authenticated principal, reject anonymous)

Existing policy.rs tests are updated to wrap their requests with a
test-helper user principal so the new guard treats them as CLI calls;
six new tests cover the cross-sandbox-denied / same-sandbox-allowed /
user-bypasses-guard matrix.

Signed-off-by: Taylor Mutch <taylormutch@gmail.com>
Adds the rotation half of the per-sandbox identity series. Sandboxes
holding a valid gateway-minted JWT can swap it for a fresh one without
disruption; the old jti is revoked server-side before the new token is
handed back, so a leaked token is unusable as soon as the rotation
completes.

Server side:
- proto/openshell.proto gains RefreshSandboxToken plus empty request /
  token+expires_at_ms response messages.
- handle_refresh_sandbox_token requires Principal::Sandbox with a
  BootstrapJwt source (K8s-SA principals are routed to IssueSandboxToken
  for bootstrap; user principals are rejected). The handler mints the
  replacement token first, then adds the old jti to the in-memory
  RevocationSet — so a failed mint never strands the sandbox.

Sandbox side:
- AuthInterceptor now reads its Bearer header from a process-wide
  Arc<RwLock<AsciiMetadataValue>> slot, so a single in-place token
  rotation is visible to every cached client (CachedOpenShellClient, the
  supervisor session channel, log push, etc.).
- connect_channel spawns a background refresh loop once per process
  that sleeps for ~80% of the token's remaining lifetime (clamped to
  60s-12h, plus small deterministic jitter) and calls
  RefreshSandboxToken, updating the token slot on success.
- New parse_jwt_exp_ms helper decodes the JWT payload without signature
  verification — the token's origin is already trusted via the
  acquisition flow.
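The sleep calculation for the refresh loop can be worked through with a small function. This sketch covers only the 80% rule and the 60s to 12h clamp; the jitter mentioned above is omitted, and `refresh_delay` is a hypothetical name.

```rust
use std::time::Duration;

const MIN_DELAY: Duration = Duration::from_secs(60);
const MAX_DELAY: Duration = Duration::from_secs(12 * 60 * 60);

/// Sleep for about 80% of the token's remaining lifetime, clamped to [60s, 12h].
/// Both arguments are milliseconds since the Unix epoch. Jitter is not modelled.
pub fn refresh_delay(now_ms: u64, exp_ms: u64) -> Duration {
    // An already-expired token yields zero remaining time, which clamps to the floor.
    let remaining_ms = exp_ms.saturating_sub(now_ms);
    let target = Duration::from_millis(remaining_ms / 10 * 8);
    target.clamp(MIN_DELAY, MAX_DELAY)
}
```

For a token with one hour left, the delay is 48 minutes. For a token with the 24h default TTL, 80% would be 19.2 hours, so the 12h ceiling applies.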

Tests:
- 4 server-side handler tests (round-trip, user-principal rejected,
  K8s-SA-principal rejected, missing-issuer returns Unavailable)
- 3 sandbox-side helper tests (parse-exp, 80%-of-TTL delay, 60s floor)

All existing OpenShell test impls gain a refresh_sandbox_token stub.

Signed-off-by: Taylor Mutch <taylormutch@gmail.com>
The projected SA token kubelet writes to each sandbox pod was previously
a hardcoded 3600s literal in the driver. Operators in tighter audit
regimes want to dial it lower; very large clusters may want it slightly
higher to absorb token-refresh churn.

Wires `sa_token_ttl_secs` through three layers:

- KubernetesComputeConfig gains the field (default 3600). The driver
  clamps to [600, 86400] via `effective_sa_token_ttl_secs()`: 600s is
  kubelet's enforced minimum, 24h is the cap (the token is consumed
  within seconds of pod start, so longer is almost always a
  misconfiguration).
- The openshell-driver-kubernetes binary exposes
  `--sa-token-ttl-secs` / `OPENSHELL_K8S_SA_TOKEN_TTL_SECS`.
- `[openshell.gateway].sa_token_ttl_secs` in the gateway TOML inherits
  into `[openshell.drivers.kubernetes]`, mirroring the
  `enable_user_namespaces` plumbing.
- Helm: `server.sandboxJwt.k8sSaTokenTtlSecs` (default 3600) renders
  into the K8s driver block of the gateway config.
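The clamp itself is simple enough to show directly. The function name matches `effective_sa_token_ttl_secs()` from the commit, but the free-function signature here is an assumption.

```rust
/// Lower bound: the minimum the commit attributes to kubelet.
pub const MIN_SA_TOKEN_TTL_SECS: u64 = 600;
/// Upper bound: 24h, the cap chosen in the commit.
pub const MAX_SA_TOKEN_TTL_SECS: u64 = 86_400;

/// Clamp the configured TTL into the supported range.
pub fn effective_sa_token_ttl_secs(configured: u64) -> u64 {
    configured.clamp(MIN_SA_TOKEN_TTL_SECS, MAX_SA_TOKEN_TTL_SECS)
}
```

Out-of-range values are clamped silently in this sketch; whether the driver also logs a warning is not stated in the commit.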

Signed-off-by: Taylor Mutch <taylormutch@gmail.com>
Replaces the LiveK8sResolver stub with a working validator. Sandbox pods
present their projected ServiceAccount token via Authorization: Bearer
on IssueSandboxToken; the gateway:

1. Decodes the JWT header and looks up the signing key.
2. On miss, fetches the apiserver's /.well-known/openid-configuration
   discovery doc + /openid/v1/jwks via kube::Client and caches the keys.
3. Validates the token's signature (RS256), issuer, audience
   (openshell-gateway), and expiry.
4. Reads `kubernetes.io.pod.{name,uid}` from the claims and GETs the
   pod in the gateway's sandbox namespace.
5. Verifies the live pod's UID matches the token's UID (defense against
   replayed tokens from recreated pods with the same name) and reads
   the openshell.io/sandbox-id annotation to derive the sandbox UUID.

The gateway needs no system:auth-delegator ClusterRoleBinding — JWKS
validation is local, so the only K8s permission it consumes is the
namespace Role's `pods: get` grant. Discovery + JWKS reads ride the
gateway's existing kube::Client auth (system:service-account-issuer-
discovery is bound to system:authenticated in every supported K8s
distro).

ServerState gains an in-cluster detection path in run_server: when
KUBERNETES_SERVICE_HOST is set AND a sandbox JWT issuer is configured,
construct the resolver and wire it as state.k8s_sa_authenticator. The
existing K8sServiceAccountAuthenticator (path-scoped to
IssueSandboxToken) becomes functional.

Tests: JWKS path parsing covers absolute URL, relative path, query
string, and garbage rejection. End-to-end validation against a real
apiserver is exercised in the helm-dev demo.

Signed-off-by: Taylor Mutch <taylormutch@gmail.com>
Three regressions / inefficiencies surfaced while bringing the per-sandbox
identity series up end-to-end in the local helm cluster:

1. CLI returned Unauthenticated against a no-OIDC dev gateway.
   PR 3 removed the pre-refactor "no OIDC = pass through" behavior; with
   only sandbox-side authenticators in the chain, plain user CLI calls
   hit Unauthenticated. Add a PermissiveUserAuthenticator that
   installs as a final fallback when no OIDC is configured but sandbox
   JWT signing IS — produces a synthetic dev-anonymous user principal so
   the rest of the handler chain treats CLI calls as User and bypasses
   the IDOR guard. Production OIDC deployments are unaffected: when
   OIDC is configured the fallback is not installed and missing-Bearer
   still 401s.

2. Sandbox supervisor re-ran the K8s SA bootstrap exchange on every
   connect_channel() call. With multiple subsystems each building their
   own channels, IssueSandboxToken was firing every few seconds even
   though TOKEN_SLOT already had a fresh token. Change connect_channel
   to reuse TOKEN_SLOT when populated; only run acquire_sandbox_token on
   the first call per process. The refresh loop keeps the slot fresh
   thereafter.

3. K8s SA authenticator looked up sandbox pods in the gateway's own
   namespace (POD_NAMESPACE) instead of the K8s driver's configured
   sandbox namespace. Source from kubernetes_config_from_file() so the
   resolver targets the same namespace the driver creates pods in.

Verified end-to-end against the helm-dev cluster:
- Two sandboxes get distinct gateway JWTs with their own sandbox UUIDs.
- Cross-sandbox GetSandboxConfig is rejected with PermissionDenied and
  the auth::guard audit log fires with both principal and requested IDs.
- RefreshSandboxToken mints a new JWT and revokes the old jti; the old
  token is then rejected with Unauthenticated: revoked token.

Signed-off-by: Taylor Mutch <taylormutch@gmail.com>
…testing

Adds a small subcommand to the supervisor binary that issues one-shot
sandbox-class RPCs against the gateway using the supervisor's existing
token-acquisition pipeline. Designed to be invoked via docker exec or
kubectl exec into a running sandbox to verify the per-sandbox identity
flow end-to-end without writing a custom test binary inside the sandbox
image.

Subcommands:
- get-sandbox-config --sandbox-id <UUID> — call GetSandboxConfig
- refresh                                — call RefreshSandboxToken
- show-token                             — print raw gateway JWT bytes
- show-principal                         — pretty-print decoded JWT claims

Verification flow this enables (Docker path):
  docker exec sandbox-a openshell-sandbox debug-rpc show-principal
  docker exec sandbox-a openshell-sandbox debug-rpc \
      get-sandbox-config --sandbox-id <sandbox-b-uuid>
  # → exit code 7 + "PermissionDenied: cross-sandbox access denied"

K8s path: same RPCs, kubectl exec instead.

show-token and show-principal intentionally don't trigger the K8s SA
bootstrap exchange — they only read an already-cached token, so
inspection doesn't burn a fresh JWT mint per call.

Signed-off-by: Taylor Mutch <taylormutch@gmail.com>
@TaylorMutch TaylorMutch force-pushed the tmutch/per-supervisor-authn branch from 602d5ea to a5cb368 Compare May 19, 2026 23:23
@TaylorMutch TaylorMutch added the test:e2e-kubernetes Requires Kubernetes end-to-end coverage label May 20, 2026
@github-actions

Label test:e2e-kubernetes applied for a5cb368. Open the existing run and click Re-run all jobs to execute with the label set. The matching required CI gate status on this PR will flip green automatically once the run finishes.

Require persisted sandbox records before IssueSandboxToken and RefreshSandboxToken mint gateway JWTs. This closes the stale-token path where a deleted sandbox identity could keep refreshing itself, extending its token expiry window each time.

Pin PushSandboxLogs streams to the first validated sandbox id. The gateway now validates scope and sandbox existence once per stream, then rejects any later batch that changes sandbox_id instead of accepting it under the original validation.

For Kubernetes bootstrap, add service_account_name to the Kubernetes driver config, set it on sandbox pod specs, and require TokenReview usernames to match system:serviceaccount:<sandbox-namespace>:<service-account>. The Helm chart provisions a dedicated sandbox ServiceAccount, places it in the sandbox namespace, scopes sandbox RBAC there, and writes the generated name into gateway.toml.

Update Helm unit coverage, Helm README, gateway/driver docs, architecture notes, and debug-openshell-cluster guidance for the new sandbox ServiceAccount behavior.

Validation: mise run pre-commit; Kubernetes smoke e2e via helm-dev-environment/k3d; Docker smoke e2e; Podman smoke e2e.

Labels

test:e2e Requires end-to-end coverage test:e2e-kubernetes Requires Kubernetes end-to-end coverage


Development

Successfully merging this pull request may close these issues.

security: per-sandbox secret binding for sandbox→gateway RPCs

2 participants