chore(deps): bump qs and express in /api#1
Closed
dependabot[bot] wants to merge 58 commits into
Closed
Conversation
* feat(codeapi): upgrade Python Bun and add chDB * fix(codeapi): prefer packaged runtimes in bash jobs * fix(codeapi): address runtime upgrade review comments * fix(codeapi): guard local Python package marker
ci: add CodeAPI unit test workflow
…ware sessionKey (#1455)
* refactor(codeapi): disambiguate session_id by name
Two distinct concepts shared the field name `session_id` and the
conflation has caused real bugs (most recently the worker overwriting
storage ids with the exec id on response). This PR splits them:
- **`storage_session_id`** — long-lived bucket id where a file's
bytes live in object storage; scoped to the file's owner (skill,
agent, user); used in file-server URLs and on every per-file ref.
- **`execution_session_id`** — transient id for one sandbox `/exec`
invocation; created fresh per call (or reused for continuation);
torn down post-run; used to address an in-flight execution.
The same value sometimes flows through both roles (worker-generated
output files use Job.uuid as the storage prefix, which equals the
execution session id) — but the naming now makes the distinction
explicit at the type system instead of relying on convention.
Wire compat:
- Public API request: `RequestFile` accepts both `storage_session_id`
(canonical) and the legacy `session_id` alias for one cycle;
`validateRequestedFiles` normalizes early.
- Public API response: `FileRef`, `ExecuteResult`, `ProgrammaticResponse`
emit both names for one cycle so old clients keep working.
- Intra-monorepo wires (service-api ↔ sandbox payload, TFile, worker
FileRef): hard rename, single source of truth — both ends ship
together so no compat alias needed.
Crypto- and state-storage compat preserved:
- `ExecutionManifestInputFile.session_id` keeps the legacy name
because changing it would invalidate every signed manifest in
flight; transformer in `execution-manifest-claims.ts` maps from
the renamed `PayloadFile.storage_session_id`.
- `ExecutionState.session_id` keeps the legacy name to preserve
Redis-state deserialization across the rolling deploy; usages
that pass it to renamed APIs map at the call site.
Files touched:
- service/src/types/service.ts: type rename + wire-compat aliases
- service/src/payload.ts: forward new names
- service/src/preamble.ts: programmatic payload builder
- service/src/execution-manifest-claims.ts: payload→manifest
transformer maps storage_session_id ↔ session_id
- service/src/service/file-authorization.ts: validator accepts both
on input, normalizes to storage_session_id internally
- service/src/service/router.ts: rename local exec session vars
- service/src/service/programmatic-router.ts: map state.session_id
to execution_session_id at the createProgrammaticPayload call
- service/src/workers.ts: emit both names on the public ExecuteResult
- api/src/job.ts: TFile, FileRef, Job constructor, walker
- api/src/api/v2.ts: ExecuteRequestBody hard rename
- Tests in api/src and service/src updated to match
Tests: 104/104 worker tests pass (walker, download, job-helpers);
21/21 service tests pass (file-authorization, execution-manifest);
typecheck clean across both packages (only pre-existing unrelated
errors remain — `workers.ts:99` `err: any`).
* feat(codeapi): kind discriminator + tenant-aware sessionKey (Phase C)
Drops legacy `entity_id`-driven scoping in favor of an explicit
`kind` ('skill' | 'agent' | 'user') + optional `version` on every
`RequestFile`, the `/upload` form fields, and the `/sessions/:id`
URL query. Codeapi switches on `kind` to derive the sessionKey:
- skill: <tenant>:skill:<id>:v:<version>
- agent: <tenant>:agent:<id>
- user: <tenant>:user:<authContext.userId>
Cross-user-within-tenant sharing for `'skill'` / `'agent'` is now a
designed property of the kind switch — the user dimension is omitted
from the sessionKey for those kinds — instead of an emergent side
effect of the legacy `entity_id` derivation. Skill `version` is
required (validator-enforced) so an edit naturally invalidates the
prior cache entry under the new sessionKey.
Output buckets for /exec are hardcoded user-private regardless of
input file kinds (`resolveOutputBucketSessionKey`). Skill executions
do NOT produce skill-scoped output buckets — that's a deliberate
behavioral change from the legacy entity_id flow.
Tenant prefix is server-side from `req.codeApiAuthContext.tenantId`.
`TENANT_ISOLATION_STRICT=true` fails closed with a 500 when tenantId
is missing; the default fallback to `'legacy'` is intentional for
single-tenant deploys without an auth tenancy concept. Multi-tenant
deploys MUST set the flag before any tenant is multi-homed.
Drops the `entity_id` field entirely from `RequestFile`, `FileRef`,
`RequestBody`, `ProgrammaticRequestBody`, and the inline payload-file
shape. Drops the `validateEntityId` middleware and the
`validateEntityIdString` helper. Drops the `args.entityId` parameter
from `authorizeRequestedFiles`.
Companion changes (in flight, lockstep deploy):
- LC #12960 sends kind/id/version via the upload form fields and
the typed `CodeEnvRef` discriminated union (version required for
`kind: 'skill'`)
- agents #148 forwards kind/version on `_injected_files` for /exec
Tests:
- resolveSessionKey per-kind shapes
- cross-tenant isolation, within-tenant sharing
- TENANT_ISOLATION_STRICT fail-closed in strict mode, fallback in
non-strict
- validator rejects missing/unknown kind, skill-without-version,
version-on-non-skill
- output bucket regression: cross-user skill execution doesn't
produce a skill-scoped output bucket
- parseUploadSessionKeyInput shape validation
* fix(codeapi): populate codeApiAuthContext + log sessionKey resolution errors
Two issues that surfaced when /exec was hit with a minimal payload
(`{lang, code}` — no files):
1) `req.codeApiAuthContext` is read by `resolveSessionKey`,
`resolveOutputBucketSessionKey`, and several upload/download paths,
but no production middleware was populating it. Only the test
files mocked it, so the unit tests passed while production /exec
threw "resolveOutputBucketSessionKey: authContext.userId is missing"
on every call. Pre-Phase C `resolveSessionKey` had an explicit
`fallbackUserId` parameter that masked this; Phase C dropped the
parameter and made `codeApiAuthContext.userId` the sole source,
exposing the missing wire-up.
Fix: in `apiKeyAuth`, after validating the API key, populate
`req.codeApiAuthContext = { ...existing, userId }`. Spreads any
future tenantId enrichment a downstream middleware might add
(multi-tenant cutover) without overwriting it.
2) The `SessionKeyResolutionError` catch blocks send the error to
the response body but never log server-side. A misconfigured
middleware or malformed kind/version payload would 400/500 with
no log trail — silent in production until a user reports it.
`sendFileRefAuthorizationError` already logs via `logger.warn` on
the same kind of validation rejection; the new sessionKey paths
diverged from that pattern.
Fix: a `sendSessionKeyResolutionError` helper (mirrors the
file-ref helper's return-true-when-handled shape) logs the
rejection with auth/request context — userId from both auth
sources, tenantId, method, path — before sending the response.
Wired at all five catch sites: output bucket on /exec
(router.ts + programmatic-router.ts /exec + /exec-blocking) and
the two parse/resolve sites in `sessionAuth`.
Both fixes are infrastructure-only — no behavior change beyond
"silent 500 → useful log line + identical 500" and "fail-on-load →
work as designed". The unit test surface didn't catch (1) because
every test mocks `codeApiAuthContext` directly; an integration test
that exercised real `apiKeyAuth → resolveOutputBucketSessionKey`
would have. Worth adding in a follow-up.
* fix(codeapi): populate codeApiAuthContext in localAuth middlewares too
Follow-up to 4c012d1. The fix populated codeApiAuthContext in
apiKeyAuth (production), but LOCAL_MODE uses a separate localAuth
middleware (api-server.ts and local-api.ts), which still set
req.apiKey but skipped the new context populate. Result: prod /exec
worked, local-mode /exec still 500'd with the same 'authContext.userId
is missing' — same root cause, different code path.
Both localAuth paths now mirror apiKeyAuth's populate. Same spread-
preserving shape so a future tenantId enrichment lands without
overwriting.
* fix(codeapi): split RequestFile.id into id (storage) + resource_id (sessionKey)
The single `id` field on `RequestFile` was overloaded for two
incompatible purposes:
1. **SessionKey resolution** in `resolveSessionKey` —
`<tenant>:<kind>:<id>:v:<version>` wants the RESOURCE id
(skill `_id`, agent id).
2. **Upload-existence lookup** —
`upload:<sessionKey><storage_session_id><id>` wants the STORAGE
id (the file_server's per-file nanoid registered at upload
time).
For `kind: 'user'` the two values incidentally collapse to the
same userId-derived string, so the conflation went unnoticed in unit
tests. For shared kinds the resource id (skill mongo `_id`) and
the storage id (21-char nanoid) are fundamentally different values,
and every `/exec` authorization with a skill-priming chain failed
with `session_key_mismatch` because the sessionKey re-derivation
used the storage nanoid where the resource id was required:
cached: legacy:skill:69dcf561...:v:59 (signed at upload, resource id)
derived: legacy:skill:ysPwEURuPk-...:v:59 (request, storage id)
## Fix
`RequestFile` now carries both. `id` keeps its existing meaning
(storage file_id — matching the worker-side `TFile.id` storage
URL contract; no churn on the api/ workspace). New `resource_id`
field carries the entity-that-owns-this-file's-session identity for
sessionKey resolution.
`authorizeRequestedFiles` maps at the boundary:
`SessionKeyInput.id ← file.resource_id` (sessionKey input)
`uploadKey suffix ← file.id` (storage existence check)
The manifest's per-file `id` continues to embed the storage id
(no version bump needed for in-flight signed tokens).
## Test plan
- [x] Service tests: 51 / 51 pass across `file-authorization.test.ts`
(incl. new 'rejects missing resource_id' regression),
`session-key.test.ts`, `execution-manifest.test.ts`,
`config.spec.ts`.
- [x] Worker tests: 139 / 139 pass.
- [x] `bunx tsc --noEmit` clean (only pre-existing stripe webhook
error remains, unrelated).
## Coordination
Companion: agents `3.1.80-dev.1` (adds `CodeEnvFile.resource_id`,
ToolNode threads it through `_injected_files`); LibreChat side
sends `resource_id: skill._id` on shared-kind file refs.
* fix(codeapi): accept Mongo ObjectId / agent slug as resource_id
`isValidId` is the 21-char nanoid regex (`/^[A-Za-z0-9_-]{21}$/`)
designed for sandbox-generated ids. Validating `resource_id` with
the same rule rejected legitimate upstream resource identifiers:
- Skills: MongoDB `_id` is 24-char hex (`69dcf561f37f717858d4d072`)
- Agents: LibreChat agent slug is `agent_<11-char-nanoid>` (17 chars,
contains an underscore)
LC's primeInvokedSkills sends the skill `_id` as `resource_id` —
which is exactly what's needed for codeapi's sessionKey re-derivation.
But the validator was 400'ing every shared-kind /exec with
"files[0].resource_id is invalid" before authorization could run.
## Fix
New `isValidResourceId` helper accepts the broader shape
`/^[A-Za-z0-9_.:-]+$/` length-bounded to [1, 128]. Kept liberal
because over-tight format checks become cross-org integration
friction; the sessionKey itself is the tamper-resistance boundary,
this validator just rejects obvious garbage (whitespace, control
chars, unbounded length).
## Test plan
- [x] `bun test src/service/file-authorization.test.ts` — 31 / 31
pass (3 new cases covering 24-char Mongo ObjectId, agent slug,
and whitespace rejection)
* fix(codeapi): enrich file-ref validator rejections with diagnostic context
400 rejections like "files[0].resource_id is invalid" were thrown
without context, so the warn log in `sendFileRefAuthorizationError`
had nothing to spread from `error.context`. Operators couldn't tell
the offending value from the wrong type from a missing field — every
failure mode looked identical in logs.
Each validator throw now carries `{ index, field, type, length?,
value? | sample? }` via `describeValue`. Short strings (≤64 chars,
typical id/slug shapes) inline whole; longer ones are sampled
head…tail so logs stay bounded.
* fix(codeapi): rename /upload response field session_id → storage_session_id
Single-file `/upload` was emitting `session_id` in its response while
`/upload/batch` emitted `storage_session_id` (per the typed
`BatchUploadResponse`). The rename to `storage_session_id` (Phase B/C
disambiguation) was applied to the batch route but missed the single
one — a regression that silently broke LC's chat-attach upload path:
LC reads `result.storage_session_id` → undefined
persists `metadata.codeEnvRef.storage_session_id = undefined`
next /exec primes file with `storage_session_id: 'undefined'`
sandbox: FileNotFoundError
Visible in codeapi logs as both "Invalid session ID: undefined"
(LC's getSessionInfo hitting `/sessions/undefined/objects/...`) and
"files: []" on /exec (LC dropped the file from priming because
`primeCodeFiles` couldn't refresh the stale ref).
Fix: rename the single-route response field to match the batch route.
Also adds an explicit `UploadResponse` interface in `types/files.ts`
mirroring `BatchUploadResponse` so the contract is statically pinned
on both endpoints — this regression slipped past TS because the
single response was an inline object literal with no declared type.
* test(codeapi): lock the sprint's regression surface
Two regression test files covering the bugs that took the longest to
diagnose this sprint:
**`utils.test.ts`** — direct unit tests for `isValidId` and the
sprint-added `isValidResourceId`. Previously only exercised
indirectly via `validateRequestedFiles`, which conflated
"resource_id was missing" with "resource_id was malformed".
The validator pair is the boundary between:
- storage uuids that file_server / codeapi mint themselves
(`isValidId`, strict 21-char nanoid)
- heterogeneous upstream identity formats (skill _id, agent slug,
user id) that flow in via `resource_id` (`isValidResourceId`,
deliberately looser)
Conflating them was the original bug — `isValidId` rejected the
24-char Mongo ObjectId LC sends as the skill resource id, 400ing
every shared-kind /exec. Lock both validators' acceptance/rejection
spectrum so the next refactor can't quietly tighten the looser one
(or loosen the stricter one) without surfacing in the test trail.
**`types/files.test.ts`** — wire-shape contract for `UploadResponse`
and `BatchUploadResponse`. The single `/upload` route was emitting
`session_id` while `/upload/batch` emitted `storage_session_id`,
silently breaking LC's chat-attach upload chain because clients
read `result.storage_session_id`.
Compile-time `Equal<...>` + `Has<...>` type assertions pin:
- both responses' storage_session_id field is `string`
- neither type carries the legacy `session_id` key
- the two routes agree on the field name (defends against
drift in either direction)
Plus runtime `expect(sample).not.toHaveProperty('session_id')`
samples that catch a regression even if the `: UploadResponse`
annotation is removed from the route handler.
## Test plan
- [x] `bun test src/utils.test.ts src/types/files.test.ts` —
19/19 pass (41 expect() calls)
- [x] Full codeapi service suite: 83/83 pass (the 21 lib/api-keys
failures are pre-existing environmental — 401/404 against remote
services, unrelated to this sprint)
* fix(codeapi): propagate sessionKey 500 errors from batch upload (codex P1)
`/upload/batch`'s per-file `resolveSessionKey` catch was downgrading
every failure to a per-file error string (`{ status: 'error', filename,
error }`). For 400-class faults (client supplied bad kind/id, etc.)
that's the right call — partial-success batches are by design. But
for 500-class faults (server misconfiguration like a strict-mode
tenantId gap surfaced by `resolveTenantPrefix`), it masked an
operational breakage as a `200 partial_success` or `400 error`
response instead of the 500 the caller needs to alert on.
Latch the first 500 SessionKeyResolutionError into a `serverError`
flag and convert it into a single batch-level 500 in the
`bb.on('finish')` handler, before the success/partial-success
response is built. Per-file 400 faults continue to flow through the
existing per-file error path.
Codex review on PR #1455.
## Test plan
- [x] `bun test src/{utils,types/files,service/file-authorization,session-key}.test.ts` — 66/66 pass
- [x] `bunx tsc --noEmit` — clean (modulo pre-existing `@librechat/api-keys` import error in unrelated `src/enterprise/router.ts`)
* feat: Verify LibreChat CodeAPI JWTs * fix: Enforce CodeAPI continuation context * fix: Refresh CodeAPI JWT key directory cache * fix: Harden CodeAPI JWT auth config * test: Cover PTC replay auth state * fix: Preserve JWT plan context * fix: Normalize JWT key directory errors * fix: Enforce CodeAPI JWT startup contract * chore: Support local CodeAPI JWT env setup * fix: Respect single-tenant CodeAPI auth mode * fix: Log unmatched CodeAPI routes * fix: Summarize CodeAPI execution logs * fix: Reject empty legacy auth config
* feat(codeapi): add sandbox egress gateway * fix(codeapi): address egress review findings * chore(codeapi): quiet egress health probe logs * fix(codeapi): resolve egress gateway for microvm * fix(codeapi): pass sandbox manifest verifier * fix(codeapi): refresh egress grants at dispatch * fix(codeapi): force ptc callbacks through gateway * fix(codeapi): require egress gateway for sandbox charts * fix(codeapi): bound ptc callback grant ttl * fix(codeapi): harden ptc limits and network policy defaults * fix(codeapi): remove egress grant ttl cap * fix(codeapi): carry sandbox grants in execute body * fix(codeapi): handle decoded egress route handles * fix(codeapi): require asymmetric sandbox manifests * fix(codeapi): allow scoped redis network egress * fix(codeapi): preserve manifest hmac fallback * fix(codeapi): harden egress and tool-call scope * test(codeapi): cover egress gateway routes * fix(codeapi): tighten egress restore scope
* feat(codeapi): harden sandbox egress isolation * fix(codeapi): address egress gateway review findings * fix(codeapi): harden egress gateway review gaps * fix(codeapi): quiet expected queue listener bursts * fix(codeapi): normalize sandbox forward target ports * fix(codeapi): gate egress grants to hardened mode * fix(codeapi): isolate Redis ledger mutations * fix(codeapi): accept legacy egress grants during rollout * fix(codeapi): preserve gateway egress outside strict mode * fix(codeapi): preserve dirkeep and empty tool scopes * fix(codeapi): fail closed for orphan egress secrets * test(codeapi): cover extensionless build file outputs * fix(codeapi): harden egress grant edge cases * fix(codeapi): isolate sandbox retry workspaces * fix(codeapi): keep sandbox workspace ids short * fix(codeapi): allow manifest public key fallback * fix(codeapi): require sandbox manifest public key env * fix(codeapi): restore token-only egress results * fix(codeapi): restore token-only worker results * fix(codeapi): gate egress grants explicitly * fix(codeapi): require gateway internal auth * fix(codeapi): rollback only reserved uploads * fix(codeapi): clean up blocking setup failures * fix(codeapi): tune sandbox throughput concurrency
* refactor(codeapi): migrate package root to pkgs * fix(codeapi): harden sandbox kernel attack surface * fix(codeapi): preserve legacy package directory fallback
* Harden sandbox workspace isolation * Fail fast without per-job workspace ownership * Preserve rootless legacy workspace access * Harden workspace cleanup failure handling * Recover retained sandbox UID slots after cleanup * Release sandbox UID when workspace allocation fails * Avoid following symlinks when collecting outputs * Verify sandbox workspace write in smoke test * Tune CodeAPI sandbox chart timeouts
* ci(codeapi): use non-fips runner for image builds * feat(codeapi): split worker autoscaling controls * fix(codeapi): autoscale sandbox runner only * fix(codeapi): harden split worker replica migration
feat(codeapi): add secure sandbox metrics and autoscaling
* feat(codeapi): add kvm device plugin support * fix(codeapi): Gate KVM Plugin Readiness On Kubelet Registration
fix(codeapi): raise KVM launcher nofile limit
fix(codeapi): set KVM guest nofile rlimit
…jail 2026-05-08 (#1607) * chore(codeapi): block AF_ALG sockets at outer seccomp profile CVE-2026-31431 ("Copy Fail") exploits AF_ALG + splice() for a controlled 4-byte write into the page cache, enabling container escape. The inner Kafel policy already returns EPERM for socket(AF_ALG) on user code, but the outer container-runtime profile (seccomp/nsjail.json and the inline copy in helm/codeapi/templates/seccomp-profile.yaml) allowlisted socket() unconditionally - so host-side processes (libkrun, launcher, supervisor, sandbox API, nsjail) and the no-KVM fallback path had no defense-in-depth against AF_ALG. Replace the unconditional socket allow with an arg-filtered allow (SCMP_CMP_NE value=38) so AF_ALG falls through to the profile's defaultAction (SCMP_ACT_ERRNO, EPERM). Verified via docker run --security-opt seccomp=<profile> probe: AF_ALG -> EPERM(1), all other families still reachable. The two-rule shape (EQ -> ERRNO 97, ALLOW) suggested by some published mitigation notes does not compile under runc/libseccomp - mixing different actions for the same syscall returns EEXIST even with mutually-exclusive arg filters. The fall-through-to-default approach is the portable OCI-compatible expression. * chore(codeapi): bump Fedora 41 -> 43, default KVM-mode pods to RuntimeDefault seccomp Fedora 41 reached EOL on 2025-11-19. The launcher and worker-sandbox images had been receiving no security updates for their base layer (libkrun, openssl, util-linux, etc.) since then. Bumping to Fedora 43 (current active release) also picks up libkrunfw ~5.2.0 in place of ~4.10.0, which bumps the bundled libkrun guest kernel from 6.12.34 to 6.12.76 - nine months of stable-tree backports including the io_uring UAF range (CVE-2025-38106, CVE-2026-1042). Verified end-to-end via docker compose -f docker-compose.yaml: api -> service-worker -> sandbox-runner path returns python 3.14.4 / guest_kernel 6.12.76 / numpy 2.4.4 + pandas 2.3.3 work / AF_ALG, AF_INET, AF_NETLINK, egress all blocked / hardened-mode manifest enforcement returns 401 on direct runner bypass. Helm chart: KVM-mode worker-sandbox pods previously set no seccompProfile at all, leaving them on whatever the cluster default was (typically Unconfined). Default them to RuntimeDefault, which covers the existing workload (matches what Docker applies in compose) without requiring node-side seccomp profile installation. Custom NsJail seccomp profile is now exposed via workerSandbox.seccomp.enabled in values.yaml (still default false). When true, both KVM and non-KVM-mode pods reference Localhost: profiles/nsjail.json, which must be node-installed first per seccomp/README.md. The previous template silently rendered the ConfigMap when seccomp.enabled was set but never wired it into the pod, so this also closes that gap. * chore(codeapi): drop io_uring_* from outer seccomp profile io_uring is the source of recurring container-escape and LPE CVEs (2025-38106, 2026-1042 UAF, plus several earlier UAF/double-fetch issues). Nothing in codeapi explicitly opens io_uring contexts - neither Bun (host-side sandbox API), the Rust launcher/supervisor, nor nsjail/libkrun depend on it; they all use epoll/eventfd from the existing allow-list. Verified with 8-way concurrent mixed-runtime load (py + bun-js, 1MB allocations each) plus full smoke matrix: no fallback noise, no EPERM/seccomp denials, latencies unchanged. Removing the three io_uring_* names shrinks the host-side attack surface across kernel io_uring code paths (request setup, register, submission ring), which the inner Kafel policy already blocks for user code under nsjail. The outer profile change covers the remaining host-side processes and the no-KVM fallback path. * chore(codeapi): bump nsjail pin to 66ad78dc (2026-05-08) Old pin (dccf911f) was from 2021-11-24 - 4.5 years stale. The new pin brings four substantive upstream fixes since the previous checkout: - 66ad78dc (2026-05-08) unotify: ignore ENOENT/EINPROGRESS log noise - 0c95a1f7 (2026-05-07) unotify: fix race + spurious POLLHUP hangs; this is the major fix - the old code reused stale req.id with isTargetAlive across separate threads, causing hangs and forcing GODEBUG=asyncpreemptoff=1 for Go workloads. Also reworks send-error handling (EINTR retry, ENOENT continue, otherwise log+continue) - 9853352c (2026-05-05) net: fix double-free of nl_cache in initParent error paths (latent UAF in our network namespace setup) - 5b52bf10 (2026-05-05) Fix ESRCH in sched_setaffinity due to glibc TID caching with clone3 Build deps already satisfied (bison/flex/libprotobuf-dev/libnl-route-3-dev/ protobuf-compiler/pkg-config); new Makefile needs C++20 which bookworm gcc 12 supports. Kafel submodule auto-fetched via explicit recursive init plus the existing Makefile kafel_init rule. New nsjail compiles the unotify/ and nstun/ subtrees added upstream. Verified end-to-end on the canonical compose stack with the custom outer seccomp profile applied: - sandbox-runner NsJail self-smoke passes on boot - /v1/exec works across py/bun-js/node/bash runtimes (latencies unchanged from old pin) - isolation matrix unchanged (AF_UNIX reachable, AF_INET/NETLINK/ALG blocked, egress blocked) - memory limit fires (Python MemoryError on 2GB alloc) - 10-way concurrent mixed-runtime stress completes in 3.2s with no errors and no unotify/kafel/fatal log noise - hardened mode still enforced (no-manifest -> HTTP 401)
* feat(codeapi): bake language packages into sandbox-runner image Adds packages.source: image mode that pulls runtime packages from a baked sandbox-runner-baked-<version> image instead of a ReadWriteOnce PVC. Image mode skips the PVC, the package-init Job, and the same-node podAffinity, letting sandbox-runner pods spread across nodes and AZs for KEDA autoscaling. Default remains packages.source: pvc for backwards compatibility; targeted for sunset once image mode has run in prod for 30 days. * fix(codeapi): gate scale-down hook on packages.source=pvc, bump build timeout - package-init-scale-down-worker hook could still render in image mode if forceRebuild and scaleDownWorker.enabled were left at their PVC-era values, even though the PVC and init Job no longer render. Add the same packages.source == "pvc" guard. - Bump build-docker timeout from 30m to 60m. PR checks skip Docker builds, so the first sandbox-runner-baked build on a cold cache runs at merge time and needs headroom for Python compile + pip install.
fix(codeapi): preserve custom affinity in baked package mode
…t race (#1617) * fix(codeapi): serialize NsJail mount-setup to close /tmp/nsjail.0.root race Concurrent NsJail launches inside one sandbox-runner microVM raced on the host-shared `/tmp/nsjail.<orig_uid>.root` setup directory (root runner -> orig_uid=0), surfacing as `nsjail exit 255`, workspace `ENOENT` under /mnt/data, and `EPERM chown '/tmp/sandbox'` follow-on failures during burst load. NsJail's mkdir+mount+chroot of that path is tolerant of EEXIST, so both jails happily reused the same directory while only the tmpfs over it was namespace-private. Introduce an in-process FIFO setup gate that serializes only the unsafe window: spawn() -> NsJail's `[I] ... Executing "<bin>"` log marker (the last line before execve, after all mounts complete). Once the marker appears (or a 1.5s watchdog fires) the gate releases and the inner job runs unmutexed alongside siblings, so per-pod throughput is preserved. - New `nsjail-setup-gate.ts` module with seamed fs/clock for testing. - `nsjail.execute()` routed through the gate; injectable per call site. - Watchdog fires emit `logger.warn` plus a new Prometheus counter `codeapi_sandbox_nsjail_setup_gate_watchdog_fires_total` so a regression (repeat watchdog hits = setups overlapping again) is visible to oncall. - Unit tests cover marker release, FIFO under 64 concurrent acquirers, watchdog backstop, leading-space marker discipline (`Executing` at column 0 in user output does not trip), spawn-exception cleanup, and non-ENOENT readFile error handling. - Adds `services/codeapi/tests/concurrent_jobs.py` reproducer that burst- fires N parallel bash jobs and tallies the symptoms above; meant to be run against the KVM-enabled local stack before/after the fix and retained as a regression test. * fix(codeapi): hold NsJail setup gate until watchdog on log read errors Review feedback. Two correctness issues from the initial PR: 1. Gate failed open on non-ENOENT readFile errors. The previous pollForMarker rethrew, and runSetup's `.catch(() => false)` released the gate immediately — defeating the gate's purpose during exactly the kind of transient fs hiccup (EACCES on a chmod race, EISDIR if the path got clobbered, EIO) where overlapping NsJail setup is most dangerous. The poll loop now treats any read error as "marker unavailable", retains the last error, and releases only at the watchdog. nsjail.execute() logs the retained code/message so the warn line is diagnostic instead of a bare timeout. New test asserts the gate is held for the full watchdog window under a persistent EISDIR. 2. Reproducer would PASS on a 2xx with a non-JSON body. The runner always emits structured JSON on the happy path, so a plain-text/HTML 2xx (proxy interstitial, malformed gateway response, etc.) silently succeeding masks real regressions. Added http_2xx_non_json category that covers both unparseable bodies and parseable bodies missing the run/compile envelope; counted as a failure in the exit code. * fix(codeapi): ignore ENOENT in NsJail setup-gate pollError tracking Review feedback. ENOENT during pollForMarker is the expected pre-flush state (NsJail has not yet written its `--log` file), not a diagnosable failure. The previous loop captured every readFile exception as lastError, so the watchdog warn line would surface ENOENT on a perfectly ordinary slow-flush case and confuse oncall about whether the fs is actually unhealthy. Treat ENOENT as "not yet" and skip lastError tracking for that code; only EACCES / EISDIR / EIO / etc. are real diagnostics. A non-ENOENT error followed by ENOENTs preserves the original diagnostic rather than overwriting it with the benign "not yet" signal. Two new tests pin the behavior: ENOENT-only polling yields pollError=undefined, and an EACCES-then-ENOENT sequence still surfaces EACCES at the watchdog.
…unt API, AF_VSOCK) (#1623) fix(codeapi): harden NsJail seccomp policy per audit findings Seccomp audit against the running sandbox surfaced gaps where syscalls passed the BPF filter and reached the kernel even though they are attacker-interesting. The kernel currently rejects them due to missing caps / config, but that's not defense-in-depth — a kernel config change would re-open the surface silently. Block them at the BPF layer. Additions to the KILL list: - setns: namespace-join (other side of the unshare surface we already block). Audit observed it returning EINVAL from the kernel, not SIGSYS, meaning seccomp never saw it. - move_mount, open_tree, fsopen, fsmount, fspick: the Linux 5.2+ new mount API. Orthogonal to mount(2), which we already block; the new API can replicate bind mounts and would slip through. - settimeofday, adjtimex, clock_adjtime, syslog: kernel time mutation and ring buffer. - ioperm, iopl, modify_ldt, lookup_dcookie (x86 only): ring-0-adjacent surfaces and a deprecated profiling syscall. socket(domain) filter extension: - AF_VSOCK (40) added alongside AF_INET/AF_INET6/AF_NETLINK/AF_KEY/ AF_RXRPC/AF_ALG. The audit observed a VSOCK socket() succeed and connect() hang instead of returning ENETUNREACH — vsock reaches the host hypervisor on KVM-based runners and should not be reachable from sandboxed code. Implementation notes: - All new syscalls get explicit #defines in nsjail.ts so the policy does not rely on the symbol table of the NsJail-bundled Kafel snapshot. Arch-specific symbols (ioperm/iopl/modify_ldt, lookup_dcookie) are gated behind a process.arch check; an empty arch slot is filtered before the policy is joined so arm64 doesn't ship a blank Kafel line. - The audit's io_uring finding is already addressed by the existing ERRNO(1) clause for io_uring_setup/enter/register — no change there. Verification: - bun test src/nsjail.test.ts -> 9 passes (7 new policy-regression tests + 2 pre-existing argument tests). Tests cover: setns KILLed, new mount API KILLed with explicit NRs defined, AF_VSOCK in the socket(domain) clause, defense-in-depth batch KILLed, x86-only syscalls arch-gated, regression guard on the previously-blocked surface, and no blank Kafel lines on arm64. - Built sandbox-runner image and ran the rendered policy through `nsjail --seccomp_string` under chroot. NsJail's preparePolicy() accepted the policy and progressed to runChild() (which fails on unrelated docker privilege grounds — irrelevant here). Confirmed with a control case using an invalid identifier, which fails at preparePolicy() with "Undefined identifier" as expected. - Full end-to-end smoke against the docker-compose stack with KVM is the reviewer's step; this environment has no /dev/kvm. Adds services/codeapi/api/scripts/dump-seccomp-policy.ts so anyone can re-render the policy out-of-band and re-run the chroot-nsjail validation without spinning up the full stack. Lives under scripts/ so the build never picks it up.
* fix(codeapi): harden tool-call socket proxy
* fix(codeapi): align tool socket cap with concurrency
* fix(codeapi): cap active tool socket requests
* fix(codeapi): exit tool socket proxy on early signal
* fix(codeapi): bound tool socket active lifetime
* fix(codeapi): run tool-call socket proxy under Node, not Bun
The proxy's DoS defenses (maxConnections, idleSocketTimeoutMs,
activeSockets accounting) all depend on `server.on('connection', ...)`
firing and `socket.setTimeout` reliably arming on accepted unix-socket
connections. Empirically, Bun 1.3.x:
- Never fires `'connection'` on http.createServer (probed: 0 events
for 5 client connects). The accept-time tracking that gates
maxConnections silently no-ops; activeConnections() always reads 0;
the existing in-process tests pass vacuously.
- Does not honor http.Server.headersTimeout for unix-socket clients.
- Bun.serve idleTimeout does not close unix-socket TCP that has never
sent bytes.
- Bun.listen socket.timeout works but treats client.end() as full
close, breaking HTTP request/response semantics.
End-to-end probe of THIS branch's proxy hosted by Bun:
200 silent connections opened -> 0 closed after 3.5s
(200 FDs pinned for the audit's 60s headersTimeout window).
Same proxy hosted by Node 22 inside Linux Docker:
100 silent connections attempted -> 64 admitted (cap enforced) ->
0 alive after 4s (idleSocketTimeoutMs=2000ms enforced).
Concurrent legit POST during a 32-socket storm: 200 in 3ms.
The proxy code itself does not change: this commit only switches the
runtime so its existing logic actually executes as designed.
- Dockerfile: COPY /usr/local/bin/node from node:22-slim into the
sandbox-build stage (~30 MB; smaller than apt installing nodejs).
- api/package.json `build`: also emits the proxy as a Node-target
CommonJS bundle at .build/tool-call-socket-proxy.cjs via bun build.
- api/src/entrypoint.sh: runs node on the built bundle instead of
bun on the .ts source.
- Adds tool-call-socket-proxy-runtime.test.ts: spawns the BUILT
bundle as a real `node` subprocess, then drives it via raw unix-
socket clients. Verifies actual FD reclamation under the silent-
connect storm and 429 backpressure under the active-flood storm.
Skipped on Windows (unix-socket EACCES quirk in temp dirs;
production is Linux) and when `node` or .build/ is missing.
- .gitignore: ignore services/codeapi/api/.build/ (build artifact,
produced at image build time).
* fix(codeapi): close remaining proxy circumvention vectors
Threat-modeled the proxy after the runtime switch to Node and found
five attacker-reachable surfaces that the previous shape did not
fully close. Each is now both fixed in the proxy and pinned with a
runtime regression test that spawns the BUILT bundle as a real Node
subprocess (so the assertion exercises production parity, not Bun
in-process behavior).
1. HTTP request smuggling (CL+TE). A request carrying both
Content-Length and Transfer-Encoding is now rejected with 400
before any state is allocated, per RFC 7230 §3.3.3. Without this
the proxy and upstream could disagree on body boundary, letting
an attacker prepend bytes to a synthesized "second request" that
bypasses the path filter.
2. Hop-by-hop header pollution. Previously only `proxy-connection`
was stripped from the forwarded request; `Connection`, `Keep-
Alive`, `Upgrade`, `TE`, `Trailer`, `Transfer-Encoding`, `Proxy-*`,
and even `Host` were forwarded verbatim. RFC 7230 §6.1 forbids
forwarding hop-by-hop headers — a proxy MUST strip them. Also
strips them on the response path so an upstream that emits
Connection: keep-alive cannot keep our short-lived sandbox
connection open, and the sandbox sees only the proxy's chosen
`Connection: close`.
3. Slow-loris body upload. socket.setTimeout / req.setTimeout /
server.requestTimeout are all idle-based or (per Node 22 probe)
simply do NOT fire mid-body for unix-socket clients. A malicious
client dripping one byte every 800ms could stretch upload to
activeRequestTimeoutMs (default 30s) and pin a forwarder slot.
Fixed with an explicit setTimeout that fires unconditionally
`requestBodyTimeoutMs` after the request handler runs and is
cleared once the body is received.
4. Upstream pinning by partial-body forwarding. The previous shape
used `req.pipe(upstream)`, so the upstream's request handler
fired as soon as the FIRST chunk arrived — meaning slow-loris
pinned an upstream FD even when the proxy bounded its own
socket lifetime. Now the proxy buffers the body to memory
(capped at maxBodyBytes), only opens the upstream connection
when the full body has been received, and writes it in one
shot. Tool-call payloads are small JSON, so the buffering
latency is negligible.
5. Honest-oversize request hitting the upstream. A client that
declares Content-Length > maxBodyBytes used to start streaming
bytes through the pipe before the runtime byte counter fired.
With buffering this no longer matters, but a Content-Length
precheck still rejects with 413 immediately, before opening
any upstream connection at all.
Plus a new test for HTTP/2 binary framing — the proxy must reject
unknown protocol bytes cleanly without crashing, since a crash would
brick every subsequent tool call until the runner restarts.
Test coverage now includes 10 runtime tests against the BUILT bundle:
forwarding, silent-storm DoS, post-storm recovery, smuggling,
hop-by-hop, slow-loris, HTTP/2 framing, oversize body, pipelining
path-filter integrity, and active-flood 429s. All pass under Node
22 inside Linux Docker; existing in-process tests (7) also still
pass for a total of 17/17.
The pipelining test was rewritten: the previous assertion that
exactly ONE upstream call happens was unrealistic (node:http may
parse and dispatch a second pipelined request before the first
response's Connection: close fires). The security property we
actually want is that the path filter holds — both pipelined
requests still go through it, so /admin (or any non-/tool-call path)
NEVER reaches upstream. The test now asserts that.
* fix(codeapi): enforce proxy runtime regressions in CI
* fix(codeapi): defensive return after process.exit in shutdown + SIGTERM-during-init test
* fix(codeapi): close inherited FDs in spec-guard and hide PTC route
The audit's "permanent sandbox brick" reproduces because spec-guard
execvp()'s the user command without closing FDs >= 3. Anything the
runner / proxy / NsJail leaks (a 500-connection stress test against
/tmp/tcs.sock is sufficient) inherits into the next user process. The
victim is the dynamic loader on subsequent /api/v2/execute calls:
python3: error while loading shared libraries: libc.so.6: cannot
close file descriptor: Error 24
spec-guard is the unconditional execvp() choke point for every
sandboxed command, so closing inherited FDs there is the cheapest
containment regardless of which producer leaked them. Three fallbacks:
close_range(3, ~0U, 0) (Linux 5.9+; one syscall, fastest), then
/proc/self/fd walk (only closes actually-open FDs), then a
RLIMIT_NOFILE-bounded loop. Preserves stdin/stdout/stderr.
Also generalize the egress-gateway /tool-call missing-PTC-headers
response. Returning `400 {"error":"Missing required PTC headers"}`
told a sandbox attacker probing paths exactly which path was the
real route and that it gated on header-based auth — which is the
"you found the right path, now figure out the auth header names"
step in the audit's attack chain. Match the generic 404 every other
unknown path returns; keep server-side log for ops visibility.
End-to-end verification against the KVM-launched stack (libkrun
microVM, baked sandbox-runner, real Python 3.14 / Node 24 / Bash 5.2):
step 1: open 64 socket FDs in a first job -> opened 64
step 2: launch a second job, report inherited FDs -> fd_count=4
step 3: post-storm recovery for python/node/bash -> all code=0,
no Errno 24
/tool-call probed with no headers -> 404 Not found
(was 400 + leak)
The regression test (services/codeapi/tests/specguard_fd_isolation.py)
runs the full storm -> fd-count -> multi-language recovery scenario
through the live runner; it specifically asserts `Error 24` does not
appear in stderr.
* fix(codeapi): make /tool-call 404 wire-indistinguishable from unknown paths
Two follow-up findings on top of the prior commit. Both are about the
PTC route still being fingerprintable, and one is about the regression
test silently degrading.
(1) Body matched but headers leaked. The proxy's own 404 emits the
minimal `Content-Type: text/plain` + `Connection: close` framing,
while the egress-gateway 404 (returned for /tool-call when PTC
headers are missing) carried Express's auto-frame: ETag,
X-Request-ID, lowercase header casing, charset=utf-8,
Content-Length, etc. Even though I'd matched the 9-byte body
`not found`, a sandbox attacker could still tell which path was
the real route by reading the headers.
Fixed in the proxy: when the upstream returns 404, replace the
response with the proxy's own 404 framing instead of forwarding
upstream's. The proxy already emits the desired generic shape for
unknown paths; reusing that path makes /tool-call wire-identical.
Legitimate PTC traffic (SDK-formed) never receives 404 from the
upstream because the SDK always sends the required headers, so
the masking is invisible to the happy path.
The egress-gateway response is also lowercased (`not found`) to
match the proxy's body byte-for-byte even if a future change
bypassed the proxy mask.
(2) The storm-job assertion in the regression test was `if "opened"
not in stdout`, which would have passed even if the storm opened
0 sockets — the test would silently stop exercising the FD-
poisoning preconditions and report a vacuous PASS. The script
now parses the JSON status line, requires `opened >= 32`, and
surfaces any OSError from the storm so a future regression in
socket-cap behavior shows up immediately.
Verification against the live KVM stack:
step 4: PTC route opacity (byte-identical 404 wire responses)
[ok ] /tool-call response is byte-identical to /randomroute (118 bytes)
The proxy now strips upstream 404s entirely; both paths produce the
same Connection/Content-Type/Content-Length and 9-byte body, leaving
no structural signal an attacker can read.
* test(codeapi): use audit's exact 500-leak script in spec-guard regression
* ci(codeapi): compile + smoke spec-guard on every PR
spec-guard.c is otherwise only compiled by api/Dockerfile, which the
helm workflow builds AFTER merge to main. A syntax error or behavior
regression in the FD-cleanup logic would slip through PR CI today and
land in the post-merge image build.
Added a tiny pre-merge step on the existing api-unit-tests job:
- gcc -O2 -static -Wall -Wextra -Werror -o /tmp/spec-guard spec-guard.c
- parent inherits 5 extra FDs via shell exec redirections; spec-guard
must execvp /bin/ls /proc/self/fd and the child must see <= 5 FDs
(proves close_range / /proc/self/fd walk / rlimit fallback chain
actually runs and reduces inherited descriptors).
Mirrored locally in a gcc:bookworm container:
parent fds (with 5 extra inherited): 9
child fds via spec-guard: 4
OK
Adds ~3s to PR CI on the api-unit-tests job. Ubuntu runners ship with
gcc, no setup needed.
* fix(codeapi): keep storm sockets alive + filter PTC headers at proxy boundary
Codex review on #1639 caught two real bugs.
(1) The storm regression was vacuous. The Python loop did
`sk = socket.socket(...); sk.connect(...)` without retaining sk, so
CPython's refcount closed each socket immediately on the next
iteration. The job reported `opened == 500` but only ONE FD was ever
actually held — the FD-poisoning preconditions the test exists to
exercise were never created.
Fixed: append every socket to a `leaked` list, AND read
/proc/self/fd to assert `concurrent_fds >= MIN_STORM_OPENED`. The
second assertion is what defends against this class of
"vacuous-pass" bug going forward — relying on the loop counter
alone wasn't enough.
After the fix: storm reports `{"opened": 500, "concurrent_fds": 504}`
— sockets are genuinely held concurrently in /proc/self/fd.
(2) The proxy's blanket upstream-404 mask broke legitimate errors.
Reverted. The previous shape rewrote ALL upstream 404s into a
generic plain-text "not found" so route opacity held — but
`handleToolCall` legitimately returns 404 for "Session not found"
and similar conditions, and the SDK preamble does
`json.loads(raw)`. With the mask, that raises JSONDecodeError and
the SDK surfaces "Request error: Expecting value:" instead of the
actual reason.
Fixed: filter PTC-header presence at the proxy boundary BEFORE
forwarding. Missing-header probes get the canonical proxy 404 and
never reach upstream (route opacity preserved). Authenticated
requests pass through and any upstream 404 carries its structured
body intact (json-parseable by the SDK).
The proxy only checks PRESENCE; cryptographic validation still
happens at the upstream. An attacker probing with garbage header
values gets forwarded and receives the upstream's structured error
— they've already revealed themselves by sending non-empty headers,
so route opacity isn't the goal in that path.
Live verification on real KVM stack:
no_headers → 404 "not found" (proxy direct, byte-identical to /unknown)
bogus_headers → 400 {"error":"Egress token is malformed"}
(upstream structured response preserved)
unknown_route → 404 "not found" (proxy direct)
Plus existing regression suite still green:
step 1 storm: opened=500 concurrent_fds=504
step 2 inherited: fd_count=4
step 3 recovery: python/node/bash all code=0
step 4 opacity: /tool-call byte-identical to /randomroute (118 bytes)
* test(codeapi): supply PTC headers to test requests + new presence-filter regression
The proxy's new PTC-presence filter rejects /tool-call requests that
lack the SDK-supplied X-Execution-ID / X-Tool-Call-ID / X-Callback-Token
headers (canonical 404, never forwarded). The existing tests POST'd
without these headers and now legitimately get 404 — CI caught this on
my last push.
Fixed by:
- Adding a `PTC_HEADERS` constant in both test files
- Spreading it into all helper + inline POST sites that intend to
exercise the forwarding path
- Adding the headers to the raw-bytes POST in the slow-loris and
pipelined tests
- New explicit regression test that asserts the filter behavior:
no headers -> 404 + body 'not found' + ct text/plain + upstream
NEVER called; with headers (any values) -> 200 + upstream called
19 / 19 pass under Node 22 in Linux Docker.
* fix(codeapi): close duplicate-empty-header bypass of PTC presence filter
Codex caught this on the previous commit. Node joins duplicate request
headers with `, ` (per RFC 7230 §3.2.2). My naive `!req.headers[name]`
check on `X-Execution-ID:` sent twice with empty values produced the
truthy string `", "`, slipped past the gate, and let the attacker
reach the upstream — whose structured auth-error response would
defeat the route-opacity goal exactly as if the filter weren't there.
Fixed: hasPtcHeaderValue() normalises (joins arrays, strips commas
and whitespace) before the empty check. Single empty `X-Foo:`,
duplicate empty `X-Foo: / X-Foo:`, and array-form `['','']` all
collapse to `''` and get rejected. Real values like `'abc'`,
`'a, b'`, or `['x']` pass through.
Regression: new test sends each required PTC header twice with empty
values via raw bytes and asserts:
- response status is 404
- body is the canonical "not found"
- no Express framing (no X-Request-ID, no ETag) — proves we did NOT
reach the upstream
- upstream.calls() == 0
20 / 20 pass under Node 22 in Linux Docker.
…t (#1648)
* fix(codeapi): seccomp PID-1 + /mnt/data noexec + drop guest curl/socat
Four hardening follow-ups from the post-#1639 sandbox audit:
1. Seccomp ERRNO(1) now denies kill/tkill/tgkill/rt_sigqueueinfo/
rt_tgsigqueueinfo targeting PID 1 of the sandbox PID namespace
(the NsJail monitor). Also blocks kill(0) / kill(-1) fan-outs,
pidfd_open(pid==1), and pidfd_send_signal entirely (it can't be
filtered by destination since it takes a pidfd, and a pidfd to
PID 1 can be obtained via openat("/proc/1") on Linux 5.4+).
EPERM not SIGSYS so probes like os.kill(1, 0) get the kernel's
standard error shape.
2. /mnt/data is now bound via a per-job NsJail cfg overlay (base
sandbox.cfg + a rendered mount block) carrying noexec, nosuid,
nodev. Closes the audit's `chmod +x /mnt/data/x.sh && ./x.sh`
pattern. The CLI -B form has no syntax for those flags, hence
the per-job cfg file (cleaned up alongside the per-job log).
3. socat and curl removed from the sandbox guest rootfs (Debian
stage in api/Dockerfile and docker/Dockerfile.worker-sandbox).
Outer Fedora stages keep them — supervisor.sh / launcher need
them; the guest never did.
4. TOOL_CALL_SOCKET env var no longer exported into the jail. The
path is fixed at /tmp/tcs.sock and the runtime preamble now
references it as a literal, gated on os.path.exists() so the
TCP fallback for legacy deployments still works. Stays in
RESERVED_ENV_KEYS as defense-in-depth.
Cross-tenant impact of (1) is zero — clone_newpid already isolates
PIDs to the user's own namespace — but blocking the supervisor's
PID is principled defense.
* fix(codeapi): probe AF_UNIX socket at preamble import, not via path-existence
Codex review finding (P2) on PR #1648: the previous gate
`if os.path.exists('/tmp/tcs.sock')` was user-spoofable in legacy /
no-proxy deployments — user code could `open('/tmp/tcs.sock', 'w')`
and the next tool call would attempt the unix transport (which then
fails on connect) instead of falling back to TCP via _CALLBACK_URL.
Fix: render a one-shot probe that actually opens an AF_UNIX socket and
calls connect(). A real listener succeeds; a regular file returns
ENOTSOCK; a stale or absent path returns ECONNREFUSED / ENOENT — all
caught by the OSError except clause, falling through to TCP. The probe
runs at module load time of the preamble (before user code), and the
verdict is cached in `_USE_TOOL_CALL_SOCKET`, so user code planted
later cannot influence the gate.
Also fixes an unrelated latent bug introduced in the parent commit:
the comment containing backticks (`os.environ`) prematurely closed
the TS template literal in generatePreamble, breaking
`bun build src/api-server.ts`. CI didn't catch it because no test
imports the runtime export of preamble.ts. Replaced with plain text.
New tests in preamble.test.ts cover: probe is connect-based not
existence-based, probe runs at module load (cached, not per-request),
TCP fallback path is preserved, env var stays out of the rendered
preamble.
* fix(codeapi): rate-limit tool-call socket connection accepts (kernel slab pressure)
Auditor reproduced an intermittent EMFILE failure pattern after a
500-AF_UNIX-connect()-in-50ms flood: dynamic-linker open() calls in
subsequent jobs (node loader, nsjail's own startup, python module
discovery) sporadically fail with errno 24, even though
/proc/sys/fs/file-nr shows the system at 0.1% fd utilization.
The errno is misleading — close()/open() in glibc's ld.so map several
distinct kernel paths to the same "cannot close file descriptor: Error
24" diagnostic. The actual exhausted resource is the in-VM kernel's
unix_sock + dentry slab. Each AF_UNIX accept allocates a unix_sock
struct (with associated dentry, inode, skb queues); the proxy's
existing maxConnections cap bounds the application-level activeSockets
set but not the rate at which the kernel creates these allocations.
On a memory-tight microVM, 500+ slab allocations in a single tick
exceeds the slab allocator's per-cgroup budget transiently, and the
next syscall that needs kmem (loader's openat() / mmap()) fails.
Fix: token-bucket rate limiter at the proxy's `connection` event.
Burst capacity = 64 (well above maxActiveRequests=16 to cover any
legitimate concurrent tool-call burst), refill = 20/sec sustained
(the SDK opens one connection per tool call; even an aggressive agent
rarely exceeds a handful per second). When the bucket is empty,
destroySocket fires synchronously so the connection never enters
activeSockets — the kernel still allocates briefly for the accept,
but the slab entry is reclaimed in the same event-loop tick instead
of accumulating with the rest of the burst.
Coverage:
- createTokenBucket: 4 unit tests (drain, refill, cap-doesn't-exceed-burst,
refill=0 freeze for determinism). Pure logic, runs under bun:test.
- Integration: 2 runtime tests spawning the built proxy as a real Node
subprocess (the bun:test in-process harness doesn't fire 'connection'
events — known limitation documented at the top of the runtime test
file). Verifies (a) 100-parallel-/tool-call burst with burst=8 drops
≥80%, with upstream call count bounded to accepted-only; (b) bucket
refill restores legitimate traffic after a flood.
- Operator-facing knobs: TCS_CONNECTION_RATE_BURST and
TCS_CONNECTION_RATE_REFILL_PER_SEC env vars on the proxy CLI.
- Observability: connectionRateTokens() / connectionRateDropped()
on the proxy handle for tests + future metrics export.
Doesn't change the existing connection / request / body caps —
those still bound the absolute resource ceilings. The rate limit
specifically targets the burst-accept slab-pressure mode.
* fix(codeapi): scale connection-rate defaults with SANDBOX_MAX_CONCURRENT_JOBS
Codex P2 finding on PR #1652: the new rate limiter defaulted to a fixed
20 accepts/sec even when maxConcurrentJobs was scaled up via env. So a
deployment with SANDBOX_MAX_CONCURRENT_JOBS=64 (whose maxConnections
auto-scales to 256 and maxActiveRequests to 64) would still see new
tool-call connections rejected past 20/sec once the 64-token burst
drained — and those rejections arrive as ECONNRESET (rate-limit drop)
instead of the expected structured 429 Retry-After response from
maxActiveRequests. Net effect: legitimate sustained workloads on
higher-capacity runners would silently lose connections.
Fix mirrors the existing defaultMaxConnections / defaultMaxActiveRequests
scaling shape:
defaultConnectionRateBurst() =
clamp(maxConcurrentJobs * 4, [64, 256])
defaultConnectionRateRefillPerSec() =
clamp(maxConcurrentJobs * 2, [20, 200])
Baseline 64/20 preserved when env is unset (no behavior change for the
common case). Small deployments floor at the baseline (a 4-job runner
gets 64 burst + 20/sec refill, not a tighter 16/8 limit). Large
deployments cap at 256/200 to bound resource use even under extreme
config.
The two helper functions are exported so the unit tests can probe them
directly. 4 new tests cover: unset env (baseline), small concurrency
(floor), mid concurrency (proportional scaling), and high concurrency
(ceiling cap).
15/15 tests pass in Linux.
…#1651)
* perf(codeapi): use node:child_process.spawn for NsJail launches
Bun.spawn() serializes internally on the spawn path under concurrency.
node:child_process.spawn() parallelizes cleanly. Empirical results on
the runner (api container, 8 in-flight, 2000 invocations):
workload Bun.spawn node:spawn speedup
/bin/sh -c "echo $i" 715 ms 315 ms 2.27x
/bin/sh -c "dd urandom | base64" (4KiB) 732 ms 585 ms 1.25x
Real per-job NsJail invocations carry stdout/stderr traffic so the
realistic speedup is ~25%. The 2.3x figure from trivial echo isolates
Bun.spawn's serialization overhead but isn't representative of
production load.
FD lifecycle is equivalent: both methods steady at baseline FD count
across 2000+ spawns, sequential and concurrent (validated independently
while investigating an audit FD-leak claim — ruled out for both spawn
APIs; the audit's EMFILE has another source).
Mechanical changes:
- Switch ReadableStream<Uint8Array> drain to Node Readable.on('data')
event listener; semantics identical (decoder.decode(..., stream:true),
same overflow-and-kill logic).
- Map the Node 'close' event (fires after all stdio closed) to the
previous Bun.spawn `proc.exited` promise. Signal-killed children
return null exitCode in Node; mirror Bun's 128+signum convention so
downstream signal decoding still works.
- Always stdio: ['pipe', 'pipe', 'pipe'] and end() stdin immediately
when no input is supplied — Node's 'ignore' would null out proc.stdin
and force a runtime null-check; ending the pipe sends EOF identically.
- Add a 'error' event listener on the child to log spawn-time errors
(EPIPE on early stdin death, etc.) instead of crashing the runner
via uncaughtException.
Tests (19 existing nsjail tests) all pass — buildArgs is unchanged;
the rendered cfg overlay and seccomp policy are unchanged.
* fix(codeapi): attach nsjail child 'error' listener synchronously
Codex P1 finding on PR #1651: spawn() reports launch failures (ENOENT,
EACCES, EAGAIN) via the ChildProcess's async 'error' event, but the
previous code attached the listener AFTER `await setupGate.runSetup(...)`.
During the setup-gate's log-marker poll (~tens of ms), the error event
can fire with no listener attached, which Node treats as an
uncaughtException and kills the runner process — converting a controlled
"failed to spawn nsjail" error into a hard crash. Reproducible by
pointing config.nsjail_path at a missing or non-executable binary.
Move the .on('error', ...) call INSIDE the spawn lambda so the listener
is attached synchronously the instant spawn() returns, before any await.
The setup-gate semantics (NsJail mount-setup serialization on the host-
shared /tmp/nsjail.0.root) are preserved because the gate holds the
chain during the listener attach too — listener attach is synchronous
and trivially within the gate-held window.
* fix(codeapi): short-circuit nsjail drain + throw on async spawn errors
Codex P1 follow-up on PR #1651: even with the 'error' listener attached
synchronously (prior commit), the spawn-failure path still hung the
execute() call. Empirically confirmed in the api container's Bun
runtime: an ENOENT spawn fires proc:error + proc:close(-2) + stdout:close
+ stderr:close — but NO 'end' events on the streams. drainStream() only
listened for 'end' and 'error', so it waited forever for an EOF that
never came, blocking the whole function past the caller's timeout.
Two changes:
1. drainStream() now resolves on 'end' OR 'close' (idempotent via a
resolved flag). 'close' is the universal "this stream is done"
signal — fires after normal end AND after abrupt destruction on
spawn failure. Verified live: 1ms drain resolution post-fix vs.
indefinite hang before.
2. Capture the async spawn error in `spawnError`, and after the drain
completes, throw a controlled `Failed to spawn nsjail: <errno>:
<message>` — same shape as the sync-failure path's existing throw.
Without this, execute() would return a NsJailResult with code:-2
(Linux negated errno) which downstream callers would interpret as a
sandboxed program exiting with an exotic numeric code.
Cleanup of the per-job cfg/log files happens in both throw paths so the
runner doesn't accumulate /tmp/nsjail-*.cfg / .log files on spawn
failure.
Tests still pass (19/19 in nsjail.test.ts); typecheck clean. The hang
itself isn't unit-testable without a real spawn — verified via a
focused live probe in the api container that exercises the exact
drain/error-event pattern.
* fix(codeapi): abort setup-gate on async spawn failure (kills 1500ms latency tax)
Codex P1 follow-up (round 3) on PR #1651. Two related concerns:
(1) drainStream could miss the stream 'close' event when the gate's
watchdog stalled long enough for close to fire before drainStream's
listeners were attached. Bun lazily re-emits close to late listeners
so the symptom is masked there, but the behavior is runtime-dependent
and not safe to rely on.
(2) Every async spawn failure (ENOENT for missing nsjail_path,
EACCES for non-executable, EAGAIN for fork pressure) paid the FULL
watchdogMs (default 1500ms) while holding the serialized setup gate.
Queue-wide latency amplification: under a misconfigured runner or
fork-pressure burst, every queued job stalled by 1.5s per failed
spawn in front of it.
Fix is layered:
setup-gate: pollForMarker accepts optional AbortSignal, checks
signal.aborted at the top of each iteration, and returns
markerSeen=false immediately when aborted. Public API: runSetup()
takes an optional third arg.
nsjail.ts: create an AbortController whose signal is passed to the
gate. Wire `child.on('error', ...)` AND `child.on('close', ...)`
to abort the controller — both fire async on spawn failure (which
one depends on the runtime / posix_spawn path), aborting on either
guarantees release.
nsjail.ts: short-circuit BEFORE the drain AND the watchdog-warn
branch when spawnError is set, so the throw surfaces the actual
cause (ENOENT etc) and suppresses the misleading "watchdog fired"
warning + metric increment.
nsjail.ts: drainStream now does a synchronous stream.destroyed /
readableEnded check at attach time. Defense in depth — even if the
gate's abort race somehow lets close fire before drainStream
attaches its listeners, the function resolves instead of hanging.
Verified live: simulated production-shape spawn-fail with abort path
returns at t=12ms instead of the previous t=1500ms — 125x improvement.
Tests: 30/30 pass (added 2 new setup-gate tests covering abort
semantics and the queue-no-stall end-to-end assertion).
* fix(codeapi): handle stdin EPIPE + close exitCode-resolution race
Codex round 5 on PR #1651 surfaced two genuine P1 bugs (both
distinct from the round-4 false positives — those are addressed at
line 321 already, see prior commits + rebuttal comment):
(1) Stdin write without 'error' listener crashes the runner. If
nsjail exits or closes stdin before/during our write, the kernel
sends EPIPE on the write end. Node emits this on the writable
stream — NOT on the ChildProcess, so the existing child 'error'
listener does not catch it. Without a stdin-side listener, EPIPE
becomes an uncaughtException that kills the api process.
Verified live in api container: write to stdin of an
already-exited /bin/sh -c "exit 0". Without listener: throws
uncaught. With listener: caught cleanly, write completes without
crash.
(2) proc.once('close') race on signal-killed children. drainStream
waits for stream 'close' events, which Node emits BEFORE
proc 'close'. By the time drainStream resolves, proc may have
ALREADY emitted its own 'close' event — and proc.exitCode stays
null on signal kill, so the old check `if (proc.exitCode !== null)`
misses it. Then `proc.once('close', ...)` attaches AFTER the event
fired and the listener never runs — execute() hangs forever.
The fix: check BOTH proc.exitCode and proc.signalCode in the
synchronous head of the Promise. Node populates both before the
'close' event fires, so a signal-killed child is always caught
on the first iteration (no race window).
Verified live in api container with SIGKILL'd child:
OLD pattern (exitCode-only check + close listener): HUNG at 478ms
NEW pattern (exitCode || signalCode + listener): caught at t=0ms
Triggered in production every time the output-overflow path
proc.kill('SIGKILL')s a child — the old code would have left
execute() pending until the per-job timeout (default 30s), with
the runner spending the time on a wedged JS promise instead of
surfacing the kill status.
Also consolidated the duplicated signal-name-to-number lookup into a
single SIGNAL_NUM_BY_NAME map derived from the module-top SIGNALS
constant, eliminating a maintenance hazard where the two lists
could drift.
Tests: 30/30 pass.
feat: add statsmodels to codeapi packages
* fix(codeapi): support parallel bash ptc calls * test(codeapi): cover parallel bash ptc live flow * fix(codeapi): compact bash ptc replay metadata * fix(codeapi): harden bash ptc background replay * fix(codeapi): broaden bash ptc batch detection * fix(codeapi): avoid global bash functrace for ptc * fix(codeapi): detect backtick bash ptc substitutions * fix(codeapi): force bash ptc emit after cmdsub * test(codeapi): cover bash ptc command substitution * fix(codeapi): narrow bash ptc replay lock * fix(codeapi): make bash replay signatures resilient * fix(codeapi): use bun for mac sandbox healthcheck * fix(codeapi): capture fast nsjail output * fix(codeapi): tighten bash replay matching * fix(codeapi): preserve raw legacy input hashes * fix(codeapi): cover delayed bash tool jobs * fix(codeapi): avoid waiting on non-tool bash jobs * fix(codeapi): ignore nonnumeric bash history keys * fix(codeapi): harden bash replay fallback matching
* feat(synthetic-tests): add codeapi_seccomp_probe synthetic check * fix(synthetic-tests): tighten codeapi_seccomp_probe coverage and AF_VSOCK expectation * fix(codeapi): require SIGSYS for AF_VSOCK seccomp probe * fix(codeapi): add comment after request from Danny --------- Co-authored-by: Danny Avila <danny@librechat.ai>
* fix(codeapi): harden sandbox socket seccomp * test(codeapi): add kernel surface synthetic check * test(codeapi): avoid ambiguous seccomp probes
* fix(codeapi): stabilize mac package builds Batch Bun package installs to avoid local builder memory spikes and keep the sandbox umount hardening enabled except for the macOS direct-run overlay. * fix(codeapi): validate bun package batch size
fix(codeapi): add column CLI to sandbox images Install bsdextrautils in CodeAPI sandbox images so bash scripts can format tabular output with column.
* fix(codeapi): preserve bash ptc sentinel through redirects * fix(codeapi): avoid bash ptc fd collisions * fix(codeapi): emit bash ptc sentinel from parent * fix(codeapi): include file utility in sandbox rootfs
* fix(codeapi): bind manifests to execute body * test(codeapi): cover manifest replay bypass * fix(codeapi): preserve manifest rollout compatibility
fix: suppress codeapi vsock enotconn teardown noise
* fix(codeapi): health check sandbox workspace * fix(codeapi): probe sandbox workspace allocation * fix(codeapi): harden sandbox health probe
fix(codeapi): boot baked sandbox runner from block root
fix: gate sandbox tool-call socket mount
* feat(codeapi): expose BullMQ queue depth metric * fix(codeapi): avoid stale queue depth metrics * fix(codeapi): observe timed-out queue metric promises
* Consolidate CodeAPI execution identity * Fix CodeAPI continuation namespace checks
chore(codeapi): reduce synthetic log volume
Prepares services/codeapi for one-way publishing to the public ClickHouse/code-interpreter repository: - scrub maintainer email from helm chart metadata - rename license -> LICENSE and make sub-README paths standalone-friendly - add services/codeapi/.github/workflows/ci.yml (inert in the monorepo, becomes root CI after subtree split) - add publish-codeapi.yml: subtree-splits services/codeapi on merges to main and force-pushes the export to code-interpreter via deploy key
Bumps [qs](https://github.com/ljharb/qs) to 6.15.2 and updates ancestor dependency [express](https://github.com/expressjs/express). These dependencies need to be updated together. Updates `qs` from 6.14.2 to 6.15.2 - [Changelog](https://github.com/ljharb/qs/blob/main/CHANGELOG.md) - [Commits](ljharb/qs@v6.14.2...v6.15.2) Updates `express` from 4.22.1 to 4.22.2 - [Release notes](https://github.com/expressjs/express/releases) - [Changelog](https://github.com/expressjs/express/blob/v4.22.2/History.md) - [Commits](expressjs/express@v4.22.1...v4.22.2) --- updated-dependencies: - dependency-name: qs dependency-version: 6.15.2 dependency-type: indirect - dependency-name: express dependency-version: 4.22.2 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>
Author
|
OK, I won't notify you again about this release, but will get in touch when a new version is available. If you change your mind, just re-open this PR and I'll resolve any conflicts on it. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Bumps qs to 6.15.2 and updates ancestor dependency express. These dependencies need to be updated together.
Updates
qsfrom 6.14.2 to 6.15.2Changelog
Sourced from qs's changelog.
Commits
9aca407v6.15.25e33d33[Dev Deps] update@ljharb/eslint-config21f80b3[Fix]stringify: skip null/undefined entries inarrayFormat: 'comma'+ `e...a0a81ea[Fix]stringify: use configureddelimiteraftercharsetSentinele3062f7[Fix]stringify: applyformatterto encoded key understrictNullHandling0c180a4[Fix]stringify: skip null/undefined filter-array entries instead of crashi...3a8b94a[Tests] add regression tests for keys containing percent-encoded bracket text96755ab[readme] fix grammara419ce5[Fix]parse: handle nested bracket groups and add regression tests3f5e1c5v6.15.1Updates
expressfrom 4.22.1 to 4.22.2Release notes
Sourced from express's releases.
Changelog
Sourced from express's changelog.
Commits
df0abc94.22.2836d3664.xupdate qs to 6.15.1, body-parser 1.20.5 (#7224)8d09bfefix: restore array parsing for req.query repeated keys (#7181)d39e8addeps: body-parser@~1.20.4 (#7021)efe85d9deps: qs@^6.14.1 (#6972)f62378e📝 add note to historyDependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting
@dependabot rebase.Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
@dependabot rebasewill rebase this PR@dependabot recreatewill recreate this PR, overwriting any edits that have been made to it@dependabot show <dependency name> ignore conditionswill show all of the ignore conditions of the specified dependency@dependabot ignore this major versionwill close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)@dependabot ignore this minor versionwill close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)@dependabot ignore this dependencywill close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)You can disable automated security fix PRs for this repo from the Security Alerts page.