feat: add coding-agent sidecar integrations #63

willkill07 wants to merge 33 commits into NVIDIA:main
Conversation
Signed-off-by: Will Killian <wkillian@nvidia.com>
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in your CodeRabbit settings or use review commands to manage reviews.
Walkthrough

Introduces a new nemo-flow-sidecar crate (server, adapters, gateway, session manager), an installer/launcher CLI, Python llm_call_end annotation/codec support, CI/publishing and Codecov updates, ATIF exporter improvements, extensive tests, docs, and packaged coding-agent integration bundles.

Changes: sidecar crate, API, CI, tests, and docs.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Agent as Coding Agent
    participant Sidecar as Sidecar Server
    participant Upstream as Provider API
    participant Sess as SessionManager
    Agent->>Sidecar: POST /hooks/{agent} (hook JSON)
    Sidecar->>Sess: apply_events(normalized)
    Agent->>Sidecar: POST /v1/chat/completions
    Sidecar->>Sess: start_llm(request, headers)
    Sidecar->>Upstream: forward request
    Upstream-->>Sidecar: response/stream
    Sidecar->>Sess: end_llm(response/meta)
    Sidecar-->>Agent: proxied response/vendor hook reply
```
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
Signed-off-by: Will Killian <wkillian@nvidia.com>
- Added Hermes as a first-class sidecar agent: `--agent hermes`, config support, command inference for `hermes` / `hermes-agent`, and `AgentKind::Hermes`.
- Added a dedicated Hermes hook endpoint: `/hooks/hermes` now routes Hermes shell-hook payloads into the sidecar session manager.
- Added a Hermes adapter: maps Hermes lifecycle, tool, and subagent shell hooks into normalized NeMo Flow sidecar events.
- Added Hermes installer support: `nemo-flow-sidecar install hermes` writes/merges `.hermes/config.yaml` hook config using YAML instead of JSON/TOML.
- Added dynamic sidecar URL handling for Hermes: installed Hermes hooks prefer the runtime `NEMO_FLOW_SIDECAR_URL`, so `nemo-flow-sidecar run --agent hermes` can use an ephemeral sidecar URL without reinstalling hooks.
- Preserved Hermes hook consent semantics: the runner exports the dynamic sidecar URL but does not auto-enable `HERMES_ACCEPT_HOOKS`.
- Improved ATIF LLM request fidelity: ATIF user steps now preserve the full unwrapped LLM request payload under `extra.llm_request`.
- Improved ATIF token metrics extraction: supports provider-native usage objects in addition to NeMo Flow `token_usage`, including cached-token variants from OpenAI/Anthropic-style payloads.
- Expanded shared adapter extraction: adds support for Hermes-friendly payload fields like `task_id`, `parent_session_id`, `extra.tool_call_id`, and `extra.result`.
- Added test coverage: new/updated tests cover Hermes adapter mapping, config parsing, installer generation/merge behavior, runtime launcher behavior, server hook response shape, and ATIF request/usage export.

Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Signed-off-by: Will Killian <wkillian@nvidia.com>
Force-pushed 9594050 to 841d583 (compare)
Signed-off-by: GSD Agent <bbednarski@nvidia.com>
Signed-off-by: Will Killian <wkillian@nvidia.com>
Force-pushed 173bca9 to 1d896bf (compare)
Signed-off-by: Will Killian <wkillian@nvidia.com>
Signed-off-by: Will Killian <2007799+willkill07@users.noreply.github.com>
…egrations' into wkk_coding-agent-sidecar-integrations

Conflicts:
- crates/sidecar/src/model.rs
- crates/sidecar/src/session.rs
- crates/sidecar/tests/coverage/installer_tests.rs
- crates/sidecar/tests/coverage/session_tests.rs
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Force-pushed bf50036 to 7879885 (compare)
…decar-integrations

Signed-off-by: Will Killian <wkillian@nvidia.com>

Conflicts:
- ATTRIBUTIONS-Rust.md
- Cargo.lock
Actionable comments posted: 15
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
codecov.yml (1)
46-67: ⚠️ Potential issue | 🔴 Critical | ⚖️ Poor tradeoff

Duplicate path configuration in coverage components.

The path `crates/sidecar/src` appears in both the `rust_runtime` component (line 51) and the new `sidecar_runtime` component (line 61). This duplication will cause the same source files to be tracked under two separate components, which can lead to confusing coverage reports and potentially inflated metrics. Choose one of the following:

- If the sidecar should be part of the unified Rust runtime component, remove the standalone `sidecar_runtime` component (lines 58-67) and keep only the path addition to `rust_runtime`.
- If the sidecar should be tracked separately, remove `crates/sidecar/src` from the `rust_runtime` paths (line 51) and keep only the `sidecar_runtime` component.

Proposed fix: track the sidecar separately.

```diff
 - component_id: rust_runtime
   name: Rust Runtime
   paths:
     - "crates/core/src"
     - "crates/adaptive/src"
-    - "crates/sidecar/src"
   statuses:
     - type: project
       target: 95%
       threshold: 0.5%
       base: auto
       if_ci_failed: error
 - component_id: sidecar_runtime
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@codecov.yml` around lines 46-67: the coverage config duplicates the path `crates/sidecar/src` in component_id `rust_runtime` and `sidecar_runtime`; to track the sidecar separately, remove `crates/sidecar/src` from the paths array of the `rust_runtime` component (the entry under component_id `rust_runtime`) and leave the `sidecar_runtime` component (component_id `sidecar_runtime`) intact so only `sidecar_runtime` owns that path.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/ci-path-filters.yml:
- Line 15: Remove the 'integrations/coding-agents/**' entry from the shared
section of the CI path filters: locate the 'shared' array and delete the
'integrations/coding-agents/**' line so documentation/config changes don't
trigger shared-component CI; leave the additions under the 'docs' and 'rust'
sections intact.
In `@crates/python/src/py_api/mod.rs`:
- Around line 644-650: The annotated-response decoding is happening before the
sanitized end-path and currently swallows codec errors with .ok(); move the
response codec decoding (py_llm_response_codec /
codec.decode_response(&response_json)) into the sanitized end-path (after
core_llm_api::llm_call_end or after you sanitize response_json) so annotations
are derived from the already-sanitized payload, and stop dropping
errors—propagate decode failures back to Python instead of using .ok() (e.g.,
return Err or ? from py_annotated_llm_response decoding) so callers see codec
decode errors.
In `@crates/sidecar/src/error.rs`:
- Around line 36-50: The match in into_response consumes self causing a
use-after-move when calling self.to_string(); fix by computing let message =
self.to_string() before the match and then match on &self (or match self by
reference) to get the StatusCode variants (StatusCode::BAD_REQUEST, BAD_GATEWAY,
INTERNAL_SERVER_ERROR) without moving; then build the body with
Json(json!({"error":{"message": message, "type":"nemo_flow_sidecar_error"}}))
and return the response.
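The pattern the comment describes can be sketched with plain std types; `SidecarError`, its variants, and the numeric statuses below are assumptions standing in for the crate's actual error enum and axum's `StatusCode`:

```rust
use std::fmt;

// Hypothetical stand-in for the crate's SidecarError.
#[derive(Debug)]
enum SidecarError {
    InvalidPayload(String),
    Upstream(String),
    Internal(String),
}

impl fmt::Display for SidecarError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SidecarError::InvalidPayload(m) => write!(f, "invalid payload: {m}"),
            SidecarError::Upstream(m) => write!(f, "upstream error: {m}"),
            SidecarError::Internal(m) => write!(f, "internal error: {m}"),
        }
    }
}

impl SidecarError {
    // Compute the message first, then match by reference, so `self`
    // is never moved before `to_string` runs.
    fn into_parts(self) -> (u16, String) {
        let message = self.to_string();
        let status = match &self {
            SidecarError::InvalidPayload(_) => 400,
            SidecarError::Upstream(_) => 502,
            SidecarError::Internal(_) => 500,
        };
        (status, message)
    }
}

fn main() {
    let (status, message) = SidecarError::InvalidPayload("bad json".into()).into_parts();
    assert_eq!(status, 400);
    assert!(message.contains("bad json"));
}
```

In the real handler the `(status, message)` pair would feed the `Json(json!({...}))` body; the ordering is the whole fix.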
In `@crates/sidecar/src/gateway.rs`:
- Around line 66-68: The current call axum::body::to_bytes(body, usize::MAX)
reads an unbounded request body; change it to use a fixed maximum (e.g., const
MAX_BODY_BYTES: usize = 10 * 1024 * 1024) and call axum::body::to_bytes(body,
MAX_BODY_BYTES) so body_bytes is limited, keeping the existing map_err to
SidecarError::InvalidPayload(error.to_string()) to translate limit/extract
errors; pick a limit in the 10–50 MiB range and name the constant (e.g.,
MAX_BODY_BYTES) so it’s easy to adjust.
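The bounded-read behavior can be illustrated with std I/O alone; `read_bounded` is a hypothetical helper mirroring what a capped `axum::body::to_bytes(body, MAX_BODY_BYTES)` call enforces:

```rust
use std::io::{self, Read};

const MAX_BODY_BYTES: u64 = 10 * 1024 * 1024; // 10 MiB cap, per the review's suggested range

// Read at most `limit` bytes; fail if the source exceeds it.
fn read_bounded<R: Read>(reader: R, limit: u64) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    // Take one extra byte so "over the limit" is distinguishable from "exactly at it".
    reader.take(limit + 1).read_to_end(&mut buf)?;
    if buf.len() as u64 > limit {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "body exceeds size limit"));
    }
    Ok(buf)
}

fn main() {
    let small = read_bounded(&b"hello"[..], MAX_BODY_BYTES).unwrap();
    assert_eq!(small, b"hello");
    assert!(read_bounded(&[0u8; 32][..], 16).is_err());
}
```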
In `@crates/sidecar/src/installer.rs`:
- Around line 289-296: The current use of
std::fs::read_to_string(...).unwrap_or_default() in the PlannedFile creation
(the call feeding merge_codex_config) silently treats any I/O error as an empty
config; change this to only treat NotFound as an empty string and return the
error for any other failure. Replace the unwrap_or_default behavior so that
before calling merge_codex_config you attempt to read the file and: if Ok(s) use
s, if Err(e) and e.kind() == std::io::ErrorKind::NotFound use String::new(),
otherwise propagate the error (convert to the function's error type) so the
installer fails closed; update the code paths around merge_codex_config and
PlannedFile accordingly.
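The "only NotFound is empty" rule above is a small, self-contained pattern; `read_existing_config` is a hypothetical name sketching what would feed `merge_codex_config`:

```rust
use std::fs;
use std::io;
use std::path::Path;

// Treat only a missing file as an empty existing config; surface every
// other I/O error (permissions, disk faults) so the installer fails closed.
fn read_existing_config(path: &Path) -> io::Result<String> {
    match fs::read_to_string(path) {
        Ok(contents) => Ok(contents),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(String::new()),
        Err(e) => Err(e),
    }
}

fn main() {
    let missing = Path::new("definitely-missing-config.toml");
    assert_eq!(read_existing_config(missing).unwrap(), "");
}
```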
In `@crates/sidecar/src/server.rs`:
- Around line 64-70: The shared HTTP client is created with Client::new() in
router (AppState.http) and lacks timeouts; replace it with a configured
reqwest::Client built via Client::builder() setting connect_timeout and overall
timeout (and optionally read timeout) before .build(), using values from
SidecarConfig (e.g., config.request_timeout or similar) so the shared client
does not hang indefinitely when proxying to upstream LLM providers; update the
router function to construct the client via
Client::builder().connect_timeout(...).timeout(...).build() and assign that to
AppState.http.
In `@crates/sidecar/src/session.rs`:
- Around line 147-154: apply_events is creating a new Session for any unknown
session_id (via sessions.entry(...).or_insert_with(|| Session::new(...))) which
causes terminal-only retries to recreate sessions after cleanup; change the
logic so you do not allocate a new Session when the incoming event is a
terminal/cleanup-only hook (i.e., detect terminal events via the event type or a
helper like event.is_terminal() or similar) — only insert Session::new when the
event is a genuine start/agent-scoped event, otherwise skip/continue and do not
call Session::new or session.apply; adjust the code paths around
sessions.entry/session.apply and any calls to ensure_agent_started so terminal
retries are ignored instead of recreating sessions.
- Around line 274-306: The LLM handle created in start_llm is never recorded in
the session's registry, so end_agent only draining self.llms can miss gateway
LLM events; after creating the ActiveLlm (the value constructed with stack,
handle, session_id, owner_subagent_id) persist it into self.llms (e.g.
insert/push keyed by session/owner or into the same collection end_agent drains)
and ensure the lifecycle is symmetric by removing it when the LLM completes;
update the start_llm function (and the other analogous LLM-start sites that call
llm_call/LlmCallParams) to register the ActiveLlm into self.llms immediately
after construction so end_agent will see and clean up in-flight gateway LLMs.
- Around line 1050-1060: The write_atif function uses an externally controlled
session_id directly in directory.join(format!("{session_id}.atif.json")),
enabling path traversal; sanitize/validate session_id in write_atif: ensure it
contains no path separators ('/' or '\\'), no components like "." or "..", not
empty, and contains only a safe whitelist of filename characters (e.g.,
alphanumerics, '-', '_'); reject or canonicalize any session_id that fails
validation, or fallback to an encoded/sanitized filename (e.g., percent-encode
or use sanitize_filename) before building path and writing; update write_atif to
perform this check/transform on session_id prior to calling directory.join to
guarantee the export cannot escape the configured directory.
- Around line 1147-1159: Filter OpenAI response items by their type before
treating them as tool calls: in the block that obtains response.get("output")
and iterates items (the code that calls push_tool_hint), check each item's
"type" field and only call push_tool_hint for tool-call types (e.g.,
"function_call" or whatever tool-specific type your OpenAI responses use),
mirroring the approach in collect_anthropic_tool_hints(); this prevents non-tool
messages that happen to have ids from generating hints and contaminating pending
hints.
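The path-traversal guard for `write_atif` can be sketched as a whitelist check before `directory.join`; `is_safe_session_id` is a hypothetical helper, not the crate's API:

```rust
// Accept only filename-safe session ids before building the export path,
// so `{session_id}.atif.json` cannot escape the configured directory.
fn is_safe_session_id(id: &str) -> bool {
    // The whitelist excludes '/', '\\', and '.' entirely, which also
    // rules out "." and ".." components.
    !id.is_empty()
        && id
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
}

fn main() {
    assert!(is_safe_session_id("sess-01_abc"));
    assert!(!is_safe_session_id("../../etc/passwd"));
    assert!(!is_safe_session_id("a/b"));
    assert!(!is_safe_session_id(""));
}
```

A rejected id would either abort the export or be percent-encoded, as the comment suggests.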
In `@crates/sidecar/tests/coverage/gateway_tests.rs`:
- Around line 269-326: The test streaming_llm_guard_closes_on_drop has a race
because it sleeps 100ms after dropping StreamingLlmGuard::new and may not wait
for async cleanup; replace the fixed sleep with a robust wait: poll (with a
small sleep/backoff) for the expected side-effect (e.g., existence/contents of
temp.path().join("drop-session.atif.json")) for up to a sane timeout (e.g.,
several seconds) and only then call sessions.apply_events and assert; this
avoids changing production code and makes the test deterministic under CI load.
Include a short comment why polling with a timeout is used.
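The poll-with-timeout idea translates to a small generic helper; `wait_for` is a hypothetical name, shown here with std threads rather than the test's tokio runtime:

```rust
use std::path::Path;
use std::thread;
use std::time::{Duration, Instant};

// Poll for a side effect instead of sleeping a fixed interval: under CI
// load a fixed 100 ms sleep can fire before async cleanup finishes, while
// a bounded poll stays deterministic without slowing the happy path.
fn wait_for<F: Fn() -> bool>(condition: F, timeout: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    while Instant::now() < deadline {
        if condition() {
            return true;
        }
        thread::sleep(Duration::from_millis(10));
    }
    condition()
}

fn main() {
    // Stand-in for waiting on `temp.path().join("drop-session.atif.json")`.
    let start = Instant::now();
    assert!(wait_for(
        || start.elapsed() > Duration::from_millis(50),
        Duration::from_secs(2)
    ));
    // A condition that never becomes true times out and returns false.
    assert!(!wait_for(
        || Path::new("never-created.atif.json").exists(),
        Duration::from_millis(50)
    ));
}
```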
In `@crates/sidecar/tests/coverage/server_tests.rs`:
- Around line 364-370: The spawned axum server task (created with tokio::spawn
in the block using TcpListener::bind and axum::serve) is left running for the
test lifetime; change the spawn to capture the returned tokio::task::JoinHandle
(e.g., let handle = tokio::spawn(...)) and ensure it is aborted/awaited at test
end (or wrap the handle in a small RAII/drop type that calls handle.abort() on
drop) or alternatively use a cancellation mechanism (like a CancellationToken
passed into the axum service) and trigger it before the test exits so the
axum::serve task is cleanly shut down; locate the spawn of axum::serve and
replace it with a cancellable/abortable handle for proper shutdown.
In `@integrations/coding-agents/codex/hooks/hooks.json`:
- Line 1: Add the repo-approved SPDX metadata to the JSON manifest by inserting
the standard SPDX field for JSON configs at the top of hooks.json (e.g., a
top-level "spdx" or "spdxLicense" metadata key containing
"SPDX-License-Identifier": "Apache-2.0" per the project's JSON SPDX convention);
update the manifest in integrations/coding-agents/codex/hooks/hooks.json to
include that SPDX entry (or follow the documented exception path if JSON SPDX
headers are disallowed) so the file explicitly records the Apache-2.0 license.
In `@integrations/coding-agents/cursor/.cursor/hooks.json`:
- Line 1: Add the project SPDX metadata to the top-level of the new Cursor hook
JSON by inserting the project-approved SPDX field (e.g., a top-level "spdx" or
"SPDX-License-Identifier" key) with value "Apache-2.0" so the file
integrations/coding-agents/cursor/.cursor/hooks.json contains explicit SPDX
licensing metadata; ensure the key is the first entry in the JSON object so
tools will detect it and follow the repository's SPDX convention for JSON config
files.
In `@python/tests/test_builtin_codecs.py`:
- Around line 188-269: Add a negative-path test that calls llm.call_end with a
malformed annotation/codec input and asserts observability doesn't silently
degrade: register a subscriber (like in
test_manual_call_end_response_codec_attaches_annotation), invoke llm.call to get
a handle, then call llm.call_end with either an invalid annotated_response
(e.g., a non-dict value) or a response_codec/OpenAIChatCodec() plus a malformed
response payload, and assert that an error event is emitted (kind == "error" and
category == "llm") and/or the end scope event still appears but its
annotated_response is None; use the existing patterns
(subscribers.register/deregister, llm.call, llm.call_end) and name the test to
reflect the failure case.
---
Outside diff comments:
In `@codecov.yml`:
- Around line 46-67: The coverage config duplicates the path
"crates/sidecar/src" in component_id rust_runtime and sidecar_runtime; to track
the sidecar separately, remove "crates/sidecar/src" from the paths array of the
rust_runtime component (the entry under component_id rust_runtime) and leave the
sidecar_runtime component (component_id sidecar_runtime) intact so only
sidecar_runtime owns that path.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: d3c184c5-8103-4b64-b847-abd89d5ffff8
⛔ Files ignored due to path filters (1)
`Cargo.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (59)
- .github/ci-path-filters.yml
- .github/workflows/ci.yaml
- .github/workflows/ci_rust.yml
- .gitlab-ci.yml
- ATTRIBUTIONS-Rust.md
- Cargo.toml
- README.md
- RELEASING.md
- codecov.yml
- crates/core/src/observability/atif.rs
- crates/core/tests/unit/atif_tests.rs
- crates/python/src/py_api/mod.rs
- crates/python/tests/coverage/py_api_coverage_tests.rs
- crates/sidecar/Cargo.toml
- crates/sidecar/src/adapters/claude_code.rs
- crates/sidecar/src/adapters/codex.rs
- crates/sidecar/src/adapters/cursor.rs
- crates/sidecar/src/adapters/hermes.rs
- crates/sidecar/src/adapters/mod.rs
- crates/sidecar/src/config.rs
- crates/sidecar/src/error.rs
- crates/sidecar/src/gateway.rs
- crates/sidecar/src/installer.rs
- crates/sidecar/src/launcher.rs
- crates/sidecar/src/main.rs
- crates/sidecar/src/model.rs
- crates/sidecar/src/server.rs
- crates/sidecar/src/session.rs
- crates/sidecar/tests/cli_tests.rs
- crates/sidecar/tests/coverage/adapters_tests.rs
- crates/sidecar/tests/coverage/config_tests.rs
- crates/sidecar/tests/coverage/gateway_tests.rs
- crates/sidecar/tests/coverage/installer_tests.rs
- crates/sidecar/tests/coverage/launcher_tests.rs
- crates/sidecar/tests/coverage/server_tests.rs
- crates/sidecar/tests/coverage/session_tests.rs
- docs/getting-started/installation.md
- docs/getting-started/rust.md
- docs/index.md
- docs/integrate-frameworks/about.md
- docs/integrate-frameworks/coding-agent-claude-code.md
- docs/integrate-frameworks/coding-agent-codex.md
- docs/integrate-frameworks/coding-agent-cursor.md
- docs/integrate-frameworks/coding-agent-hermes.md
- docs/integrate-frameworks/coding-agent-sidecar.md
- docs/reference/api/rust/index.md
- integrations/coding-agents/README.md
- integrations/coding-agents/claude-code/.claude-plugin/plugin.json
- integrations/coding-agents/claude-code/README.md
- integrations/coding-agents/claude-code/hooks/hooks.json
- integrations/coding-agents/codex/.codex-plugin/plugin.json
- integrations/coding-agents/codex/README.md
- integrations/coding-agents/codex/hooks/hooks.json
- integrations/coding-agents/cursor/.cursor/hooks.json
- integrations/coding-agents/cursor/README.md
- justfile
- python/nemo_flow/_native.pyi
- python/nemo_flow/llm.py
- python/tests/test_builtin_codecs.py
Signed-off-by: Will Killian <wkillian@nvidia.com>
Actionable comments posted: 10
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/sidecar/src/gateway.rs`:
- Around line 503-558: The proxy must also strip any header names listed in the
Connection header; update response_headers to read the "connection" header from
the provided HeaderMap, parse its comma-separated tokens (trim and lowercase),
and skip any header whose name matches one of those tokens in addition to
existing is_hop_by_hop checks; similarly, ensure request-path filters
(should_forward_request_header and should_record_header) consider
Connection-listed names as hop-by-hop (you may pass the request's HeaderMap or a
precomputed set of connection tokens into these checks), and perform comparisons
case-insensitively when converting tokens to HeaderName before excluding them.
- Around line 537-542: The should_record_header function currently allows
sensitive headers like cookie and generic API-key aliases to be recorded; update
should_record_header (and keep its dependence on should_forward_request_header)
to also exclude http::header::COOKIE and to reject common API-key header names
in a case-insensitive way (e.g., "api-key", "apikey", and any other known
aliases you want blocked in addition to "x-api-key" and "anthropic-api-key") so
those values never appear in LlmRequest.headers or downstream exports.
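Both gateway fixes above reduce to name filtering; the sketch below uses plain strings and a `HashSet` rather than `http::HeaderMap`, and the helper names are simplified stand-ins for the crate's `should_record_header` / `should_forward_request_header`:

```rust
use std::collections::HashSet;

// Parse the Connection header's comma-separated tokens into a lowercase
// set; any header named there is hop-by-hop and must be dropped.
fn connection_tokens(value: &str) -> HashSet<String> {
    value
        .split(',')
        .map(|t| t.trim().to_ascii_lowercase())
        .filter(|t| !t.is_empty())
        .collect()
}

const STATIC_HOP_BY_HOP: &[&str] = &[
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
];

// Sensitive names that should never be recorded, per the review.
const NEVER_RECORD: &[&str] = &[
    "cookie", "authorization", "x-api-key", "anthropic-api-key", "api-key", "apikey",
];

fn should_record_header(name: &str, conn: &HashSet<String>) -> bool {
    let lower = name.to_ascii_lowercase();
    !STATIC_HOP_BY_HOP.contains(&lower.as_str())
        && !NEVER_RECORD.contains(&lower.as_str())
        && !conn.contains(&lower)
}

fn main() {
    let conn = connection_tokens("Keep-Alive, X-Custom-Hop");
    assert!(!should_record_header("x-custom-hop", &conn)); // Connection-listed
    assert!(!should_record_header("Cookie", &conn));       // sensitive
    assert!(!should_record_header("API-Key", &conn));      // API-key alias
    assert!(should_record_header("content-type", &conn));
}
```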
In `@crates/sidecar/src/installer.rs`:
- Around line 655-663: The current write_planned_file function truncates the
target in place; change it to write atomically by creating a temp file in the
same directory (use file.path.parent()), write file.contents into that temp
file, flush and persist it, then std::fs::rename the temp over file.path; keep
the existing backup logic (backup_path(&file.path)?) before replacing, and on
success remove the backup if needed; also preserve permissions/metadata by
reading metadata from file.path (if it exists) and applying them to the temp
file before rename, and ensure the temp file is cleaned up on failure.
- Around line 372-440: The generated hook commands break on Windows because
shell_quote() uses POSIX single-quote escaping; update quoting to be
platform-aware: keep the current POSIX single-quote logic for non-Windows, and
on Windows implement proper double-quote escaping (wrap in "..." and escape
inner quotes and special cmd/powershell metacharacters) so arguments like
--sidecar-url, paths and JSON are usable in cmd.exe and PowerShell; change
shell_quote() accordingly and ensure hook_command (and helpers
push_optional_path/push_optional_gateway_mode) continue to call the new
shell_quote signature so all assembled args are quoted per-platform.
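The write-temp-then-rename step of the atomic-write suggestion can be shown with std alone; `write_atomically` is a hypothetical helper, and this sketch omits the backup and permission-preservation handling the comment also requires:

```rust
use std::fs;
use std::io::{self, Write};
use std::path::Path;

// Write to a sibling temp file, flush, then rename over the target so a
// crash mid-write never leaves a truncated config behind. On POSIX,
// rename is atomic when both paths are on the same filesystem.
fn write_atomically(target: &Path, contents: &str) -> io::Result<()> {
    let dir = target.parent().unwrap_or_else(|| Path::new("."));
    // Illustrative fixed temp name; a real installer would use a unique one.
    let tmp = dir.join(".nemo-flow-sidecar.tmp");
    {
        let mut f = fs::File::create(&tmp)?;
        f.write_all(contents.as_bytes())?;
        f.sync_all()?;
    }
    fs::rename(&tmp, target)
}

fn main() {
    let target = std::env::temp_dir().join("atomic-write-demo.toml");
    write_atomically(&target, "key = \"value\"\n").unwrap();
    assert_eq!(fs::read_to_string(&target).unwrap(), "key = \"value\"\n");
    let _ = fs::remove_file(&target);
}
```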
In `@crates/sidecar/src/launcher.rs`:
- Around line 474-479: The product name is mis-capitalized in the Codex provider
config string; update the display name inside codex_sidecar_provider_config by
changing "Nemo Flow OpenAI" to "NeMo Flow OpenAI" in the format string used to
build the provider entry (and search for other occurrences of the same literal
to keep naming consistent in user-facing config). Ensure the change is made in
the codex_sidecar_provider_config function's format(...) call that constructs
model_providers.nemo-flow-openai.
In `@crates/sidecar/src/session.rs`:
- Around line 175-190: The code currently falls back to a constant synthetic id
("gateway-gateway") when headers lack session_id and there isn't exactly one
hook session, causing shared state; change the fallback to generate a unique
per-request synthetic id (e.g., use a UUID or timestamp) when session_id is None
and single_active_session_id(&sessions) returns None, create the Session via
Session::new as now but mark it as synthetic (or track it locally), and after
calling session.start_llm(start).await ensure cleanup: if the session was
created as a synthetic gateway session and there is no hook-owned state left,
remove that entry from the sessions map (or schedule removal) once the ActiveLlm
ends so synthetic sessions don't persist and leak hints; key symbols: session_id
variable, single_active_session_id, Session::new, session.start_llm, and the
sessions (self.inner) map.
- Around line 147-157: The current guard "if event.is_terminal() &&
!sessions.contains_key(&session_id) { continue; }" drops terminal events like
LlmEnded and ToolEnded for new sessions so their post-only hooks never run;
change the condition to only skip teardown-only retry events instead of all
terminal events — e.g. narrow the check to skip only when the event is a
teardown-retry (or explicitly exclude LlmEnded and ToolEnded) so that terminal
Llm/Tool events still create a Session via Session::new(...) and reach
session.apply(...) (which triggers end_hook_llm / end_tool). Ensure you update
the conditional around event.is_terminal() and
sessions.contains_key(&session_id) to allow LlmEnded and ToolEnded to proceed.
In `@integrations/coding-agents/codex/hooks/hooks.json`:
- Around line 3-149: All hook entries currently invoke the nemo-flow-sidecar
binary directly (the "command" values like "nemo-flow-sidecar hook-forward
codex" used in hooks such as SessionStart, UserPromptSubmit, PreToolUse,
PostToolUse, PostToolUseFailure, AfterAgentResponse, AfterAgentThought,
SubagentStart/Stop, Notification, Stop, PreCompact, SessionEnd), which breaks
optionality when the binary is missing; change each hook to call a small
launcher/shim (e.g., nemo-flow-sidecar-launcher or nemo-flow-sidecar-shim) that
attempts to exec the real nemo-flow-sidecar but exits successfully (0) and
no-ops when the binary is not present so Codex hooks preserve native behavior.
Ensure every "command" string in the JSON is updated to the shim name and that
the shim implements the fail-open behavior.
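The fail-open shim's core logic is tiny; since the project ships Rust binaries, a Rust sketch is shown here (the shim name and arguments in the comment are the review's suggestion, and `run_fail_open` is a hypothetical function):

```rust
use std::process::Command;

// Try to run the real sidecar; if the binary is missing or not
// executable, exit 0 so the agent's native hooks keep working.
fn run_fail_open(binary: &str, args: &[&str]) -> i32 {
    match Command::new(binary).args(args).status() {
        Ok(status) => status.code().unwrap_or(0),
        // Spawn failure (binary not found): no-op success.
        Err(_) => 0,
    }
}

fn main() {
    // A binary that certainly does not exist must still "succeed".
    let code = run_fail_open(
        "nemo-flow-sidecar-that-is-not-installed",
        &["hook-forward", "codex"],
    );
    assert_eq!(code, 0);
}
```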
In `@integrations/coding-agents/codex/README.md`:
- Around line 100-101: Update the user-facing provider name to use the correct
product capitalization: change the name value under the
[model_providers.nemo-flow-openai] section (currently "Nemo Flow OpenAI") to
"NeMo Flow OpenAI" so the README/config matches the product branding.
In `@integrations/coding-agents/cursor/.cursor/hooks.json`:
- Around line 3-174: The hooks call the nemo-flow-sidecar binary unconditionally
(every "command": "nemo-flow-sidecar hook-forward cursor"), causing failures
when the binary is missing; create a fail-open launcher (e.g.,
nemo-flow-sidecar-shim) that checks for the real nemo-flow-sidecar and execs it
if present, otherwise exits successfully (and optionally logs a debug message),
then update all hook entries to invoke the shim (replace "nemo-flow-sidecar
hook-forward cursor" with "nemo-flow-sidecar-shim hook-forward cursor") so the
integration remains optional while preserving original behavior when NeMo Flow
is absent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: c1565467-6e9c-42e1-8045-8e55cf54fca9
📒 Files selected for processing (24)
- .github/ci-path-filters.yml
- codecov.yml
- crates/core/src/api/llm.rs
- crates/core/src/codec/traits.rs
- crates/python/src/py_api/mod.rs
- crates/sidecar/src/error.rs
- crates/sidecar/src/gateway.rs
- crates/sidecar/src/installer.rs
- crates/sidecar/src/launcher.rs
- crates/sidecar/src/model.rs
- crates/sidecar/src/server.rs
- crates/sidecar/src/session.rs
- crates/sidecar/tests/coverage/gateway_tests.rs
- crates/sidecar/tests/coverage/launcher_tests.rs
- crates/sidecar/tests/coverage/server_tests.rs
- crates/sidecar/tests/coverage/session_tests.rs
- docs/integrate-frameworks/coding-agent-codex.md
- integrations/coding-agents/README.md
- integrations/coding-agents/codex/.codex-plugin/plugin.json
- integrations/coding-agents/codex/README.md
- integrations/coding-agents/codex/hooks/hooks.json
- integrations/coding-agents/cursor/.cursor/hooks.json
- python/nemo_flow/llm.py
- python/tests/test_builtin_codecs.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (45)
**/{integrations,integration,*-integration}/**
📄 CodeRabbit inference engine (.agents/skills/contribute-integration/SKILL.md)
**/{integrations,integration,*-integration}/**: Keep NeMo Flow optional in framework integrations
Preserve the framework's original behavior when NeMo Flow is absent
Wrap tool and LLM paths at the correct framework boundary
Integration pattern must follow `docs/integrate-frameworks/adding-scopes.md`
Files:
`integrations/coding-agents/codex/hooks/hooks.json`, `integrations/coding-agents/codex/README.md`, `integrations/coding-agents/README.md`
**/*.{rs,py,js,ts,go,sh,toml,yaml,yml,json,md,rst}
📄 CodeRabbit inference engine (AGENTS.md)
Keep SPDX headers on source, docs, scripts, and configuration files. The project is Apache-2.0.
Files:
`integrations/coding-agents/codex/hooks/hooks.json`, `integrations/coding-agents/codex/README.md`, `python/nemo_flow/llm.py`, `integrations/coding-agents/README.md`, `crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`, `crates/sidecar/src/server.rs`, `docs/integrate-frameworks/coding-agent-codex.md`, `codecov.yml`, `crates/sidecar/src/model.rs`, `python/tests/test_builtin_codecs.py`, `crates/python/src/py_api/mod.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`, `crates/sidecar/src/installer.rs`, `crates/sidecar/tests/coverage/session_tests.rs`, `crates/sidecar/src/launcher.rs`, `crates/sidecar/src/error.rs`, `crates/sidecar/tests/coverage/gateway_tests.rs`, `crates/sidecar/tests/coverage/server_tests.rs`, `crates/sidecar/src/gateway.rs`, `crates/sidecar/src/session.rs`
**/*.{md,rst,html,txt}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-brand-terminology.md)
**/*.{md,rst,html,txt}: Always spell `NVIDIA` in all caps. Do not use `Nvidia`, `nvidia`, `nVidia`, `nVIDIA`, or `NV`.
Use `an NVIDIA` before a noun because the name starts with an 'en' sound.
Do not add a registered trademark symbol after `NVIDIA` when referring to the company.
Use trademark symbols with product names only when the document type or legal guidance requires them.
Verify official capitalization, spacing, and hyphenation for product names.
Precede NVIDIA product names with `NVIDIA` on first mention when it is natural and accurate.
Do not rewrite product names for grammar or title-case rules.
Preserve third-party product names according to the owner's spelling.
Include the company name and full model qualifier on first use when it helps identify the model.
Preserve the official capitalization and punctuation of model names.
Use shorter family names only after the full name is established.
Spell out a term on first use and put the acronym in parentheses unless the acronym is widely understood by the intended audience.
Use the acronym on later mentions after it has been defined.
For long documents, reintroduce the full term if readers might lose context.
Form plurals of acronyms with `s`, not an apostrophe, such as `GPUs`.
In headings, common acronyms can remain abbreviated. Spell out the term in the first or second sentence of the body.
Common terms such as `CPU`, `GPU`, `PC`, `API`, and `UI` usually do not need to be spelled out for developer audiences.
Files:
`integrations/coding-agents/codex/README.md`, `integrations/coding-agents/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{md,rst,html}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-brand-terminology.md)
Link the first mention of a product name when the destination helps the reader.
Files:
`integrations/coding-agents/codex/README.md`, `integrations/coding-agents/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{md,rst,txt}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-guide.md)
**/*.{md,rst,txt}: Spell `NVIDIA` in all caps. Do not use `Nvidia`, `nvidia`, or `NV`.
Format commands, code elements, expressions, package names, file names, and paths as inline code.
Use descriptive link text. Avoid raw URLs and weak anchors such as 'here' or 'read more.'
Use title case consistently for technical documentation headings.
Introduce code blocks, lists, tables, and images with complete sentences.
Write procedures as imperative steps. Keep steps parallel and split long procedures into smaller tasks.
Prefer active voice, present tense, short sentences, contractions, and plain English.
Use *can* for possibility and reserve *may* for permission.
Use *after* for temporal relationships instead of *once*.
Prefer *refer to* over *see* when the wording points readers to another resource.
Avoid culture-specific idioms, unnecessary Latinisms, jokes, and marketing exaggeration in technical documentation.
Spell out months in body text, avoid ordinal dates, and use clear time zones.
Spell out whole numbers from zero through nine unless they are technical values, parameters, versions, or UI values.
Use numerals for 10 or greater and include commas in thousands.
Do not add trademark symbols to learning-oriented documentation unless the source, platform, or legal guidance explicitly requires them.
Do not add trademark symbols to NeMo Flow learning documentation by default.
Do not rewrite API names, package names, command flags, or code literals for style reasons.
Files:
`integrations/coding-agents/codex/README.md`, `integrations/coding-agents/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{md,markdown,rst}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-technical-docs.md)
**/*.{md,markdown,rst}: Use title case consistently in technical documentation headings
Avoid quotation marks, ampersands, and exclamation marks in headings
Keep product, event, research, and whitepaper names in their official title case
Use title case for table headers
Do not force social-media sentence case into technical docs
Use monospace formatting for code elements, commands, parameters, package names, and expressions
Use monospace formatting for directories, file names, and paths
Use angle brackets inside monospace for variables inside paths, such as `/home/<username>/.login`
Use quotation marks for error messages and strings in documentation
Use bold formatting for UI buttons, menus, fields, and labels in documentation
Use angle brackets between UI labels for menu paths, such as File > Save As
Use italics for new terms on first use in documentation
Use italics for publication titles in documentation
Use plain text formatting for keyboard shortcuts in documentation
Prefer `[NVIDIA/NeMo](link)` format for GitHub repository references over generic phrases like 'the GitHub repo'
Introduce every code block with a complete sentence
Do not make a code block complete the grammar of the previous sentence
Do not continue a sentence after a code block
Use syntax highlighting when the format supports it for code blocks
Avoid the word 'snippet' unless the surrounding docs already use it as a term of art
Keep inline method, function, and class references consistent with nearby docs, omitting empty parentheses for prose readability when no call is shown
Use descriptive anchor text that matches the destination title when possible for links
Avoid raw URLs in running text in documentation
Avoid generic link anchors such as 'here,' 'this page,' and 'read more' in documentation
Include the acronym in link text if a linked term includes an acronym
Do not link long sentences or multiple sentences in documentation
Avoid links that pull readers away from a procedure unles...
Files:
`integrations/coding-agents/codex/README.md`, `integrations/coding-agents/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{html,md}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Include SPDX license headers in HTML and Markdown files using HTML comment syntax
Files:
`integrations/coding-agents/codex/README.md`, `integrations/coding-agents/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/README.md
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Update relevant crate or package README when that surface changed
Relevant package or crate README.md files must be updated when examples or binding guidance changes
Files:
`integrations/coding-agents/codex/README.md`, `integrations/coding-agents/README.md`
**/*.md
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Run Markdown link checking via `lychee` for `README.md`, `CONTRIBUTING.md`, and `docs/` through pre-commit hooks
Files:
`integrations/coding-agents/codex/README.md`, `integrations/coding-agents/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{md,markdown,py,sh,bash,js,ts,java,cpp,go,rust}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Keep package names, repo references, and build commands current in documentation
Files:
`integrations/coding-agents/codex/README.md`, `python/nemo_flow/llm.py`, `integrations/coding-agents/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`, `python/tests/test_builtin_codecs.py`
**/*.{md,markdown,py,sh,bash}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Keep stable user-facing wrappers at scripts/ root in docs and examples; only point at namespaced helper paths when documenting internal maintenance work
Files:
`integrations/coding-agents/codex/README.md`, `python/nemo_flow/llm.py`, `integrations/coding-agents/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`, `python/tests/test_builtin_codecs.py`
**/*.{md,markdown,py,sh,bash,js,ts,example}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Example commands must match current package names and paths
Files:
`integrations/coding-agents/codex/README.md`, `python/nemo_flow/llm.py`, `integrations/coding-agents/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`, `python/tests/test_builtin_codecs.py`
**/*.{md,txt,rst}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/SKILL.md)
**/*.{md,txt,rst}: Ensure commands, package names, file paths, and APIs in documentation are correct and not stale; flag incorrect or outdated information as blocking issues
Ensure examples and procedures in documentation will execute successfully with current APIs and commands
Use consistent user-facing terminology throughout documentation that matches current repo terminology
Capitalize NVIDIA correctly in all documentation and public-facing text
Format code, commands, paths, and filenames as inline code (monospace) in documentation
Use descriptive anchor text for links instead of bare URLs or weak labels like 'here' in documentation
Prefer active voice, present tense, short sentences, and plain English in documentation
Structure documentation procedures as imperative steps that are easy to scan and not too long for a single sequence
Prefer 'after' instead of 'once' for temporal references in documentation
Use 'can' instead of 'may' when describing possibility (rather than permission) in documentation
Avoid ambiguous numeric dates and ordinal dates in documentation body text
Files:
`integrations/coding-agents/codex/README.md`, `integrations/coding-agents/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
{README.md,docs/index.md,**/README.md}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/SKILL.md)
Update entry-point documentation (README.md, docs/index.md, package READMEs, binding-level source READMEs) whenever public behavior changes
Files:
`integrations/coding-agents/codex/README.md`, `integrations/coding-agents/README.md`
**/*.{py,js,ts,tsx,go,rs,md}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
Format changed files with the language-native formatter before the final lint/test pass
Files:
`integrations/coding-agents/codex/README.md`, `python/nemo_flow/llm.py`, `integrations/coding-agents/README.md`, `crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`, `crates/sidecar/src/server.rs`, `docs/integrate-frameworks/coding-agent-codex.md`, `crates/sidecar/src/model.rs`, `python/tests/test_builtin_codecs.py`, `crates/python/src/py_api/mod.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`, `crates/sidecar/src/installer.rs`, `crates/sidecar/tests/coverage/session_tests.rs`, `crates/sidecar/src/launcher.rs`, `crates/sidecar/src/error.rs`, `crates/sidecar/tests/coverage/gateway_tests.rs`, `crates/sidecar/tests/coverage/server_tests.rs`, `crates/sidecar/src/gateway.rs`, `crates/sidecar/src/session.rs`
**/*.py
📄 CodeRabbit inference engine (.agents/skills/test-python-binding/SKILL.md)
Format changed Python wrapper and test files with `uv run ruff format python`
**/*.py: Use Ruff with rule sets E, F, W, I for Python linting
Use Ruff formatter for Python code with line length 120 and double quotes
Use `ty` for Python type checking
Use snake_case naming convention for Python code
Include SPDX license headers in all Python source files using Python comment syntax
Files:
`python/nemo_flow/llm.py`, `python/tests/test_builtin_codecs.py`
**/*.{rs,py,go,js,ts,tsx}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Use `SONAR_IGNORE_START`/`SONAR_IGNORE_END` markers only for documented false positives that cannot be resolved in code; keep ignored blocks small, add explanatory comments, and require reviewer sign-off
Files:
`python/nemo_flow/llm.py`, `crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`, `crates/sidecar/src/server.rs`, `crates/sidecar/src/model.rs`, `python/tests/test_builtin_codecs.py`, `crates/python/src/py_api/mod.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`, `crates/sidecar/src/installer.rs`, `crates/sidecar/tests/coverage/session_tests.rs`, `crates/sidecar/src/launcher.rs`, `crates/sidecar/src/error.rs`, `crates/sidecar/tests/coverage/gateway_tests.rs`, `crates/sidecar/tests/coverage/server_tests.rs`, `crates/sidecar/src/gateway.rs`, `crates/sidecar/src/session.rs`
python/nemo_flow/**/*.py
📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)
Use `snake_case` naming convention for Python API implementations and wrappers
Files:
python/nemo_flow/llm.py
{docs/**,examples/**,crates/adaptive/**,python/nemo_flow/**,go/nemo_flow/**,**/{example,component}.{ts,tsx,js,rs,py,go}}
📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)
Any new adaptive component kind must have documentation, examples, and binding coverage across all supported languages
Files:
`python/nemo_flow/llm.py`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{js,ts,tsx,jsx,py,rs,go,java,c,cpp,h,cc,cxx,cs,rb,php,swift,kt}
📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)
Changed files must be formatted with the language-native formatter
Files:
`python/nemo_flow/llm.py`, `crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`, `crates/sidecar/src/server.rs`, `crates/sidecar/src/model.rs`, `python/tests/test_builtin_codecs.py`, `crates/python/src/py_api/mod.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`, `crates/sidecar/src/installer.rs`, `crates/sidecar/tests/coverage/session_tests.rs`, `crates/sidecar/src/launcher.rs`, `crates/sidecar/src/error.rs`, `crates/sidecar/tests/coverage/gateway_tests.rs`, `crates/sidecar/tests/coverage/server_tests.rs`, `crates/sidecar/src/gateway.rs`, `crates/sidecar/src/session.rs`
python/**/*.py
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
python/**/*.py: If Python language surface changed, always run Python test target even when Rust core did not change
For Python changes, use `uv run ruff format python` to format Python files
Files:
`python/nemo_flow/llm.py`, `python/tests/test_builtin_codecs.py`
**/*.{rs,py,js,ts,tsx,go}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
During iteration, prefer `uv run pre-commit run --files <changed files...>` for targeted validation
Files:
`python/nemo_flow/llm.py`, `crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`, `crates/sidecar/src/server.rs`, `crates/sidecar/src/model.rs`, `python/tests/test_builtin_codecs.py`, `crates/python/src/py_api/mod.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`, `crates/sidecar/src/installer.rs`, `crates/sidecar/tests/coverage/session_tests.rs`, `crates/sidecar/src/launcher.rs`, `crates/sidecar/src/error.rs`, `crates/sidecar/tests/coverage/gateway_tests.rs`, `crates/sidecar/tests/coverage/server_tests.rs`, `crates/sidecar/src/gateway.rs`, `crates/sidecar/src/session.rs`
python/nemo_flow/**/*
⚙️ CodeRabbit configuration file
python/nemo_flow/**/*: Review Python wrapper changes for typed API consistency, contextvars-based scope isolation, async behavior, and parity with the native extension.
Stubs and runtime implementations should stay aligned.
Files:
python/nemo_flow/llm.py
{.github/**/*.{yml,yaml},*.patch,scripts/**/*,*.sh,*.bat,Dockerfile*}
📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)
Update CI configuration, patch files, and build scripts with new functional identifiers after rename operations
Files:
.github/ci-path-filters.yml
{.github/**,.gitlab-ci.yml,.pre-commit-config.yaml,justfile,scripts/**}
⚙️ CodeRabbit configuration file
{.github/**,.gitlab-ci.yml,.pre-commit-config.yaml,justfile,scripts/**}: Review automation changes for reproducibility, pinned versions where appropriate, secret handling, and consistency with the documented validation matrix.
Pay attention to commands that need generated native artifacts, FFI libraries, or platform-specific environment variables.
Files:
.github/ci-path-filters.yml
**/*.rs
📄 CodeRabbit inference engine (.agents/skills/test-ffi-surface/SKILL.md)
**/*.rs: Run `cargo fmt --all` for FFI work as it is Rust work
Run `just test-rust` for FFI validation
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce warnings-as-errors linting
When Rust files changed as part of Python work, run `cargo fmt --all`, `just test-rust`, and `cargo clippy --workspace --all-targets -- -D warnings`
**/*.rs: Run `cargo fmt --all` for Rust code formatting
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce Rust linting with no warnings
Run `just test-rust` as the shared-runtime build/test wrapper for Rust changes
Use Rust snake_case naming convention for Rust code
**/*.rs: Any Rust change must run `just test-rust`
Any Rust change must run `cargo fmt --all`
Any Rust change must run `cargo clippy --workspace --all-targets -- -D warnings`
**/*.rs: Run `cargo fmt --all` for Rust code formatting when Node changes touch Rust files
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce strict linting when Rust files changed as part of Node work
**/*.rs: Always run `just test-rust` when any Rust code changes
Always run `cargo fmt --all` when any Rust code changes
Always run `cargo clippy --workspace --all-targets -- -D warnings` when any Rust code changes
Files:
`crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`, `crates/sidecar/src/server.rs`, `crates/sidecar/src/model.rs`, `crates/python/src/py_api/mod.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`, `crates/sidecar/src/installer.rs`, `crates/sidecar/tests/coverage/session_tests.rs`, `crates/sidecar/src/launcher.rs`, `crates/sidecar/src/error.rs`, `crates/sidecar/tests/coverage/gateway_tests.rs`, `crates/sidecar/tests/coverage/server_tests.rs`, `crates/sidecar/src/gateway.rs`, `crates/sidecar/src/session.rs`
**/*.{rs,go,js,ts,tsx}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Include SPDX license headers in all Rust, Go, JavaScript, and TypeScript source files using C-style comment syntax
Files:
`crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`, `crates/sidecar/src/server.rs`, `crates/sidecar/src/model.rs`, `crates/python/src/py_api/mod.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`, `crates/sidecar/src/installer.rs`, `crates/sidecar/tests/coverage/session_tests.rs`, `crates/sidecar/src/launcher.rs`, `crates/sidecar/src/error.rs`, `crates/sidecar/tests/coverage/gateway_tests.rs`, `crates/sidecar/tests/coverage/server_tests.rs`, `crates/sidecar/src/gateway.rs`, `crates/sidecar/src/session.rs`
crates/core/src/api/**/*.rs
📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)
crates/core/src/api/**/*.rs: Implement public API changes first in `crates/core/src/api/` and related core modules such as `crates/core/src/api/runtime/`, `crates/core/src/codec/`, or `crates/core/src/json.rs`
Use `snake_case` naming convention for Rust core API implementations
Files:
crates/core/src/api/llm.rs
{crates/core/src/api/tool.rs,crates/core/src/api/llm.rs}
📄 CodeRabbit inference engine (.agents/skills/add-middleware/SKILL.md)
Wire middleware chain execution into the appropriate pipeline stage in `crates/core/src/api/tool.rs` or `crates/core/src/api/llm.rs` by calling the chain method at the correct execution point
Files:
crates/core/src/api/llm.rs
crates/{core,adaptive}/**
📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)
Changes to `crates/core` or `crates/adaptive` must run the full language matrix
Files:
`crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`
{crates/core,crates/adaptive}/**/*.rs
📄 CodeRabbit inference engine (.agents/skills/test-wasm-binding/SKILL.md)
If the change touched shared runtime semantics in `crates/core` or `crates/adaptive`, also use `validate-change`
Files:
`crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`
crates/core/**/*.rs
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
When `crates/core` changes, run the full validation matrix across Rust, Python, Go, Node.js, and WebAssembly
Files:
`crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`
crates/{core,adaptive}/**/*.rs
⚙️ CodeRabbit configuration file
crates/{core,adaptive}/**/*.rs: Review the Rust runtime for async correctness, scope isolation, middleware ordering, and event lifecycle regressions.
Pay close attention to task-local/thread-local scope propagation, callback lifetimes, stream finalization, and root_uuid isolation.
Public API changes should preserve existing behavior unless tests and docs show the intended migration path.
Files:
`crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`
docs/**/*.md
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Run `./scripts/build-docs.sh` for documentation site changes
docs/**/*.md: Relevant getting-started or reference docs must be updated when examples change
Release-policy docs must point to GitHub Releases as the only release-history source of truth
docs/**/*.md: Use title case for headings in technical documentation
Introduce code blocks, tables, and lists with complete lead-in sentences in documentation
Files:
docs/integrate-frameworks/coding-agent-codex.md
{RELEASING.md,CHANGELOG.md,docs/**/*.md}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Keep release-process and release-notes guidance in repo-maintainer docs such as RELEASING.md, not as user-facing docs pages or CHANGELOG.md
Files:
docs/integrate-frameworks/coding-agent-codex.md
docs/integrate-frameworks/**/*.md
📄 CodeRabbit inference engine (.agents/skills/contribute-integration/SKILL.md)
Documentation must be updated if activation or usage of the integration changed
Files:
docs/integrate-frameworks/coding-agent-codex.md
{scripts/*.sh,docs/**/*.md}
📄 CodeRabbit inference engine (.agents/skills/contribute-integration/SKILL.md)
Use root `./scripts/*.sh` commands in docs and contributor guidance as documented, with implementations under `scripts/third-party/`
Files:
docs/integrate-frameworks/coding-agent-codex.md
{README*,CHANGELOG*,docs/**/*.{md,rst,txt},examples/**/*,*.md}
📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)
Update documentation, examples, and getting-started guides with new package/module/crate names after rename operations
Files:
docs/integrate-frameworks/coding-agent-codex.md
{README.md,docs/**/*.md,examples/**/*.{js,ts,py,go,rs}}
📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)
Keep documentation and examples synchronized with current install, import, and build commands
Files:
docs/integrate-frameworks/coding-agent-codex.md
{README.md,CONTRIBUTING.md,docs/**/*.md}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
For docs-only changes, run targeted checks only if commands, package names, or examples changed. Use `just docs` for docs-site builds and `just docs-linkcheck` when links changed
Files:
docs/integrate-frameworks/coding-agent-codex.md
{docs/**,README.md,CONTRIBUTING.md,RELEASING.md,SECURITY.md}
⚙️ CodeRabbit configuration file
{docs/**,README.md,CONTRIBUTING.md,RELEASING.md,SECURITY.md}: Review documentation for technical accuracy against the current API, command correctness, and consistency across language bindings.
Flag stale examples, missing SPDX headers where required, and instructions that no longer match CI or pre-commit behavior.
Files:
docs/integrate-frameworks/coding-agent-codex.md
**/test_*.py
📄 CodeRabbit inference engine (.agents/skills/test-python-binding/SKILL.md)
Run focused `pytest` first when you know the affected area using `pytest -k "<pattern>"`
Files:
python/tests/test_builtin_codecs.py
{crates/**/tests/**,python/tests/**,go/nemo_flow/**/*_test.go}
⚙️ CodeRabbit configuration file
{crates/**/tests/**,python/tests/**,go/nemo_flow/**/*_test.go}: Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant.
Prefer assertions on lifecycle events, scope stacks, middleware ordering, and binding parity over shallow smoke tests.
Files:
`python/tests/test_builtin_codecs.py`, `crates/sidecar/tests/coverage/launcher_tests.rs`, `crates/sidecar/tests/coverage/session_tests.rs`, `crates/sidecar/tests/coverage/gateway_tests.rs`, `crates/sidecar/tests/coverage/server_tests.rs`
crates/python/**/*.rs
📄 CodeRabbit inference engine (.agents/skills/test-python-binding/SKILL.md)
When the native Rust bridge changed, add the Rust crate tests for `nemo-flow-python`
Files:
crates/python/src/py_api/mod.rs
crates/{python,ffi,node,wasm}/**/*
⚙️ CodeRabbit configuration file
crates/{python,ffi,node,wasm}/**/*: Treat binding changes as public API changes. Check for parity with the other language bindings, FFI ownership/lifetime safety,
callback error propagation, stable type conversion, and consistent async/stream semantics.
Flag changes that update one binding without corresponding tests or documentation for the same surface elsewhere.
Files:
crates/python/src/py_api/mod.rs
🪛 LanguageTool
integrations/coding-agents/README.md
[style] ~132-~132: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ay-mode hook-only|passthrough|required` records the expected gateway behavior in sess...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~134-~134: To form a complete sentence, be sure to include a subject.
Context: ... in session metadata. - --fail-closed can be added to generated hook commands whe...
(MISSING_IT_THERE)
🪛 Ruff (0.15.12)
python/tests/test_builtin_codecs.py
[warning] 194-194: Missing return type annotation for private function capture
Add return type annotation: None
(ANN202)
[warning] 241-241: Missing return type annotation for private function capture
Add return type annotation: None
(ANN202)
[warning] 277-277: Missing return type annotation for private function capture
Add return type annotation: None
(ANN202)
[warning] 280-280: Missing return type annotation for private function sanitize_response
(ANN202)
[warning] 280-280: Unused function argument: response
(ARG001)
[warning] 322-322: Missing return type annotation for private function capture
Add return type annotation: None
(ANN202)
🔇 Additional comments (21)
codecov.yml (2)
57-66: Sidecar runtime coverage component is correctly scoped and gated. This matches the existing component policy shape and sets a clear quality gate for `crates/sidecar/src`.
125-125: Explicit sidecar test exclusion is fine for component-focused runtime coverage. This keeps component coverage focused on production sources under `crates/sidecar/src`.
crates/python/src/py_api/mod.rs (2)
632-659: Past review concern resolved. The implementation now delegates `response_codec` to `core_llm_api::llm_call_end`, which performs decoding from the sanitized end-event payload. Decode failures are propagated back to Python as `RuntimeError` after the end event is emitted.
51-98: LGTM! Helper functions `py_llm_response_codec` and `py_annotated_llm_response` cleanly handle:
- Fast-path extraction for built-in codecs (avoiding Python dispatch overhead).
- Fallback wrapper for custom Python codecs.
- JSON deserialization with proper error propagation.
python/tests/test_builtin_codecs.py (3)
318-342: Good error-path coverage. This test validates the documented behavior that codec decode failures:
- Raise `RuntimeError` to the caller.
- Still emit exactly one end event.
- Set `annotated_response` to `None` on the emitted event.
This addresses the previous review request for failure-path assertions.
273-316: LGTM! Validates that `response_codec` decoding operates on the sanitized payload rather than the raw input. The unused `response` parameter in `sanitize_response` is required by the guardrail callback signature.
190-271: LGTM! Tests properly validate:
- `response_codec` produces decoded annotation with usage/response_text.
- Explicit `annotated_response` mapping is accepted and preserved.
Both use proper test isolation with subscriber registration/deregistration.
crates/core/src/codec/traits.rs (1)
65-77: LGTM! Documentation accurately describes the fallible decode semantics and aligns with the `llm_call_end` implementation that emits the end event even when decoding fails.
crates/core/src/api/llm.rs (1)
388-449: LGTM! The implementation correctly:
- Sanitizes the response before codec decoding (line 413).
- Derives annotation from the sanitized payload when `annotated_response` is not provided (lines 419-431).
- Emits the end event regardless of decode outcome.
- Returns decode errors after event emission for caller visibility.
This addresses the previous review concern about pre-sanitized annotation content.
python/nemo_flow/llm.py (1)
134-180: LGTM! The `call_end()` API extension:
- Properly types `annotated_response` as a union of `AnnotatedLLMResponse | Mapping[str, Json] | None`.
- Documents that codec failures raise after event emission.
- Cleanly forwards to native implementation.
integrations/coding-agents/README.md (1)
1-167: LGTM! Documentation is well-structured with proper SPDX header, consistent heading case, code block formatting, and clear procedural guidance. The static analysis hints about successive sentences are false positives for a CLI option list where each item naturally starts with a backtick-quoted flag.
integrations/coding-agents/codex/.codex-plugin/plugin.json (1)
1-32: LGTM! Plugin manifest is well-formed with correct metadata, license, repository references, and hook wiring path.
crates/sidecar/src/server.rs (1)
70-98: LGTM! HTTP client now properly configures connect, request, and read timeouts. Route setup correctly handles both prefixed (`/v1/...`) and non-prefixed paths for OpenAI compatibility. The `expect()` on the builder is acceptable since the configuration is compile-time static.
crates/sidecar/tests/coverage/server_tests.rs (2)
18-33: LGTM!
`TestServer` now properly implements RAII cleanup via `Drop::drop` that aborts the spawned task, addressing the prior review concern about leaked server tasks.
47-471: Comprehensive test coverage. Tests cover all hook endpoints (Codex, Claude Code, Cursor, Hermes), gateway passthrough behavior (JSON forwarding, streaming preservation, header handling), error cases (unreachable upstream, unsupported paths), and models endpoint forwarding.
crates/sidecar/tests/coverage/launcher_tests.rs (2)
8-11: Good use of OnceLock for test serialization. The `current_dir_lock()` pattern correctly serializes tests that mutate the process-global current directory, preventing races between concurrent test execution.
499-524: Integration test properly validates sidecar URL injection. The `run_starts_sidecar_injects_env_and_returns_agent_exit_code` test correctly verifies that the dynamic sidecar URL is injected into the child process environment and that exit codes are preserved.
crates/sidecar/tests/coverage/gateway_tests.rs (2)
321-389: Streaming guard test now uses robust polling. The test correctly uses `wait_for_session_llms_empty` with a 5-second timeout instead of a fixed 100ms sleep, addressing the prior review concern about flaky failures on loaded CI systems.
14-31: Polling helper addresses prior race condition with timeout-based retries. The `wait_for_file_contains` and `wait_for_session_llms_empty` helpers replace fixed sleeps with timeout-based polling (25ms intervals, 5s timeout), making the streaming guard drop test robust under CI load. The let-chains syntax in lines 21–23 is compatible with the project's Rust edition 2024 and requires no feature flags.
crates/sidecar/src/launcher.rs (2)
29-120: Well-structured transparent run orchestration. The separation of `TransparentRun::new` (config resolution, binding), `print_if_requested` (dry-run inspection), and `execute` (live execution) provides clean separation of concerns. The `execute_live_run` function correctly orders: start server → wait health → spawn child → restore state → stop server, preserving child exit status.
458-472: Health polling is reasonable but consider timeout alignment. The health check polls 50 times with 20ms sleeps (~1 second total). The `Client::new()` here has no explicit timeout, relying on system defaults. For consistency with the server's configured timeouts, consider using a short connect timeout on this health-check client. This is minor since health checks are typically fast and the retry loop provides implicit timeout behavior.
```rust
fn response_headers(headers: &HeaderMap) -> HeaderMap {
    let mut output = HeaderMap::new();
    for (name, value) in headers {
        if !is_hop_by_hop(name) {
            output.append(name.clone(), value.clone());
        }
    }
    output
}

// Reconstructs an Axum response from upstream status, filtered headers, and the selected body. All
// builder errors are converted into sidecar HTTP errors rather than panics.
fn build_response(
    status: StatusCode,
    headers: HeaderMap,
    body: Body,
) -> Result<Response<Body>, SidecarError> {
    let mut builder = Response::builder().status(status);
    for (name, value) in &headers {
        builder = builder.header(name, value);
    }
    Ok(builder.body(body)?)
}

// Allows provider request headers through unless they are transport-owned or must be recalculated
// for the forwarded body. Host and content length are intentionally excluded because reqwest sets
// them for the upstream connection.
fn should_forward_request_header(name: &HeaderName) -> bool {
    !is_hop_by_hop(name) && name != http::header::HOST && name != http::header::CONTENT_LENGTH
}

// Allows headers into observability metadata only after removing credentials and provider API keys.
// The forwarding filter runs first so hop-by-hop transport headers are also excluded from recorded
// LLM request attributes.
fn should_record_header(name: &HeaderName) -> bool {
    should_forward_request_header(name)
        && name != http::header::AUTHORIZATION
        && name.as_str() != "x-api-key"
        && name.as_str() != "anthropic-api-key"
}

// Identifies headers that describe a single transport hop and therefore must not be proxied across
// the client-sidecar-upstream boundary.
fn is_hop_by_hop(name: &HeaderName) -> bool {
    matches!(
        name.as_str(),
        "connection"
            | "keep-alive"
            | "proxy-authenticate"
            | "proxy-authorization"
            | "te"
            | "trailer"
            | "transfer-encoding"
            | "upgrade"
    )
}
```
Strip headers named by Connection too.
The proxy only removes the fixed hop-by-hop names. If either side sends `Connection: x-foo`, `x-foo` is still forwarded here, which violates proxy semantics and can leak per-connection metadata or break upstream behavior. Parse the `Connection` tokens from the current header map and exclude those names in both request and response paths.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/src/gateway.rs` around lines 503 - 558, The proxy must also
strip any header names listed in the Connection header; update response_headers
to read the "connection" header from the provided HeaderMap, parse its
comma-separated tokens (trim and lowercase), and skip any header whose name
matches one of those tokens in addition to existing is_hop_by_hop checks;
similarly, ensure request-path filters (should_forward_request_header and
should_record_header) consider Connection-listed names as hop-by-hop (you may
pass the request's HeaderMap or a precomputed set of connection tokens into
these checks), and perform comparisons case-insensitively when converting tokens
to HeaderName before excluding them.
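A minimal std-only sketch of the fix this prompt describes. The helper names `connection_tokens` and `is_hop_by_hop_dynamic` are hypothetical and operate on plain strings rather than the crate's `HeaderName` type, but the token parsing and the combined static-plus-dynamic check are the suggested behavior:

```rust
use std::collections::HashSet;

// Hypothetical helper: collect the comma-separated tokens named by a
// `Connection` header value, trimmed and lowercased, so callers can
// strip those headers in addition to the fixed hop-by-hop set.
fn connection_tokens(value: &str) -> HashSet<String> {
    value
        .split(',')
        .map(|t| t.trim().to_ascii_lowercase())
        .filter(|t| !t.is_empty())
        .collect()
}

// Hypothetical filter combining the fixed hop-by-hop list with the
// dynamic Connection-named set; names are compared case-insensitively.
fn is_hop_by_hop_dynamic(name: &str, connection: &HashSet<String>) -> bool {
    let lower = name.to_ascii_lowercase();
    matches!(
        lower.as_str(),
        "connection"
            | "keep-alive"
            | "proxy-authenticate"
            | "proxy-authorization"
            | "te"
            | "trailer"
            | "transfer-encoding"
            | "upgrade"
    ) || connection.contains(&lower)
}
```

In the gateway, the token set would be computed once per header map and threaded through both the forwarding and recording filters.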
```rust
fn should_record_header(name: &HeaderName) -> bool {
    should_forward_request_header(name)
        && name != http::header::AUTHORIZATION
        && name.as_str() != "x-api-key"
        && name.as_str() != "anthropic-api-key"
}
```
Do not record cookie or generic API-key aliases.
`should_record_header` still allows secrets like `cookie` and `api-key`, so those values can end up in `LlmRequest.headers` and downstream exports. They should be filtered alongside `Authorization`, `x-api-key`, and `anthropic-api-key`.
Proposed fix
fn should_record_header(name: &HeaderName) -> bool {
should_forward_request_header(name)
&& name != http::header::AUTHORIZATION
+ && name != http::header::COOKIE
&& name.as_str() != "x-api-key"
+ && name.as_str() != "api-key"
&& name.as_str() != "anthropic-api-key"
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| fn should_record_header(name: &HeaderName) -> bool { | |
| should_forward_request_header(name) | |
| && name != http::header::AUTHORIZATION | |
| && name.as_str() != "x-api-key" | |
| && name.as_str() != "anthropic-api-key" | |
| } | |
| fn should_record_header(name: &HeaderName) -> bool { | |
| should_forward_request_header(name) | |
| && name != http::header::AUTHORIZATION | |
| && name != http::header::COOKIE | |
| && name.as_str() != "x-api-key" | |
| && name.as_str() != "api-key" | |
| && name.as_str() != "anthropic-api-key" | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/src/gateway.rs` around lines 537 - 542, The
should_record_header function currently allows sensitive headers like cookie and
generic API-key aliases to be recorded; update should_record_header (and keep
its dependence on should_forward_request_header) to also exclude
http::header::COOKIE and to reject common API-key header names in a
case-insensitive way (e.g., "api-key", "apikey", and any other known aliases you
want blocked in addition to "x-api-key" and "anthropic-api-key") so those values
never appear in LlmRequest.headers or downstream exports.
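One way to keep the exclusion list maintainable is a single case-insensitive deny list, sketched below; this is an assumption about shape, and the real `should_record_header` would still gate on `should_forward_request_header` first:

```rust
// Sketch of a case-insensitive deny list for recorded headers; extend the
// array as new secret-bearing header aliases are identified.
fn is_secret_header(name: &str) -> bool {
    const DENY: &[&str] = &[
        "authorization",
        "cookie",
        "x-api-key",
        "api-key",
        "anthropic-api-key",
    ];
    DENY.contains(&name.to_ascii_lowercase().as_str())
}
```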
| fn hook_command(command: &InstallCommand, agent: CodingAgent) -> String { | ||
| let mut args = vec![ | ||
| "nemo-flow-sidecar".to_string(), | ||
| "hook-forward".to_string(), | ||
| agent.as_arg().to_string(), | ||
| "--sidecar-url".to_string(), | ||
| command.sidecar_url.clone(), | ||
| ]; | ||
| push_optional_path(&mut args, "--atif-dir", command.atif_dir.as_deref()); | ||
| push_optional( | ||
| &mut args, | ||
| "--openinference-endpoint", | ||
| command.openinference_endpoint.as_deref(), | ||
| ); | ||
| push_optional(&mut args, "--profile", command.profile.as_deref()); | ||
| push_optional( | ||
| &mut args, | ||
| "--session-metadata", | ||
| command.session_metadata.as_deref(), | ||
| ); | ||
| push_optional( | ||
| &mut args, | ||
| "--plugin-config", | ||
| command.plugin_config.as_deref(), | ||
| ); | ||
| push_optional_gateway_mode(&mut args, command.gateway_mode); | ||
| args.into_iter() | ||
| .map(|arg| shell_quote(&arg)) | ||
| .collect::<Vec<_>>() | ||
| .join(" ") | ||
| } | ||
|
|
||
| // Appends a flag/value pair only when a string option is present, preserving omission semantics in | ||
| // generated hook commands instead of serializing empty values. | ||
| fn push_optional(args: &mut Vec<String>, flag: &str, value: Option<&str>) { | ||
| if let Some(value) = value { | ||
| args.push(flag.to_string()); | ||
| args.push(value.to_string()); | ||
| } | ||
| } | ||
|
|
||
| // Appends optional path flags using display formatting because installed commands are read by a | ||
| // shell, not by Rust path parsers. | ||
| fn push_optional_path(args: &mut Vec<String>, flag: &str, value: Option<&Path>) { | ||
| if let Some(value) = value { | ||
| args.push(flag.to_string()); | ||
| args.push(value.display().to_string()); | ||
| } | ||
| } | ||
|
|
||
| // Serializes the gateway-mode enum into the generated hook-forward command only when explicitly | ||
| // configured, leaving default runtime behavior under the sidecar's normal config resolution. | ||
| fn push_optional_gateway_mode(args: &mut Vec<String>, gateway_mode: Option<GatewayMode>) { | ||
| if let Some(gateway_mode) = gateway_mode { | ||
| args.push("--gateway-mode".to_string()); | ||
| args.push(gateway_mode.as_arg().to_string()); | ||
| } | ||
| } | ||
|
|
||
| // Quotes a shell argument only when necessary. The safe character set is intentionally small so | ||
| // paths and URLs remain readable while whitespace, quotes, and shell metacharacters are protected. | ||
| fn shell_quote(value: &str) -> String { | ||
| if value | ||
| .chars() | ||
| .all(|character| character.is_ascii_alphanumeric() || "-_./:=,".contains(character)) | ||
| { | ||
| value.to_string() | ||
| } else { | ||
| format!("'{}'", value.replace('\'', "'\\''")) |
🧩 Analysis chain
🌐 Web query:
For Claude Code, Codex CLI, Cursor, and Hermes on Windows: are hook commands executed through a POSIX shell, cmd.exe, or PowerShell, and what quoting/escaping syntax is required for arguments containing spaces or quotes?
💡 Result:
Claude Code hooks on Windows: Executed through POSIX shell (bash.exe by default, spawning Git Bash), or PowerShell if specified via "shell": "powershell" field. [1][2] Default bash quoting/escaping: single quotes for strings with spaces, escape internal single quotes with ''' or close/reopen. PowerShell example uses powershell.exe -Command "..." with internal single quotes. [3] Codex CLI on Windows: Primarily PowerShell (native support, sandbox), but issues show cmd.exe /c wrappers and bat shims causing quoting issues for spaces/multiline; use PowerShell quoting (double outer, single inner) or CMD escaping (^" or doubled "). [4][5][6][7] Cursor hooks on Windows: Command hooks run as shell scripts (typically bash via Git Bash or cmd/PowerShell via wrappers like run-hook.cmd to avoid file-open issues); issues report PowerShell injection with Git Bash terminal and execution failures without.cmd wrappers. Bash/POSIX quoting recommended, or explicit.cmd for cmd.exe/PowerShell. [8][9][10] Hermes Agent on Windows: Native Windows not supported; requires WSL2 (POSIX shell). No hook commands mentioned. [11][12]
Citations:
- 1: [Windows] Statusline and hook commands spawn bash.exe without CREATE_NO_WINDOW, causing visible console flash every ~10 seconds anthropics/claude-code#51867
- 2: https://code.claude.com/docs/en/hooks
- 3: https://code.claude.com/docs/en/hooks-guide?8adb0641_page=4&cc61befa_page=2&d7430fcd_page=5&r=0
- 4: Agent intermittently emits PowerShell-only syntax and fragile quoting openai/codex#9581
- 5: Windows/pwsh: apply_patch.bat breaks multiline patch arguments; direct codex.exe --codex-run-as-apply-patch succeeds with the same patch openai/codex#13840
- 6: https://help.openai.com/en/articles/11381614-codex-codex-andsign-in-with-chatgpt
- 7: feat: configurable Windows agent shell (powershell/git-bash) openai/codex#16717
- 8: https://cursor.com/docs/hooks
- 9: Cursor on Windows opens hooks/session-start file instead of executing SessionStart hook obra/superpowers#1449
- 10: Cursor on Windows: SessionStart hook fails when PATH bash is WSL; use run-hook.cmd in hooks-cursor.json obra/superpowers#912
- 11: https://dev.to/rosgluk/hermes-agent-cli-cheat-sheet-commands-flags-and-slash-shortcuts-3pcb
- 12: https://nousresearch-hermes-agent.mintlify.app/introduction
🏁 Script executed:
# Check for Windows-specific handling in installer.rs
grep -n "windows\|Windows\|cmd\|powershell\|platform\|cfg" crates/sidecar/src/installer.rs | head -20
Repository: NVIDIA/NeMo-Flow
Length of output: 174
🏁 Script executed:
# Search for platform-specific code patterns in the codebase
rg -i "windows|cmd\.exe|powershell" --type rs -A 2 -B 2 | head -50
Repository: NVIDIA/NeMo-Flow
Length of output: 87
🏁 Script executed:
# Check if hook_command is used in a context that suggests Windows deployment
rg "hook_command\|hook-forward" --type rs -A 3 -B 3 | head -50
Repository: NVIDIA/NeMo-Flow
Length of output: 87
🏁 Script executed:
# Look for any existing quoting/escaping utilities
rg "quote\|escape" crates/sidecar/src/ --type rs | head -20
Repository: NVIDIA/NeMo-Flow
Length of output: 87
🏁 Script executed:
# Search for Windows/platform-specific code in Rust files
rg -t rust "windows|cfg|platform" crates/sidecar/ -A 2 -B 2 | head -60
Repository: NVIDIA/NeMo-Flow
Length of output: 3170
🏁 Script executed:
# Look for hook_command usage
rg -t rust "hook_command|hook-forward" --max-count 20 -A 3 -B 1
Repository: NVIDIA/NeMo-Flow
Length of output: 13107
🏁 Script executed:
# Check the full context of shell_quote function and its callers
cat crates/sidecar/src/installer.rs | sed -n '420,440p'
Repository: NVIDIA/NeMo-Flow
Length of output: 963
🏁 Script executed:
# Search for any argv or command array building patterns
rg -t rust "args|argv|Vec\<String\>" crates/sidecar/src/installer.rs -B 2 -A 2 | head -40
Repository: NVIDIA/NeMo-Flow
Length of output: 163
Windows hook commands with special characters will fail due to POSIX-only quoting.
The shell_quote function uses POSIX syntax ('...' and '\''), but generated hooks run through different shells on Windows: Git Bash (POSIX), PowerShell, and cmd.exe. Cmd.exe doesn't recognize single quotes and requires ^" or "" escaping; PowerShell uses different quote handling. Any hook command with spaces or quotes in URLs, paths, or JSON—including the --sidecar-url itself—becomes non-executable on Windows systems running cmd.exe or PowerShell.
Since the installer already supports cross-platform home resolution and the codebase includes Windows-specific test paths, platform-aware quoting for shell arguments is needed. Use conditional quoting based on the target environment, or prefer argv-style configuration where the target format permits.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/src/installer.rs` around lines 372 - 440, The generated hook
commands break on Windows because shell_quote() uses POSIX single-quote
escaping; update quoting to be platform-aware: keep the current POSIX
single-quote logic for non-Windows, and on Windows implement proper double-quote
escaping (wrap in "..." and escape inner quotes and special cmd/powershell
metacharacters) so arguments like --sidecar-url, paths and JSON are usable in
cmd.exe and PowerShell; change shell_quote() accordingly and ensure hook_command
(and helpers push_optional_path/push_optional_gateway_mode) continue to call the
new shell_quote signature so all assembled args are quoted per-platform.
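A platform-aware `shell_quote` might look like the sketch below. The POSIX branch mirrors the existing logic; the Windows branch is an assumption and covers only the common convention of wrapping in double quotes and doubling embedded quotes, which cmd.exe accepts but which does not handle every PowerShell edge case:

```rust
// Sketch of platform-aware quoting. The safe-character fast path matches the
// existing shell_quote; the Windows branch is a simplified assumption.
fn quote_posix(value: &str) -> String {
    format!("'{}'", value.replace('\'', "'\\''"))
}

fn quote_windows(value: &str) -> String {
    format!("\"{}\"", value.replace('"', "\"\""))
}

fn shell_quote(value: &str, windows: bool) -> String {
    let safe = value
        .chars()
        .all(|c| c.is_ascii_alphanumeric() || "-_./:=,".contains(c));
    if safe {
        value.to_string()
    } else if windows {
        quote_windows(value)
    } else {
        quote_posix(value)
    }
}
```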
| fn write_planned_file(file: &PlannedFile) -> Result<(), SidecarError> { | ||
| if let Some(parent) = file.path.parent() { | ||
| std::fs::create_dir_all(parent)?; | ||
| } | ||
| if file.path.exists() { | ||
| std::fs::copy(&file.path, backup_path(&file.path)?)?; | ||
| } | ||
| std::fs::write(&file.path, &file.contents)?; | ||
| Ok(()) |
Write planned files atomically.
This backs up the old file, but std::fs::write still truncates the live config in place. A crash or ENOSPC after truncation leaves the user's hook config corrupted until they manually restore the backup. Write to a temp file in the same directory and rename it over the target after the write succeeds.
Proposed fix
fn write_planned_file(file: &PlannedFile) -> Result<(), SidecarError> {
if let Some(parent) = file.path.parent() {
std::fs::create_dir_all(parent)?;
}
if file.path.exists() {
std::fs::copy(&file.path, backup_path(&file.path)?)?;
}
- std::fs::write(&file.path, &file.contents)?;
+ let tmp_path = file.path.with_extension("tmp");
+ std::fs::write(&tmp_path, &file.contents)?;
+ std::fs::rename(&tmp_path, &file.path)?;
Ok(())
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| fn write_planned_file(file: &PlannedFile) -> Result<(), SidecarError> { | |
| if let Some(parent) = file.path.parent() { | |
| std::fs::create_dir_all(parent)?; | |
| } | |
| if file.path.exists() { | |
| std::fs::copy(&file.path, backup_path(&file.path)?)?; | |
| } | |
| std::fs::write(&file.path, &file.contents)?; | |
| Ok(()) | |
| fn write_planned_file(file: &PlannedFile) -> Result<(), SidecarError> { | |
| if let Some(parent) = file.path.parent() { | |
| std::fs::create_dir_all(parent)?; | |
| } | |
| if file.path.exists() { | |
| std::fs::copy(&file.path, backup_path(&file.path)?)?; | |
| } | |
| let tmp_path = file.path.with_extension("tmp"); | |
| std::fs::write(&tmp_path, &file.contents)?; | |
| std::fs::rename(&tmp_path, &file.path)?; | |
| Ok(()) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/src/installer.rs` around lines 655 - 663, The current
write_planned_file function truncates the target in place; change it to write
atomically by creating a temp file in the same directory (use
file.path.parent()), write file.contents into that temp file, flush and persist
it, then std::fs::rename the temp over file.path; keep the existing backup logic
(backup_path(&file.path)?) before replacing, and on success remove the backup if
needed; also preserve permissions/metadata by reading metadata from file.path
(if it exists) and applying them to the temp file before rename, and ensure the
temp file is cleaned up on failure.
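The temp-file-plus-rename pattern can be sketched as below. The `with_extension("tmp")` choice mirrors the proposed fix and could clobber a sibling `.tmp` file; a randomized suffix would be safer in practice:

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

// Sketch of atomic replacement: write the new contents to a sibling temp
// file, flush to disk, then rename over the target in one step.
fn write_atomic(path: &Path, contents: &[u8]) -> std::io::Result<()> {
    let tmp = path.with_extension("tmp");
    {
        let mut file = fs::File::create(&tmp)?;
        file.write_all(contents)?;
        file.sync_all()?; // flush before the rename so a crash cannot truncate
    }
    fs::rename(&tmp, path) // atomic replacement within one filesystem
}
```

A crash mid-write now leaves either the old file or the new one in place, never a truncated hybrid.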
| pub(crate) async fn start_llm( | ||
| &self, | ||
| headers: &HeaderMap, | ||
| start: LlmGatewayStart, | ||
| ) -> Result<ActiveLlm, SidecarError> { | ||
| let mut sessions = self.inner.lock().await; | ||
| let config = self.default_config.session_config_from_headers(headers); | ||
| let session_id = start | ||
| .session_id | ||
| .clone() | ||
| .or_else(|| single_active_session_id(&sessions)) | ||
| .unwrap_or_else(|| format!("{}-gateway", AgentKind::Gateway.as_str())); | ||
| let session = sessions | ||
| .entry(session_id.clone()) | ||
| .or_insert_with(|| Session::new(session_id, AgentKind::Gateway, config)); | ||
| session.start_llm(start).await |
Do Not Reuse One Synthetic Session for All Headerless Gateway Traffic
When session_id is absent and there is not exactly one active hook session, every request falls into the same gateway-gateway bucket. That collapses unrelated traffic into shared correlation state and keeps that synthetic session alive after end_llm, so later requests can inherit stale hints. This fallback needs a per-request synthetic id plus cleanup after the LLM ends when no hook-owned session state exists.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/src/session.rs` around lines 175 - 190, The code currently
falls back to a constant synthetic id ("gateway-gateway") when headers lack
session_id and there isn't exactly one hook session, causing shared state;
change the fallback to generate a unique per-request synthetic id (e.g., use a
UUID or timestamp) when session_id is None and
single_active_session_id(&sessions) returns None, create the Session via
Session::new as now but mark it as synthetic (or track it locally), and after
calling session.start_llm(start).await ensure cleanup: if the session was
created as a synthetic gateway session and there is no hook-owned state left,
remove that entry from the sessions map (or schedule removal) once the ActiveLlm
ends so synthetic sessions don't persist and leak hints; key symbols: session_id
variable, single_active_session_id, Session::new, session.start_llm, and the
sessions (self.inner) map.
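A per-request synthetic id can be generated without pulling in a `uuid` crate; the `gateway-` prefix and counter scheme below are assumptions, not the session manager's actual naming:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Sketch of unique synthetic session ids: a process-wide counter guarantees
// that headerless gateway requests never collapse into one shared bucket.
static SYNTHETIC_COUNTER: AtomicU64 = AtomicU64::new(0);

fn synthetic_session_id() -> String {
    let n = SYNTHETIC_COUNTER.fetch_add(1, Ordering::Relaxed);
    format!("gateway-{}", n)
}
```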
| "hooks": { | ||
| "SessionStart": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "UserPromptSubmit": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "PreToolUse": [ | ||
| { | ||
| "matcher": "*", | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "PostToolUse": [ | ||
| { | ||
| "matcher": "*", | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "PostToolUseFailure": [ | ||
| { | ||
| "matcher": "*", | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "AfterAgentResponse": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "AfterAgentThought": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "SubagentStart": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "SubagentStop": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "Notification": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "Stop": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "PreCompact": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "SessionEnd": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ] |
Fail Open When the Sidecar Binary Is Missing
These hook entries call nemo-flow-sidecar directly, so a machine without that binary now turns ordinary Codex hooks into command failures. That breaks the integration’s optionality contract instead of preserving Codex’s native behavior. Please route these through a shim/launcher that exits successfully when NeMo Flow is unavailable.
As per coding guidelines, **/{integrations,integration,*-integration}/**: Preserve the framework's original behavior when NeMo Flow is absent.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@integrations/coding-agents/codex/hooks/hooks.json` around lines 3 - 149, All
hook entries currently invoke the nemo-flow-sidecar binary directly (the
"command" values like "nemo-flow-sidecar hook-forward codex" used in hooks such
as SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, PostToolUseFailure,
AfterAgentResponse, AfterAgentThought, SubagentStart/Stop, Notification, Stop,
PreCompact, SessionEnd), which breaks optionality when the binary is missing;
change each hook to call a small launcher/shim (e.g., nemo-flow-sidecar-launcher
or nemo-flow-sidecar-shim) that attempts to exec the real nemo-flow-sidecar but
exits successfully (0) and no-ops when the binary is not present so Codex hooks
preserve native behavior. Ensure every "command" string in the JSON is updated
to the shim name and that the shim implements the fail-open behavior.
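The fail-open contract for such a hypothetical shim reduces to one mapping, sketched here: a spawn error (binary missing) maps to exit 0 so the host agent's hooks remain no-ops, while a successful spawn propagates the sidecar's own status:

```rust
// Sketch of the fail-open decision for a hypothetical launcher shim.
fn fail_open_exit_code(spawn_result: Result<Option<i32>, std::io::Error>) -> i32 {
    match spawn_result {
        Ok(code) => code.unwrap_or(0), // propagate the sidecar's own status
        Err(_) => 0,                   // sidecar absent: succeed silently
    }
}
```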
| "hooks": { | ||
| "sessionStart": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "beforeSubmitPrompt": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "preToolUse": [ | ||
| { | ||
| "matcher": "*", | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "beforeShellExecution": [ | ||
| { | ||
| "matcher": "*", | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "beforeMCPExecution": [ | ||
| { | ||
| "matcher": "*", | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "postToolUse": [ | ||
| { | ||
| "matcher": "*", | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "afterShellExecution": [ | ||
| { | ||
| "matcher": "*", | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "afterMCPExecution": [ | ||
| { | ||
| "matcher": "*", | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "subagentStart": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "subagentStop": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "afterAgentResponse": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "afterAgentThought": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "preCompact": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "stop": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "sessionEnd": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ] |
Keep the Cursor Hook Bundle Optional at Runtime
These entries invoke nemo-flow-sidecar unconditionally. If the binary is missing on a synced machine or falls off PATH, Cursor hook execution now fails instead of falling back to Cursor’s normal behavior. Please run this through a fail-open shim/launcher so the integration remains optional.
As per coding guidelines, **/{integrations,integration,*-integration}/**: Preserve the framework's original behavior when NeMo Flow is absent.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@integrations/coding-agents/cursor/.cursor/hooks.json` around lines 3 - 174,
The hooks call the nemo-flow-sidecar binary unconditionally (every "command":
"nemo-flow-sidecar hook-forward cursor"), causing failures when the binary is
missing; create a fail-open launcher (e.g., nemo-flow-sidecar-shim) that checks
for the real nemo-flow-sidecar and execs it if present, otherwise exits
successfully (and optionally logs a debug message), then update all hook entries
to invoke the shim (replace "nemo-flow-sidecar hook-forward cursor" with
"nemo-flow-sidecar-shim hook-forward cursor") so the integration remains
optional while preserving original behavior when NeMo Flow is absent.
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/sidecar/src/launcher.rs`:
- Around line 168-179: The current configured_command function uses
command.split_whitespace(), which drops quotes and corrupts quoted paths/args
(e.g., "C:\Program Files\..."), so replace the whitespace-only split with a
quote-aware parser: call a shell-style splitter (e.g., shell_words::split or
shlex::split) on the command string inside configured_command (refer to the
configured_command function and CodingAgent/AgentConfigs lookup), handle the
Result (return None or log/error on parse failure), and return the parsed
Vec<String> only when non-empty; this preserves quoted executables and arguments
on Windows and wrapper commands.
- Around line 482-496: The current insert_after_agent implementation can place
agent flags after argv[0] when the agent was specified via an explicit --agent
flag but no argv element matches CodingAgent::infer, causing wrappers to be
configured instead of the agent; update insert_after_agent (referencing argv,
agent, and CodingAgent::infer) to first search argv for an explicit --agent
occurrence (both "--agent" followed by a value and "--agent=VALUE" forms) and,
if found, insert the generated args immediately after the agent value token (or
the same token for --agent=VALUE); only if no explicit --agent token is present
fall back to finding the last infer-able agent index (as currently done) and
finally fall back to index 0.
- Around line 108-111: The health probe timeout currently discards any actual
server error by ignoring the result of running_server.stop().await; change the
error handling so that when wait_for_health(sidecar_url).await returns
Err(health_err), you call running_server.stop().await and if that returns
Err(server_err) return server_err (or otherwise wrap/preserve both, preferring
server_err) instead of always returning health_err; reference
RunningSidecar::start, wait_for_health, and running_server.stop().await to
locate where to capture and propagate the server-side error.
In `@crates/sidecar/tests/coverage/launcher_tests.rs`:
- Around line 223-252: The test mutates process cwd and currently restores it
with a trailing std::env::set_current_dir(previous) which fails to run on panic;
introduce a small RAII guard (e.g., CwdGuard) that saves the previous cwd on
creation and restores it in Drop, then replace the manual previous/save/restore
pattern in tests like cursor_patching_can_be_disabled (and the other listed
tests) by creating a CwdGuard after setting temp dir (use current_dir_lock() as
before) so the cwd is always restored even if the test panics or returns early;
ensure the guard is used in the scope where PreparedRun::new and assertions run.
In `@docs/integrate-frameworks/coding-agent-codex.md`:
- Around line 89-90: The provider display name currently uses "Nemo Flow"
casing; update the value for the TOML table [model_providers.nemo-flow-openai]
so the name key uses the official product capitalization "NeMo Flow" (e.g.,
change name = "Nemo Flow OpenAI" to name = "NeMo Flow OpenAI") to ensure
consistent product naming across docs.
- Around line 111-114: The docs currently reference deprecated keys
features.codex_hooks and codex_hooks; update the text so the transparent wrapper
sets the CLI override features.hooks=true (instead of features.codex_hooks) and
the persistent install writes .codex/config.toml with hooks = true (instead of
codex_hooks = true), while still describing that generated hook entries are
merged into .codex/hooks.json; update both occurrences to use the canonical
features.hooks and hooks configuration names and leave the hooks.json behavior
unchanged.
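The quote-aware splitting the launcher finding asks for is what `shell_words::split` or `shlex::split` provides; a minimal illustration, handling double quotes only, shows why whitespace-only splitting corrupts paths like `C:\Program Files\...`:

```rust
// Minimal sketch of quote-aware command splitting. Real code should use a
// crate such as shell_words or shlex; this handles double quotes only.
fn split_command(command: &str) -> Vec<String> {
    let mut words = Vec::new();
    let mut current = String::new();
    let mut in_quotes = false;
    for ch in command.chars() {
        match ch {
            '"' => in_quotes = !in_quotes,
            c if c.is_whitespace() && !in_quotes => {
                if !current.is_empty() {
                    words.push(std::mem::take(&mut current));
                }
            }
            c => current.push(c),
        }
    }
    if !current.is_empty() {
        words.push(current);
    }
    words
}
```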
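The explicit `--agent` search for the `insert_after_agent` finding can be sketched as an index calculation; the function name and return convention here are assumptions:

```rust
// Sketch: find the argv index at which generated flags should be spliced so
// they land after the explicit --agent selection, not after argv[0].
fn insert_index_after_agent(argv: &[String]) -> Option<usize> {
    argv.iter().enumerate().find_map(|(i, arg)| {
        if arg == "--agent" {
            Some((i + 2).min(argv.len())) // after the separate value token
        } else if arg.starts_with("--agent=") {
            Some(i + 1) // after the fused --agent=VALUE token
        } else {
            None
        }
    })
}
```

Only when this returns `None` would the launcher fall back to `CodingAgent::infer` over the argv tokens.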
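The RAII guard suggested for the launcher tests is a standard pattern; a sketch of a `CwdGuard` that restores the previous working directory in `Drop`, so it survives panics and early returns:

```rust
use std::path::{Path, PathBuf};

// RAII guard sketch: the saved working directory is restored on Drop,
// which runs during unwinding when a test assertion panics.
struct CwdGuard {
    previous: PathBuf,
}

impl CwdGuard {
    fn change_to(dir: &Path) -> std::io::Result<CwdGuard> {
        let previous = std::env::current_dir()?;
        std::env::set_current_dir(dir)?;
        Ok(CwdGuard { previous })
    }
}

impl Drop for CwdGuard {
    fn drop(&mut self) {
        // Best effort: ignore failure, since drop cannot return an error.
        let _ = std::env::set_current_dir(&self.previous);
    }
}
```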
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 7d4ca10f-58e2-4750-9c45-2c77d6195b69
📒 Files selected for processing (4)
crates/sidecar/src/launcher.rs
crates/sidecar/tests/coverage/launcher_tests.rs
docs/integrate-frameworks/coding-agent-codex.md
integrations/coding-agents/codex/README.md
📜 Review details
🧰 Additional context used
📓 Path-based instructions (30)
**/*.{md,rst,html,txt}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-brand-terminology.md)
**/*.{md,rst,html,txt}: Always spell `NVIDIA` in all caps. Do not use `Nvidia`, `nvidia`, `nVidia`, `nVIDIA`, or `NV`.
Use `an NVIDIA` before a noun because the name starts with an 'en' sound.
Do not add a registered trademark symbol after `NVIDIA` when referring to the company.
Use trademark symbols with product names only when the document type or legal guidance requires them.
Verify official capitalization, spacing, and hyphenation for product names.
Precede NVIDIA product names with `NVIDIA` on first mention when it is natural and accurate.
Do not rewrite product names for grammar or title-case rules.
Preserve third-party product names according to the owner's spelling.
Include the company name and full model qualifier on first use when it helps identify the model.
Preserve the official capitalization and punctuation of model names.
Use shorter family names only after the full name is established.
Spell out a term on first use and put the acronym in parentheses unless the acronym is widely understood by the intended audience.
Use the acronym on later mentions after it has been defined.
For long documents, reintroduce the full term if readers might lose context.
Form plurals of acronyms with `s`, not an apostrophe, such as `GPUs`.
In headings, common acronyms can remain abbreviated. Spell out the term in the first or second sentence of the body.
Common terms such as `CPU`, `GPU`, `PC`, `API`, and `UI` usually do not need to be spelled out for developer audiences.
Files:
integrations/coding-agents/codex/README.mddocs/integrate-frameworks/coding-agent-codex.md
**/*.{md,rst,html}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-brand-terminology.md)
Link the first mention of a product name when the destination helps the reader.
Files:
integrations/coding-agents/codex/README.mddocs/integrate-frameworks/coding-agent-codex.md
**/*.{md,rst,txt}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-guide.md)
**/*.{md,rst,txt}: Spell `NVIDIA` in all caps. Do not use `Nvidia`, `nvidia`, or `NV`.
Format commands, code elements, expressions, package names, file names, and paths as inline code.
Use descriptive link text. Avoid raw URLs and weak anchors such as 'here' or 'read more.'
Use title case consistently for technical documentation headings.
Introduce code blocks, lists, tables, and images with complete sentences.
Write procedures as imperative steps. Keep steps parallel and split long procedures into smaller tasks.
Prefer active voice, present tense, short sentences, contractions, and plain English.
Use `can` for possibility and reserve `may` for permission.
Use `after` for temporal relationships instead of `once`.
Prefer `refer to` over `see` when the wording points readers to another resource.
Avoid culture-specific idioms, unnecessary Latinisms, jokes, and marketing exaggeration in technical documentation.
Spell out months in body text, avoid ordinal dates, and use clear time zones.
Spell out whole numbers from zero through nine unless they are technical values, parameters, versions, or UI values.
Use numerals for 10 or greater and include commas in thousands.
Do not add trademark symbols to learning-oriented documentation unless the source, platform, or legal guidance explicitly requires them.
Do not add trademark symbols to NeMo Flow learning documentation by default.
Do not rewrite API names, package names, command flags, or code literals for style reasons.
Files:
`integrations/coding-agents/codex/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{md,markdown,rst}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-technical-docs.md)
**/*.{md,markdown,rst}: Use title case consistently in technical documentation headings
Avoid quotation marks, ampersands, and exclamation marks in headings
Keep product, event, research, and whitepaper names in their official title case
Use title case for table headers
Do not force social-media sentence case into technical docs
Use monospace formatting for code elements, commands, parameters, package names, and expressions
Use monospace formatting for directories, file names, and paths
Use angle brackets inside monospace for variables inside paths, such as `/home/<username>/.login`
Use quotation marks for error messages and strings in documentation
Use bold formatting for UI buttons, menus, fields, and labels in documentation
Use angle brackets between UI labels for menu paths, such as File > Save As
Use italics for new terms on first use in documentation
Use italics for publication titles in documentation
Use plain text formatting for keyboard shortcuts in documentation
Prefer `[NVIDIA/NeMo](link)` format for GitHub repository references over generic phrases like 'the GitHub repo'
Introduce every code block with a complete sentence
Do not make a code block complete the grammar of the previous sentence
Do not continue a sentence after a code block
Use syntax highlighting when the format supports it for code blocks
Avoid the word 'snippet' unless the surrounding docs already use it as a term of art
Keep inline method, function, and class references consistent with nearby docs, omitting empty parentheses for prose readability when no call is shown
Use descriptive anchor text that matches the destination title when possible for links
Avoid raw URLs in running text in documentation
Avoid generic link anchors such as 'here,' 'this page,' and 'read more' in documentation
Include the acronym in link text if a linked term includes an acronym
Do not link long sentences or multiple sentences in documentation
Avoid links that pull readers away from a procedure unles...
Files:
`integrations/coding-agents/codex/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{html,md}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Include SPDX license headers in HTML and Markdown files using HTML comment syntax
Files:
`integrations/coding-agents/codex/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/README.md
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Update relevant crate or package README when that surface changed
Relevant package or crate README.md files must be updated when examples or binding guidance changes
Files:
integrations/coding-agents/codex/README.md
**/*.md
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Run Markdown link checking via `lychee` for `README.md`, `CONTRIBUTING.md`, and `docs/` through pre-commit hooks
Files:
`integrations/coding-agents/codex/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{md,markdown,py,sh,bash,js,ts,java,cpp,go,rust}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Keep package names, repo references, and build commands current in documentation
Files:
`integrations/coding-agents/codex/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{md,markdown,py,sh,bash}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Keep stable user-facing wrappers at scripts/ root in docs and examples; only point at namespaced helper paths when documenting internal maintenance work
Files:
`integrations/coding-agents/codex/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{md,markdown,py,sh,bash,js,ts,example}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Example commands must match current package names and paths
Files:
`integrations/coding-agents/codex/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/{integrations,integration,*-integration}/**
📄 CodeRabbit inference engine (.agents/skills/contribute-integration/SKILL.md)
**/{integrations,integration,*-integration}/**: Keep NeMo Flow optional in framework integrations
Preserve the framework's original behavior when NeMo Flow is absent
Wrap tool and LLM paths at the correct framework boundary
Integration pattern must follow `docs/integrate-frameworks/adding-scopes.md`
Files:
integrations/coding-agents/codex/README.md
**/*.{md,txt,rst}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/SKILL.md)
**/*.{md,txt,rst}: Ensure commands, package names, file paths, and APIs in documentation are correct and not stale; flag incorrect or outdated information as blocking issues
Ensure examples and procedures in documentation will execute successfully with current APIs and commands
Use consistent user-facing terminology throughout documentation that matches current repo terminology
Capitalize NVIDIA correctly in all documentation and public-facing text
Format code, commands, paths, and filenames as inline code (monospace) in documentation
Use descriptive anchor text for links instead of bare URLs or weak labels like 'here' in documentation
Prefer active voice, present tense, short sentences, and plain English in documentation
Structure documentation procedures as imperative steps that are easy to scan and not too long for a single sequence
Prefer 'after' instead of 'once' for temporal references in documentation
Use 'can' instead of 'may' when describing possibility (rather than permission) in documentation
Avoid ambiguous numeric dates and ordinal dates in documentation body text
Files:
`integrations/coding-agents/codex/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
{README.md,docs/index.md,**/README.md}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/SKILL.md)
Update entry-point documentation (README.md, docs/index.md, package READMEs, binding-level source READMEs) whenever public behavior changes
Files:
integrations/coding-agents/codex/README.md
**/*.{rs,py,js,ts,go,sh,toml,yaml,yml,json,md,rst}
📄 CodeRabbit inference engine (AGENTS.md)
Keep SPDX headers on source, docs, scripts, and configuration files. The project is Apache-2.0.
Files:
`integrations/coding-agents/codex/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`, `crates/sidecar/src/launcher.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`
**/*.{py,js,ts,tsx,go,rs,md}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
Format changed files with the language-native formatter before the final lint/test pass
Files:
`integrations/coding-agents/codex/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`, `crates/sidecar/src/launcher.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`
docs/**/*.md
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Run `./scripts/build-docs.sh` for documentation site changes
docs/**/*.md: Relevant getting-started or reference docs must be updated when examples change
Release-policy docs must point to GitHub Releases as the only release-history source of truth
docs/**/*.md: Use title case for headings in technical documentation
Introduce code blocks, tables, and lists with complete lead-in sentences in documentation
Files:
docs/integrate-frameworks/coding-agent-codex.md
{RELEASING.md,CHANGELOG.md,docs/**/*.md}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Keep release-process and release-notes guidance in repo-maintainer docs such as RELEASING.md, not as user-facing docs pages or CHANGELOG.md
Files:
docs/integrate-frameworks/coding-agent-codex.md
docs/integrate-frameworks/**/*.md
📄 CodeRabbit inference engine (.agents/skills/contribute-integration/SKILL.md)
Documentation must be updated if activation or usage of the integration changed
Files:
docs/integrate-frameworks/coding-agent-codex.md
{scripts/*.sh,docs/**/*.md}
📄 CodeRabbit inference engine (.agents/skills/contribute-integration/SKILL.md)
Use root `./scripts/*.sh` commands in docs and contributor guidance as documented, with implementations under `scripts/third-party/`
Files:
docs/integrate-frameworks/coding-agent-codex.md
{docs/**,examples/**,crates/adaptive/**,python/nemo_flow/**,go/nemo_flow/**,**/{example,component}.{ts,tsx,js,rs,py,go}}
📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)
Any new adaptive component kind must have documentation, examples, and binding coverage across all supported languages
Files:
docs/integrate-frameworks/coding-agent-codex.md
{README*,CHANGELOG*,docs/**/*.{md,rst,txt},examples/**/*,*.md}
📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)
Update documentation, examples, and getting-started guides with new package/module/crate names after rename operations
Files:
docs/integrate-frameworks/coding-agent-codex.md
{README.md,docs/**/*.md,examples/**/*.{js,ts,py,go,rs}}
📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)
Keep documentation and examples synchronized with current install, import, and build commands
Files:
docs/integrate-frameworks/coding-agent-codex.md
{README.md,CONTRIBUTING.md,docs/**/*.md}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
For docs-only changes, run targeted checks only if commands, package names, or examples changed. Use `just docs` for docs-site builds and `just docs-linkcheck` when links changed
Files:
docs/integrate-frameworks/coding-agent-codex.md
{docs/**,README.md,CONTRIBUTING.md,RELEASING.md,SECURITY.md}
⚙️ CodeRabbit configuration file
{docs/**,README.md,CONTRIBUTING.md,RELEASING.md,SECURITY.md}: Review documentation for technical accuracy against the current API, command correctness, and consistency across language bindings.
Flag stale examples, missing SPDX headers where required, and instructions that no longer match CI or pre-commit behavior.
Files:
docs/integrate-frameworks/coding-agent-codex.md
**/*.rs
📄 CodeRabbit inference engine (.agents/skills/test-ffi-surface/SKILL.md)
**/*.rs: Run `cargo fmt --all` for FFI work as it is Rust work
Run `just test-rust` for FFI validation
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce warnings-as-errors linting
When Rust files changed as part of Python work, run `cargo fmt --all`, `just test-rust`, and `cargo clippy --workspace --all-targets -- -D warnings`
**/*.rs: Run `cargo fmt --all` for Rust code formatting
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce Rust linting with no warnings
Run `just test-rust` as the shared-runtime build/test wrapper for Rust changes
Use Rust snake_case naming convention for Rust code
**/*.rs: Any Rust change must run `just test-rust`
Any Rust change must run `cargo fmt --all`
Any Rust change must run `cargo clippy --workspace --all-targets -- -D warnings`
**/*.rs: Run `cargo fmt --all` for Rust code formatting when Node changes touch Rust files
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce strict linting when Rust files changed as part of Node work
**/*.rs: Always run `just test-rust` when any Rust code changes
Always run `cargo fmt --all` when any Rust code changes
Always run `cargo clippy --workspace --all-targets -- -D warnings` when any Rust code changes
Files:
`crates/sidecar/src/launcher.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`
**/*.{rs,go,js,ts,tsx}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Include SPDX license headers in all Rust, Go, JavaScript, and TypeScript source files using C-style comment syntax
Files:
`crates/sidecar/src/launcher.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`
**/*.{rs,py,go,js,ts,tsx}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Use `SONAR_IGNORE_START`/`SONAR_IGNORE_END` markers only for documented false positives that cannot be resolved in code; keep ignored blocks small, add explanatory comments, and require reviewer sign-off
Files:
`crates/sidecar/src/launcher.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`
**/*.{js,ts,tsx,jsx,py,rs,go,java,c,cpp,h,cc,cxx,cs,rb,php,swift,kt}
📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)
Changed files must be formatted with the language-native formatter
Files:
`crates/sidecar/src/launcher.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`
**/*.{rs,py,js,ts,tsx,go}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
During iteration, prefer `uv run pre-commit run --files <changed files...>` for targeted validation
Files:
`crates/sidecar/src/launcher.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`
{crates/**/tests/**,python/tests/**,go/nemo_flow/**/*_test.go}
⚙️ CodeRabbit configuration file
{crates/**/tests/**,python/tests/**,go/nemo_flow/**/*_test.go}: Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant.
Prefer assertions on lifecycle events, scope stacks, middleware ordering, and binding parity over shallow smoke tests.
Files:
crates/sidecar/tests/coverage/launcher_tests.rs
🔇 Additional comments (2)
integrations/coding-agents/codex/README.md (2)
103-104: Use Official Product Capitalization in Provider Name
The user-facing provider label should use `NeMo Flow`, not `Nemo Flow`.
Suggested doc fix:
```diff
-name = "Nemo Flow OpenAI"
+name = "NeMo Flow OpenAI"
```
As per coding guidelines, "Verify official capitalization, spacing, and hyphenation for product names."
33-35: ⚡ Quick win
Documentation is correct as written.
The configuration key documented in lines 34–35 is `codex_hooks = true` under the `[features]` table in `.codex/config.toml`, which matches the canonical syntax for codex-cli >= 0.129.0. The suggested change to `features.hooks = true` is incorrect; that key does not exist.
```rust
let running_server = RunningSidecar::start(listener, sidecar_config);
if let Err(error) = wait_for_health(sidecar_url).await {
    let _ = running_server.stop().await;
    return Err(error);
```
There was a problem hiding this comment.
Surface the real sidecar startup failure.
If `serve_listener` exits immediately, the health probe times out and `let _ = running_server.stop().await;` discards the actual server error. That turns startup/config failures into a misleading readiness timeout.
Suggested fix:
```diff
 let running_server = RunningSidecar::start(listener, sidecar_config);
 if let Err(error) = wait_for_health(sidecar_url).await {
-    let _ = running_server.stop().await;
-    return Err(error);
+    return match running_server.stop().await {
+        Ok(()) => Err(error),
+        Err(server_error) => Err(SidecarError::Launch(format!(
+            "{server_error}; health check also failed: {error}"
+        ))),
+    };
 }
```
🤖 Prompt for AI Agents
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/src/launcher.rs` around lines 108 - 111, The health probe
timeout currently discards any actual server error by ignoring the result of
running_server.stop().await; change the error handling so that when
wait_for_health(sidecar_url).await returns Err(health_err), you call
running_server.stop().await and if that returns Err(server_err) return
server_err (or otherwise wrap/preserve both, preferring server_err) instead of
always returning health_err; reference RunningSidecar::start, wait_for_health,
and running_server.stop().await to locate where to capture and propagate the
server-side error.
```rust
// Splits a configured command string into argv words for run mode. This intentionally uses simple
// whitespace splitting because config command values are a convenience fallback; complex shell
// commands should be passed after `--` by the caller.
fn configured_command(agent: CodingAgent, agents: &AgentConfigs) -> Option<Vec<String>> {
    let command = match agent {
        CodingAgent::ClaudeCode => agents.claude_code.command.as_ref(),
        CodingAgent::Codex => agents.codex.command.as_ref(),
        CodingAgent::Cursor => agents.cursor.command.as_ref(),
        CodingAgent::Hermes => agents.hermes.command.as_ref(),
    }?;
    let argv: Vec<_> = command.split_whitespace().map(ToOwned::to_owned).collect();
    (!argv.is_empty()).then_some(argv)
}
```
There was a problem hiding this comment.
`split_whitespace()` breaks configured commands.
Quoted executables and arguments are corrupted here, so a configured command like `"C:\Program Files\Codex\codex.exe" --full-auto` becomes invalid argv. That makes the configured-command fallback unreliable on Windows and for wrapper commands.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/src/launcher.rs` around lines 168 - 179, The current
configured_command function uses command.split_whitespace(), which drops quotes
and corrupts quoted paths/args (e.g., "C:\Program Files\..."), so replace the
whitespace-only split with a quote-aware parser: call a shell-style splitter
(e.g., shell_words::split or shlex::split) on the command string inside
configured_command (refer to the configured_command function and
CodingAgent/AgentConfigs lookup), handle the Result (return None or log/error on
parse failure), and return the parsed Vec<String> only when non-empty; this
preserves quoted executables and arguments on Windows and wrapper commands.
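The fix above points at a shell-style splitter such as `shell_words::split` or `shlex::split`. As a dependency-free illustration of the behavior those crates provide, the following sketch handles double quotes only; the function name `split_command` and its unbalanced-quote handling are assumptions for illustration, not the crate API:

```rust
// Minimal quote-aware splitter: keeps whitespace inside double quotes
// together, flushes a token on unquoted whitespace, and rejects input
// with an unbalanced quote instead of guessing.
fn split_command(input: &str) -> Option<Vec<String>> {
    let mut argv = Vec::new();
    let mut current = String::new();
    let mut in_quotes = false;
    for ch in input.chars() {
        match ch {
            '"' => in_quotes = !in_quotes,
            c if c.is_whitespace() && !in_quotes => {
                if !current.is_empty() {
                    argv.push(std::mem::take(&mut current));
                }
            }
            c => current.push(c),
        }
    }
    if in_quotes {
        return None; // unbalanced quote: treat as a parse failure
    }
    if !current.is_empty() {
        argv.push(current);
    }
    (!argv.is_empty()).then_some(argv)
}

fn main() {
    let argv = split_command(r#""C:\Program Files\Codex\codex.exe" --full-auto"#).unwrap();
    assert_eq!(argv, vec![r"C:\Program Files\Codex\codex.exe", "--full-auto"]);
}
```

A real fix would likely reach for `shell_words` rather than hand-rolling this, but the shape of the return value matches the existing `Option<Vec<String>>` contract.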
```rust
// Inserts generated agent flags immediately after the last argv element that looks like the agent
// executable. Falling back to index 0 keeps wrapper commands usable by inserting after the first
// word when the agent cannot be found later in argv.
fn insert_after_agent(
    argv: &mut Vec<String>,
    agent: CodingAgent,
    args: impl IntoIterator<Item = String>,
) {
    let index = argv
        .iter()
        .enumerate()
        .filter_map(|(index, arg)| (CodingAgent::infer(arg) == Some(agent)).then_some(index))
        .next_back()
        .unwrap_or(0);
    argv.splice(index + 1..index + 1, args);
}
```
There was a problem hiding this comment.
Opaque wrappers do not actually receive agent flags correctly.
When `--agent` is explicit but no argv element is recognizable as that agent, this inserts Codex/Claude flags after `argv[0]`. That means `run --agent codex -- my-wrapper ...` configures the wrapper process, not Codex, unless the wrapper forwards arbitrary argv unchanged. The contract in the preceding comment is stronger than what this implementation can guarantee.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/src/launcher.rs` around lines 482 - 496, The current
insert_after_agent implementation can place agent flags after argv[0] when the
agent was specified via an explicit --agent flag but no argv element matches
CodingAgent::infer, causing wrappers to be configured instead of the agent;
update insert_after_agent (referencing argv, agent, and CodingAgent::infer) to
first search argv for an explicit --agent occurrence (both "--agent" followed by
a value and "--agent=VALUE" forms) and, if found, insert the generated args
immediately after the agent value token (or the same token for --agent=VALUE);
only if no explicit --agent token is present fall back to finding the last
infer-able agent index (as currently done) and finally fall back to index 0.
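The precedence described in the prompt can be sketched as a standalone function. Here `infer_agent` is a hypothetical stand-in for `CodingAgent::infer`, and the simplified inference table exists only to make the example self-contained:

```rust
// Hypothetical stand-in for CodingAgent::infer: maps an argv word to an agent id.
fn infer_agent(arg: &str) -> Option<&'static str> {
    match arg {
        "codex" | "codex.exe" => Some("codex"),
        "claude" => Some("claude-code"),
        _ => None,
    }
}

// Returns the index at which generated flags should be spliced in:
// 1. right after the value of an explicit `--agent <value>` or `--agent=<value>`,
// 2. else after the last argv word that infers to the agent,
// 3. else after the first word (the current fallback).
fn insertion_index(argv: &[String], agent: &str) -> usize {
    for (i, arg) in argv.iter().enumerate() {
        if arg == "--agent" && argv.get(i + 1).map(String::as_str) == Some(agent) {
            return i + 2;
        }
        if arg.strip_prefix("--agent=") == Some(agent) {
            return i + 1;
        }
    }
    argv.iter()
        .rposition(|arg| infer_agent(arg) == Some(agent))
        .map_or(1, |i| i + 1)
}

fn main() {
    let argv: Vec<String> = ["my-wrapper", "--agent", "codex", "--", "run"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    // Flags land after the explicit `--agent codex` pair, not after the wrapper.
    assert_eq!(insertion_index(&argv, "codex"), 3);
}
```

Whether flags placed there actually reach the agent still depends on the wrapper forwarding argv, so this narrows the gap rather than closing it.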
```rust
#[test]
fn cursor_patching_can_be_disabled() {
    let _guard = current_dir_lock().lock().unwrap();
    let temp = tempfile::tempdir().unwrap();
    let previous = std::env::current_dir().unwrap();
    std::env::set_current_dir(temp.path()).unwrap();
    let resolved = ResolvedConfig {
        sidecar: SidecarConfig::default(),
        agents: AgentConfigs {
            cursor: CursorAgentConfig {
                command: None,
                patch_restore_hooks: false,
            },
            ..AgentConfigs::default()
        },
    };

    let prepared = PreparedRun::new(
        CodingAgent::Cursor,
        vec!["cursor-agent".into()],
        "http://s",
        &resolved,
        false,
    )
    .unwrap();

    assert!(prepared.cursor_restore.is_none());
    assert!(!Path::new(".cursor/hooks.json").exists());
    std::env::set_current_dir(previous).unwrap();
}
```
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Restore cwd with Drop, not a trailing statement.
These tests mutate process-global cwd and only reset it on the happy path. Any panic before the final `set_current_dir(previous)` leaves later tests running in the wrong directory.
Suggested pattern:
```diff
+struct CwdGuard(PathBuf);
+
+impl Drop for CwdGuard {
+    fn drop(&mut self) {
+        let _ = std::env::set_current_dir(&self.0);
+    }
+}
+
 #[test]
 fn cursor_patching_can_be_disabled() {
     let _guard = current_dir_lock().lock().unwrap();
     let temp = tempfile::tempdir().unwrap();
-    let previous = std::env::current_dir().unwrap();
+    let _cwd_guard = CwdGuard(std::env::current_dir().unwrap());
     std::env::set_current_dir(temp.path()).unwrap();
     // ...
-    std::env::set_current_dir(previous).unwrap();
 }
```
As per coding guidelines, "Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant."
Also applies to: 313-351, 353-388, 390-413, 474-497
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/tests/coverage/launcher_tests.rs` around lines 223 - 252, The
test mutates process cwd and currently restores it with a trailing
std::env::set_current_dir(previous) which fails to run on panic; introduce a
small RAII guard (e.g., CwdGuard) that saves the previous cwd on creation and
restores it in Drop, then replace the manual previous/save/restore pattern in
tests like cursor_patching_can_be_disabled (and the other listed tests) by
creating a CwdGuard after setting temp dir (use current_dir_lock() as before) so
the cwd is always restored even if the test panics or returns early; ensure the
guard is used in the scope where PreparedRun::new and assertions run.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
crates/sidecar/src/gateway.rs (2)
545-549: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Stop recording `cookie` and generic API-key aliases.
`should_record_header` still allows secrets like `cookie`, `api-key`, and similar aliases into `LlmRequest.headers`, which then flow into downstream exports.
Proposed fix:
```diff
 fn should_record_header(name: &HeaderName) -> bool {
     should_forward_request_header(name)
         && name != http::header::AUTHORIZATION
+        && name != http::header::COOKIE
         && name.as_str() != "x-api-key"
+        && name.as_str() != "api-key"
+        && name.as_str() != "apikey"
         && name.as_str() != "anthropic-api-key"
 }
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/sidecar/src/gateway.rs` around lines 545 - 549, The should_record_header function currently permits secret headers like "cookie" and generic API-key aliases; update should_record_header (the function taking &HeaderName) to explicitly exclude cookie-related headers ("cookie" and "set-cookie") and common API-key variants ("api-key", "apikey", and any other vendor aliases such as "x-api-key" and "anthropic-api-key") and perform comparisons case-insensitively (e.g., use name.as_str().eq_ignore_ascii_case) while keeping the existing should_forward_request_header and the authorization check.
503-565: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Honor `Connection` header tokens when filtering headers.
The proxy still strips only the fixed hop-by-hop names. If a request or response carries `Connection: x-foo`, `x-foo` is still eligible for forwarding and recording because `is_hop_by_hop` never inspects the current header map. That breaks proxy semantics and can leak per-connection metadata.
Parse the message's `Connection` header into a per-request/per-response exclusion set and apply it in both the request and response filters.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/sidecar/src/gateway.rs` around lines 503 - 565, The code currently strips only fixed hop-by-hop header names; you must honor per-message Connection tokens by parsing the request/response HeaderMap's "connection" header into a set of token names and treating those tokens as hop-by-hop too. Update response_headers to first collect connection tokens from the provided HeaderMap and exclude any header whose name matches those tokens (in addition to calling is_hop_by_hop), and change the request-side predicates should_forward_request_header and should_record_header to accept either the HeaderMap or a precomputed set of connection token names so they can exclude those tokens as well; keep the existing fixed-name checks (HOST, CONTENT_LENGTH, ACCEPT_ENCODING, AUTHORIZATION, x-api-key, anthropic-api-key) and reuse or refactor is_hop_by_hop into a helper that only tests fixed names while combining it with the per-message token set when filtering.
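The per-message exclusion set the comment asks for can be derived roughly as follows. This is a standalone sketch with string-based names rather than the crate's `HeaderName`-based signatures; `connection_tokens` and this `is_hop_by_hop` variant are illustrative helpers, not existing API:

```rust
use std::collections::HashSet;

// Parse the `Connection` header value into a set of lowercase token names.
// Per RFC 9110 §7.6.1, each listed token names a header that is hop-by-hop
// for this message only. Field names are case-insensitive.
fn connection_tokens(connection_value: Option<&str>) -> HashSet<String> {
    connection_value
        .unwrap_or("")
        .split(',')
        .map(|token| token.trim().to_ascii_lowercase())
        .filter(|token| !token.is_empty())
        .collect()
}

// A header is hop-by-hop if it is one of the fixed names or is listed in
// this message's Connection header.
fn is_hop_by_hop(name: &str, connection: &HashSet<String>) -> bool {
    const FIXED: [&str; 7] = [
        "connection",
        "keep-alive",
        "proxy-authenticate",
        "proxy-authorization",
        "te",
        "transfer-encoding",
        "upgrade",
    ];
    let lower = name.to_ascii_lowercase();
    FIXED.contains(&lower.as_str()) || connection.contains(&lower)
}

fn main() {
    let tokens = connection_tokens(Some("keep-alive, X-Foo"));
    assert!(is_hop_by_hop("x-foo", &tokens));
    assert!(!is_hop_by_hop("Content-Type", &tokens));
}
```

Computing the set once per request/response and passing it into the forwarding and recording predicates keeps the fixed-name checks intact while honoring per-message tokens.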
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/sidecar/src/gateway.rs`:
- Around line 245-259: The code currently calls upstream_response.bytes() and
fully buffers the upstream body (see upstream_response.bytes(), the bytes
variable and response_json usage), which can OOM; replace this with a shared
bounded reader that enforces a maximum buffer size (e.g., a configured
MAX_BODY_BYTES) and choose an explicit overflow policy: either fail-fast by
returning a SidecarError::UpstreamOverflow and emitting end_llm(active, {
"error": "body_size_limit" }, { "http_status": status.as_u16(), "streaming":
false, "gateway_error": true, "stage": "body" }), or switch to passthrough
streaming by stopping buffering once the cap is hit and proxying the remaining
bytes directly to the client; implement the cap logic where bytes() is called
and also apply the same bounded-reader behavior to the other occurrence (around
lines referenced 375-378) so both non-streaming paths share the same
bounded-read helper and overflow handling.
---
Duplicate comments:
In `@crates/sidecar/src/gateway.rs`:
- Around line 545-549: The should_record_header function currently permits
secret headers like "cookie" and generic API-key aliases; update
should_record_header (the function taking &HeaderName) to explicitly exclude
cookie-related headers ("cookie" and "set-cookie") and common API-key variants
("api-key", "apikey", and any other vendor aliases such as "x-api-key" and
"anthropic-api-key") and perform comparisons case-insensitively (e.g., use
name.as_str().eq_ignore_ascii_case) while keeping the existing
should_forward_request_header and the authorization check.
- Around line 503-565: The code currently strips only fixed hop-by-hop header
names; you must honor per-message Connection tokens by parsing the
request/response HeaderMap's "connection" header into a set of token names and
treating those tokens as hop-by-hop too. Update response_headers to first
collect connection tokens from the provided HeaderMap and exclude any header
whose name matches those tokens (in addition to calling is_hop_by_hop), and
change the request-side predicates should_forward_request_header and
should_record_header to accept either the HeaderMap or a precomputed set of
connection token names so they can exclude those tokens as well; keep the
existing fixed-name checks (HOST, CONTENT_LENGTH, ACCEPT_ENCODING,
AUTHORIZATION, x-api-key, anthropic-api-key) and reuse or refactor is_hop_by_hop
into a helper that only tests fixed names while combining it with the
per-message token set when filtering.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: a870b963-f006-4ee5-8568-231bf1fa4f3e
📒 Files selected for processing (1)
crates/sidecar/src/gateway.rs
📜 Review details
🧰 Additional context used
📓 Path-based instructions (7)
**/*.rs
📄 CodeRabbit inference engine (.agents/skills/test-ffi-surface/SKILL.md)
**/*.rs: Run `cargo fmt --all` for FFI work as it is Rust work
Run `just test-rust` for FFI validation
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce warnings-as-errors linting
When Rust files changed as part of Python work, run `cargo fmt --all`, `just test-rust`, and `cargo clippy --workspace --all-targets -- -D warnings`
**/*.rs: Run `cargo fmt --all` for Rust code formatting
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce Rust linting with no warnings
Run `just test-rust` as the shared-runtime build/test wrapper for Rust changes
Use Rust snake_case naming convention for Rust code
**/*.rs: Any Rust change must run `just test-rust`
Any Rust change must run `cargo fmt --all`
Any Rust change must run `cargo clippy --workspace --all-targets -- -D warnings`
**/*.rs: Run `cargo fmt --all` for Rust code formatting when Node changes touch Rust files
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce strict linting when Rust files changed as part of Node work
**/*.rs: Always run `just test-rust` when any Rust code changes
Always run `cargo fmt --all` when any Rust code changes
Always run `cargo clippy --workspace --all-targets -- -D warnings` when any Rust code changes
Files:
crates/sidecar/src/gateway.rs
**/*.{rs,go,js,ts,tsx}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Include SPDX license headers in all Rust, Go, JavaScript, and TypeScript source files using C-style comment syntax
Files:
crates/sidecar/src/gateway.rs
**/*.{rs,py,go,js,ts,tsx}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Use `SONAR_IGNORE_START`/`SONAR_IGNORE_END` markers only for documented false positives that cannot be resolved in code; keep ignored blocks small, add explanatory comments, and require reviewer sign-off
Files:
crates/sidecar/src/gateway.rs
**/*.{js,ts,tsx,jsx,py,rs,go,java,c,cpp,h,cc,cxx,cs,rb,php,swift,kt}
📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)
Changed files must be formatted with the language-native formatter
Files:
crates/sidecar/src/gateway.rs
**/*.{rs,py,js,ts,go,sh,toml,yaml,yml,json,md,rst}
📄 CodeRabbit inference engine (AGENTS.md)
Keep SPDX headers on source, docs, scripts, and configuration files. The project is Apache-2.0.
Files:
crates/sidecar/src/gateway.rs
**/*.{py,js,ts,tsx,go,rs,md}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
Format changed files with the language-native formatter before the final lint/test pass
Files:
crates/sidecar/src/gateway.rs
**/*.{rs,py,js,ts,tsx,go}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
During iteration, prefer `uv run pre-commit run --files <changed files...>` for targeted validation
Files:
crates/sidecar/src/gateway.rs
```rust
let bytes = match upstream_response.bytes().await {
    Ok(bytes) => bytes,
    Err(error) => {
        sessions
            .end_llm(
                active,
                json!({ "error": error.to_string() }),
                json!({ "http_status": status.as_u16(), "streaming": false, "gateway_error": true, "stage": "body" }),
            )
            .await?;
        return Err(SidecarError::Upstream(error));
    }
};
let response_json = serde_json::from_slice::<Value>(&bytes)
    .unwrap_or_else(|_| json!({ "body_bytes": bytes.len() }));
```
Cap buffered upstream responses.
Both non-streaming paths call reqwest::Response::bytes() and materialize the entire upstream body in memory. A large or malicious upstream response can OOM the sidecar before you ever emit the end event or proxy the payload back.
Please put these paths behind a shared bounded reader and decide on an overflow behavior explicitly: fail fast, or stop recording once the cap is reached and switch to passthrough streaming for the client response.
Also applies to: 375-378
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/src/gateway.rs` around lines 245 - 259, The code currently
calls upstream_response.bytes() and fully buffers the upstream body (see
upstream_response.bytes(), the bytes variable and response_json usage), which
can OOM; replace this with a shared bounded reader that enforces a maximum
buffer size (e.g., a configured MAX_BODY_BYTES) and choose an explicit overflow
policy: either fail-fast by returning a SidecarError::UpstreamOverflow and
emitting end_llm(active, { "error": "body_size_limit" }, { "http_status":
status.as_u16(), "streaming": false, "gateway_error": true, "stage": "body" }),
or switch to passthrough streaming by stopping buffering once the cap is hit and
proxying the remaining bytes directly to the client; implement the cap logic
where bytes() is called and also apply the same bounded-reader behavior to the
other occurrence (around lines referenced 375-378) so both non-streaming paths
share the same bounded-read helper and overflow handling.
Updated the Codex configuration to replace the deprecated `features.codex_hooks` with `features.hooks`, requiring `codex-cli >= 0.129.0`. Adjusted related tests and documentation to reflect this change. Signed-off-by: Ajay Thorve <athorve@nvidia.com>
Without this, upstream providers honored the client's Accept-Encoding and returned gzip/br/zstd-encoded bodies. The bytes were forwarded to the client unchanged (which decoded normally), but the sidecar's observability capture stored the still-compressed bytes in `output.value` on LLM spans and in ATIF trajectory bodies, leaving both unreadable for any downstream consumer (Phoenix, ATIF, ATOF). Stripping the header forces upstreams to return identity-encoded bodies. The bandwidth cost is paid only on the sidecar-upstream hop; the client never asked for the encoding it would have received, so its decoders never trigger. Closes NMF-84. Signed-off-by: Ajay Thorve <athorve@nvidia.com>
Force-pushed 74189e8 to eb16758
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/sidecar/src/launcher.rs`:
- Around line 559-563: The toml_string function only escapes backslashes and
quotes, which can produce invalid TOML for values containing control characters
(newlines, tabs, etc.); update toml_string to produce correct TOML basic-string
escaping by either (1) replacing it with
toml::Value::String(value.to_string()).to_string() if the toml crate is
available, or (2) extend the escaping logic in toml_string to handle all
TOML-required escapes (e.g., \n, \t, \r, \uXXXX for control characters) so that
values like base_url always serialize into valid TOML.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: d340a09b-2be8-4446-841d-36be1689170c
📒 Files selected for processing (5)
crates/sidecar/src/gateway.rs, crates/sidecar/src/launcher.rs, crates/sidecar/tests/coverage/launcher_tests.rs, docs/integrate-frameworks/coding-agent-codex.md, integrations/coding-agents/codex/README.md
📜 Review details
🧰 Additional context used
📓 Path-based instructions (30)
**/*.{md,rst,html,txt}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-brand-terminology.md)
**/*.{md,rst,html,txt}: Always spell `NVIDIA` in all caps. Do not use `Nvidia`, `nvidia`, `nVidia`, `nVIDIA`, or `NV`.
Use `an NVIDIA` before a noun because the name starts with an 'en' sound.
Do not add a registered trademark symbol after `NVIDIA` when referring to the company.
Use trademark symbols with product names only when the document type or legal guidance requires them.
Verify official capitalization, spacing, and hyphenation for product names.
Precede NVIDIA product names with `NVIDIA` on first mention when it is natural and accurate.
Do not rewrite product names for grammar or title-case rules.
Preserve third-party product names according to the owner's spelling.
Include the company name and full model qualifier on first use when it helps identify the model.
Preserve the official capitalization and punctuation of model names.
Use shorter family names only after the full name is established.
Spell out a term on first use and put the acronym in parentheses unless the acronym is widely understood by the intended audience.
Use the acronym on later mentions after it has been defined.
For long documents, reintroduce the full term if readers might lose context.
Form plurals of acronyms with `s`, not an apostrophe, such as `GPUs`.
In headings, common acronyms can remain abbreviated. Spell out the term in the first or second sentence of the body.
Common terms such as `CPU`, `GPU`, `PC`, `API`, and `UI` usually do not need to be spelled out for developer audiences.
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md
**/*.{md,rst,html}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-brand-terminology.md)
Link the first mention of a product name when the destination helps the reader.
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md
**/*.{md,rst,txt}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-guide.md)
**/*.{md,rst,txt}: Spell `NVIDIA` in all caps. Do not use `Nvidia`, `nvidia`, or `NV`.
Format commands, code elements, expressions, package names, file names, and paths as inline code.
Use descriptive link text. Avoid raw URLs and weak anchors such as 'here' or 'read more.'
Use title case consistently for technical documentation headings.
Introduce code blocks, lists, tables, and images with complete sentences.
Write procedures as imperative steps. Keep steps parallel and split long procedures into smaller tasks.
Prefer active voice, present tense, short sentences, contractions, and plain English.
Use `can` for possibility and reserve `may` for permission.
Use `after` for temporal relationships instead of `once`.
Prefer `refer to` over `see` when the wording points readers to another resource.
Avoid culture-specific idioms, unnecessary Latinisms, jokes, and marketing exaggeration in technical documentation.
Spell out months in body text, avoid ordinal dates, and use clear time zones.
Spell out whole numbers from zero through nine unless they are technical values, parameters, versions, or UI values.
Use numerals for 10 or greater and include commas in thousands.
Do not add trademark symbols to learning-oriented documentation unless the source, platform, or legal guidance explicitly requires them.
Do not add trademark symbols to NeMo Flow learning documentation by default.
Do not rewrite API names, package names, command flags, or code literals for style reasons.
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md
**/*.{md,markdown,rst}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-technical-docs.md)
**/*.{md,markdown,rst}: Use title case consistently in technical documentation headings
Avoid quotation marks, ampersands, and exclamation marks in headings
Keep product, event, research, and whitepaper names in their official title case
Use title case for table headers
Do not force social-media sentence case into technical docs
Use monospace formatting for code elements, commands, parameters, package names, and expressions
Use monospace formatting for directories, file names, and paths
Use angle brackets inside monospace for variables inside paths, such as `/home/<username>/.login`
Use quotation marks for error messages and strings in documentation
Use bold formatting for UI buttons, menus, fields, and labels in documentation
Use angle brackets between UI labels for menu paths, such as File > Save As
Use italics for new terms on first use in documentation
Use italics for publication titles in documentation
Use plain text formatting for keyboard shortcuts in documentation
Prefer `[NVIDIA/NeMo](link)` format for GitHub repository references over generic phrases like 'the GitHub repo'
Introduce every code block with a complete sentence
Do not make a code block complete the grammar of the previous sentence
Do not continue a sentence after a code block
Use syntax highlighting when the format supports it for code blocks
Avoid the word 'snippet' unless the surrounding docs already use it as a term of art
Keep inline method, function, and class references consistent with nearby docs, omitting empty parentheses for prose readability when no call is shown
Use descriptive anchor text that matches the destination title when possible for links
Avoid raw URLs in running text in documentation
Avoid generic link anchors such as 'here,' 'this page,' and 'read more' in documentation
Include the acronym in link text if a linked term includes an acronym
Do not link long sentences or multiple sentences in documentation
Avoid links that pull readers away from a procedure unles...
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md
**/*.{html,md}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Include SPDX license headers in HTML and Markdown files using HTML comment syntax
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md
**/README.md
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Update relevant crate or package README when that surface changed
Relevant package or crate README.md files must be updated when examples or binding guidance changes
Files:
integrations/coding-agents/codex/README.md
**/*.md
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Run Markdown link checking via `lychee` for `README.md`, `CONTRIBUTING.md`, and `docs/` through pre-commit hooks
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md
**/*.{md,markdown,py,sh,bash,js,ts,java,cpp,go,rust}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Keep package names, repo references, and build commands current in documentation
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md
**/*.{md,markdown,py,sh,bash}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Keep stable user-facing wrappers at scripts/ root in docs and examples; only point at namespaced helper paths when documenting internal maintenance work
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md
**/*.{md,markdown,py,sh,bash,js,ts,example}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Example commands must match current package names and paths
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md
**/{integrations,integration,*-integration}/**
📄 CodeRabbit inference engine (.agents/skills/contribute-integration/SKILL.md)
**/{integrations,integration,*-integration}/**: Keep NeMo Flow optional in framework integrations
Preserve the framework's original behavior when NeMo Flow is absent
Wrap tool and LLM paths at the correct framework boundary
Integration pattern must follow `docs/integrate-frameworks/adding-scopes.md`
Files:
integrations/coding-agents/codex/README.md
**/*.{md,txt,rst}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/SKILL.md)
**/*.{md,txt,rst}: Ensure commands, package names, file paths, and APIs in documentation are correct and not stale; flag incorrect or outdated information as blocking issues
Ensure examples and procedures in documentation will execute successfully with current APIs and commands
Use consistent user-facing terminology throughout documentation that matches current repo terminology
Capitalize NVIDIA correctly in all documentation and public-facing text
Format code, commands, paths, and filenames as inline code (monospace) in documentation
Use descriptive anchor text for links instead of bare URLs or weak labels like 'here' in documentation
Prefer active voice, present tense, short sentences, and plain English in documentation
Structure documentation procedures as imperative steps that are easy to scan and not too long for a single sequence
Prefer 'after' instead of 'once' for temporal references in documentation
Use 'can' instead of 'may' when describing possibility (rather than permission) in documentation
Avoid ambiguous numeric dates and ordinal dates in documentation body text
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md
{README.md,docs/index.md,**/README.md}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/SKILL.md)
Update entry-point documentation (README.md, docs/index.md, package READMEs, binding-level source READMEs) whenever public behavior changes
Files:
integrations/coding-agents/codex/README.md
**/*.{rs,py,js,ts,go,sh,toml,yaml,yml,json,md,rst}
📄 CodeRabbit inference engine (AGENTS.md)
Keep SPDX headers on source, docs, scripts, and configuration files. The project is Apache-2.0.
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md, crates/sidecar/src/gateway.rs, crates/sidecar/tests/coverage/launcher_tests.rs, crates/sidecar/src/launcher.rs
**/*.{py,js,ts,tsx,go,rs,md}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
Format changed files with the language-native formatter before the final lint/test pass
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md, crates/sidecar/src/gateway.rs, crates/sidecar/tests/coverage/launcher_tests.rs, crates/sidecar/src/launcher.rs
docs/**/*.md
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Run `./scripts/build-docs.sh` for documentation site changes
docs/**/*.md: Relevant getting-started or reference docs must be updated when examples change
Release-policy docs must point to GitHub Releases as the only release-history source of truth
docs/**/*.md: Use title case for headings in technical documentation
Introduce code blocks, tables, and lists with complete lead-in sentences in documentation
Files:
docs/integrate-frameworks/coding-agent-codex.md
{RELEASING.md,CHANGELOG.md,docs/**/*.md}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Keep release-process and release-notes guidance in repo-maintainer docs such as RELEASING.md, not as user-facing docs pages or CHANGELOG.md
Files:
docs/integrate-frameworks/coding-agent-codex.md
docs/integrate-frameworks/**/*.md
📄 CodeRabbit inference engine (.agents/skills/contribute-integration/SKILL.md)
Documentation must be updated if activation or usage of the integration changed
Files:
docs/integrate-frameworks/coding-agent-codex.md
{scripts/*.sh,docs/**/*.md}
📄 CodeRabbit inference engine (.agents/skills/contribute-integration/SKILL.md)
Use root `./scripts/*.sh` commands in docs and contributor guidance as documented, with implementations under `scripts/third-party/`
Files:
docs/integrate-frameworks/coding-agent-codex.md
{docs/**,examples/**,crates/adaptive/**,python/nemo_flow/**,go/nemo_flow/**,**/{example,component}.{ts,tsx,js,rs,py,go}}
📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)
Any new adaptive component kind must have documentation, examples, and binding coverage across all supported languages
Files:
docs/integrate-frameworks/coding-agent-codex.md
{README*,CHANGELOG*,docs/**/*.{md,rst,txt},examples/**/*,*.md}
📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)
Update documentation, examples, and getting-started guides with new package/module/crate names after rename operations
Files:
docs/integrate-frameworks/coding-agent-codex.md
{README.md,docs/**/*.md,examples/**/*.{js,ts,py,go,rs}}
📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)
Keep documentation and examples synchronized with current install, import, and build commands
Files:
docs/integrate-frameworks/coding-agent-codex.md
{README.md,CONTRIBUTING.md,docs/**/*.md}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
For docs-only changes, run targeted checks only if commands, package names, or examples changed. Use `just docs` for docs-site builds and `just docs-linkcheck` when links changed
Files:
docs/integrate-frameworks/coding-agent-codex.md
{docs/**,README.md,CONTRIBUTING.md,RELEASING.md,SECURITY.md}
⚙️ CodeRabbit configuration file
{docs/**,README.md,CONTRIBUTING.md,RELEASING.md,SECURITY.md}: Review documentation for technical accuracy against the current API, command correctness, and consistency across language bindings.
Flag stale examples, missing SPDX headers where required, and instructions that no longer match CI or pre-commit behavior.
Files:
docs/integrate-frameworks/coding-agent-codex.md
**/*.rs
📄 CodeRabbit inference engine (.agents/skills/test-ffi-surface/SKILL.md)
**/*.rs: Run `cargo fmt --all` for FFI work as it is Rust work
Run `just test-rust` for FFI validation
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce warnings-as-errors linting. When Rust files changed as part of Python work, run `cargo fmt --all`, `just test-rust`, and `cargo clippy --workspace --all-targets -- -D warnings`
**/*.rs: Run `cargo fmt --all` for Rust code formatting
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce Rust linting with no warnings
Run `just test-rust` as the shared-runtime build/test wrapper for Rust changes. Use Rust snake_case naming convention for Rust code
**/*.rs: Any Rust change must run `just test-rust`
Any Rust change must run `cargo fmt --all`
Any Rust change must run `cargo clippy --workspace --all-targets -- -D warnings`
**/*.rs: Run `cargo fmt --all` for Rust code formatting when Node changes touch Rust files
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce strict linting when Rust files changed as part of Node work
**/*.rs: Always run `just test-rust` when any Rust code changes
Always run `cargo fmt --all` when any Rust code changes
Always run `cargo clippy --workspace --all-targets -- -D warnings` when any Rust code changes
Files:
crates/sidecar/src/gateway.rs, crates/sidecar/tests/coverage/launcher_tests.rs, crates/sidecar/src/launcher.rs
**/*.{rs,go,js,ts,tsx}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Include SPDX license headers in all Rust, Go, JavaScript, and TypeScript source files using C-style comment syntax
Files:
crates/sidecar/src/gateway.rs, crates/sidecar/tests/coverage/launcher_tests.rs, crates/sidecar/src/launcher.rs
**/*.{rs,py,go,js,ts,tsx}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Use `SONAR_IGNORE_START`/`SONAR_IGNORE_END` markers only for documented false positives that cannot be resolved in code; keep ignored blocks small, add explanatory comments, and require reviewer sign-off
Files:
crates/sidecar/src/gateway.rs, crates/sidecar/tests/coverage/launcher_tests.rs, crates/sidecar/src/launcher.rs
**/*.{js,ts,tsx,jsx,py,rs,go,java,c,cpp,h,cc,cxx,cs,rb,php,swift,kt}
📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)
Changed files must be formatted with the language-native formatter
Files:
crates/sidecar/src/gateway.rs, crates/sidecar/tests/coverage/launcher_tests.rs, crates/sidecar/src/launcher.rs
**/*.{rs,py,js,ts,tsx,go}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
During iteration, prefer `uv run pre-commit run --files <changed files...>` for targeted validation
Files:
crates/sidecar/src/gateway.rs, crates/sidecar/tests/coverage/launcher_tests.rs, crates/sidecar/src/launcher.rs
{crates/**/tests/**,python/tests/**,go/nemo_flow/**/*_test.go}
⚙️ CodeRabbit configuration file
{crates/**/tests/**,python/tests/**,go/nemo_flow/**/*_test.go}: Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant.
Prefer assertions on lifecycle events, scope stacks, middleware ordering, and binding parity over shallow smoke tests.
Files:
crates/sidecar/tests/coverage/launcher_tests.rs
🔇 Additional comments (12)
crates/sidecar/src/gateway.rs (4)
245-246: Upstream response bodies are still fully buffered without a cap. Line 245 and Line 378 still call `upstream_response.bytes().await`, which can OOM on large/malicious upstream payloads. This was previously raised and remains unresolved. Also applies to: 378-378
503-510: Connection-token hop-by-hop headers are not stripped. Filtering only fixed hop-by-hop names misses headers listed dynamically in `Connection`, so those can still be forwarded/returned. This was previously raised and still applies. Also applies to: 530-532, 554-565
545-550: Sensitive header recording filter is still incomplete. `should_record_header` still allows values like `cookie` and generic API-key aliases to be recorded. This was already flagged and is still unresolved.
16-17: Good fix: request-body reads are now bounded. Using `MAX_BODY_BYTES` with `to_bytes` closes the prior unbounded-request-body risk cleanly. Also applies to: 68-70
crates/sidecar/src/launcher.rs (3)
475-480: Product name miscapitalized in user-facing Codex provider config. Line 477 uses `"Nemo Flow OpenAI"` but the product name is `"NeMo Flow OpenAI"` (capital M).

```diff
- "model_providers.nemo-flow-openai={{name=\"Nemo Flow OpenAI\",base_url={},wire_api=\"responses\",requires_openai_auth=true,supports_websockets=false}}",
+ "model_providers.nemo-flow-openai={{name=\"NeMo Flow OpenAI\",base_url={},wire_api=\"responses\",requires_openai_auth=true,supports_websockets=false}}",
```
108-111: Server startup error discarded on health probe timeout. When `wait_for_health` fails, `let _ = running_server.stop().await` discards any real server error, turning startup/config failures into a misleading readiness timeout message.

Suggested fix:

```diff
 let running_server = RunningSidecar::start(listener, sidecar_config);
 if let Err(error) = wait_for_health(sidecar_url).await {
-    let _ = running_server.stop().await;
-    return Err(error);
+    return match running_server.stop().await {
+        Ok(()) => Err(error),
+        Err(server_error) => Err(SidecarError::Launch(format!(
+            "{server_error}; health check also failed: {error}"
+        ))),
+    };
 }
```
168-180: `split_whitespace()` breaks quoted paths in configured commands. Paths with spaces (e.g., `"C:\Program Files\Codex\codex.exe" --full-auto`) will be corrupted. The comment acknowledges this as a convenience tradeoff, but consider using a shell-style splitter like `shlex::split` for cross-platform robustness.

integrations/coding-agents/codex/README.md (1)
103-104: Product name miscapitalized. Line 104 uses `"Nemo Flow OpenAI"` but the product name is `"NeMo Flow OpenAI"`.

```diff
-name = "Nemo Flow OpenAI"
+name = "NeMo Flow OpenAI"
```

docs/integrate-frameworks/coding-agent-codex.md (2)
89-90: Product name miscapitalized. Line 90 uses `"Nemo Flow OpenAI"` but the product name is `"NeMo Flow OpenAI"`.

```diff
-name = "Nemo Flow OpenAI"
+name = "NeMo Flow OpenAI"
```
111-114: Documentation references deprecated configuration keys. Lines 112-113 document `features.codex_hooks=true` and `codex_hooks = true`, but the implementation at `launcher.rs:321` uses the canonical `features.hooks=true`. This inconsistency will confuse users.

Suggested fix:

```diff
 The transparent wrapper passes hook entries as Codex CLI config overrides and
-sets `features.codex_hooks=true` for that launched process. Persistent install
-writes `.codex/config.toml` with `codex_hooks = true` and merges generated hook
+sets `features.hooks=true` for that launched process. Persistent install
+writes `.codex/config.toml` with `hooks = true` and merges generated hook
 entries into `.codex/hooks.json`.
```

crates/sidecar/tests/coverage/launcher_tests.rs (2)
223-252: Process cwd restoration is not panic-safe. Tests that mutate the process working directory (lines 227-228, 251) restore it with a trailing `set_current_dir(previous)`. If the test panics before reaching line 251, subsequent tests run in the wrong directory. Use a RAII guard pattern.

Suggested pattern:

```rust
struct CwdGuard(PathBuf);

impl Drop for CwdGuard {
    fn drop(&mut self) {
        let _ = std::env::set_current_dir(&self.0);
    }
}

#[test]
fn cursor_patching_can_be_disabled() {
    let _guard = current_dir_lock().lock().unwrap();
    let temp = tempfile::tempdir().unwrap();
    let _cwd_guard = CwdGuard(std::env::current_dir().unwrap());
    std::env::set_current_dir(temp.path()).unwrap();
    // ... test body ...
    // No manual restore needed; CwdGuard::drop handles it
}
```

This pattern should be applied to all affected tests: `cursor_patching_can_be_disabled`, `cursor_patch_restore_restores_original_file`, `cursor_patch_restore_uses_nearest_project_cursor_dir`, `cursor_patch_restore_removes_temporary_file`, and `cursor_dry_run_does_not_write_hooks`.
1-593: LGTM on test coverage. The test suite comprehensively covers agent inference, configured command resolution, agent-specific preparation (Codex, Claude, Hermes, Cursor), cursor hook patch/restore scenarios, dry-run behavior, and async runtime behavior including sidecar URL injection and exit code propagation. The OS-specific fake agent commands are well-structured for cross-platform testing.
```rust
// Escapes a Rust string as a TOML basic string for inline Codex configuration values.
fn toml_string(value: &str) -> String {
    let escaped = value.replace('\\', "\\\\").replace('"', "\\\"");
    format!("\"{escaped}\"")
}
```
🧹 Nitpick | 🔵 Trivial | 💤 Low value
TOML string escaping is minimal.
toml_string escapes only \ and ". TOML basic strings also require escaping control characters (newlines, tabs, etc.). While unlikely in URLs, a base_url containing such characters would produce invalid TOML.
Consider using the toml crate for serialization
If the toml crate is already a dependency, use its serialization to guarantee correctness:
fn toml_string(value: &str) -> String {
toml::Value::String(value.to_string()).to_string()
}Otherwise, the current implementation is acceptable for typical URL values.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/src/launcher.rs` around lines 559 - 563, The toml_string
function only escapes backslashes and quotes, which can produce invalid TOML for
values containing control characters (newlines, tabs, etc.); update toml_string
to produce correct TOML basic-string escaping by either (1) replacing it with
toml::Value::String(value.to_string()).to_string() if the toml crate is
available, or (2) extend the escaping logic in toml_string to handle all
TOML-required escapes (e.g., \n, \t, \r, \uXXXX for control characters) so that
values like base_url always serialize into valid TOML.
…odex)
Three independent issues silently disabled transparent-run hooks for
Claude Code; this commit fixes all three:
1. Hook subprocesses couldn't find the sidecar binary on PATH.
`hook_forward_command` returned the bare name `nemo-flow-sidecar`,
which Claude/Codex/Cursor passed to their hook subprocesses. Those
subprocesses inherit the agent's environment, where the sidecar's
install location (e.g. target/debug) typically isn't on PATH, so
every hook exited with status 127 (command not found). Add an
`executable` parameter so transparent-run callers can pass
`current_exe()`; persistent install keeps the bare name.
2. Claude Code 2.1.x's hook loader strict-rejects unknown event names —
any single unknown event causes the ENTIRE hooks file to be rejected
silently, disabling all hooks. We were injecting
`AfterAgentResponse` and `AfterAgentThought`, neither in Claude's
whitelist. Drop both. Also add `PermissionRequest` and `PostCompact`,
which are in Claude's whitelist and in Codex 0.129's event surface.
3. Claude's hook output validator rejects null for optional string
fields. The Claude adapter's Stop response carried `stopReason: null`
and was rejected with a runtime validation error. Omit the field
entirely; the response collapses to the generic `{ continue: true }`.
Also regenerate the static integration bundles for claude-code and
codex with the corrected event list. The same whitelist mismatch broke
persistent install for Claude users.
Verified end-to-end against Claude Code 2.1.137:
- SessionStart, UserPromptSubmit, PreToolUse:WebSearch,
PostToolUse:WebSearch, Stop, SubagentStop all fire successfully
- Phoenix shows [AGENT] claude-code root span with tool spans
- ATIF written on session-end (1.7 MB / 1.3 MB across two sessions)
Codex 0.129 lifecycle (SessionStart through PreToolUse/PostToolUse) now
also fires correctly — Phoenix shows [AGENT] codex root with Bash tool
spans. Codex transparent-run ATIF flush still pending: codex 0.129 has
no SessionEnd-equivalent hook, so end_agent never runs without a
wrapper-driven flush. Tracking that piece separately.
Signed-off-by: Ajay Thorve <athorve@nvidia.com>
…gateway" Previously, when a real agent's gateway request beat its SessionStart hook, `SessionManager::start_llm` created the session with `AgentKind::Gateway`. Observer identities (ATIF agent name, Phoenix scope name) are baked at scope-open time inside `ensure_agent_started`, so the session stayed labeled "gateway" for its entire lifetime even after SessionStart arrived. Result: Phoenix root spans for real Claude/Codex sessions read as `[AGENT] gateway` instead of `[AGENT] claude-code` / `[AGENT] codex`.

Two targeted fixes:

1. In `SessionManager::start_llm`, infer the agent kind from the gateway provider when creating a new session: `anthropic.messages` and `anthropic.count_tokens` → `ClaudeCode`; `openai.responses` → `Codex`. Unknown providers (e.g., `openai.chat_completions`, `openai.models`) keep the legacy `Gateway` label so synthetic gateway-only sessions for unattributed proxy traffic remain semantically intact.

2. In `SessionManager::apply_events`, when an `AgentStarted` event arrives for an existing session whose `agent_kind` is still `Gateway` (i.e., the gateway path created the session before the hook), upgrade the session's `agent_kind` to the event's real kind. This fixes session-level metadata for the rare case where provider inference fell back to `Gateway` (e.g., chat-completions traffic) but a SessionStart hook later identifies the agent.

Tests cover all three branches: anthropic gateway → claude-code label, openai.responses gateway → codex label, ambiguous provider → gateway label preserved.

Signed-off-by: Ajay Thorve <athorve@nvidia.com>
…nd hook flag Three small fixes from CodeRabbit's pass on PR NVIDIA#63:

1. `Nemo Flow OpenAI` → `NeMo Flow OpenAI` in the codex transparent-run provider config (`launcher.rs:477`) and matching docs in `docs/integrate-frameworks/coding-agent-codex.md` and `integrations/coding-agents/codex/README.md`. Use the official NeMo Flow capitalization in user-facing TOML config and docs.

2. Update codex hook configuration documentation to use the canonical `features.hooks` / `hooks = true` keys instead of the legacy `features.codex_hooks` / `codex_hooks = true`. The legacy spelling is still accepted as an alias on codex 0.129+ but the canonical form is what new users should configure. Retain a one-line note that the old spelling is the legacy alias so users on older codex still understand the deprecation path.

Resolves CodeRabbit comments 3211156569, 3211760910, 3211760919, 3211156599. No code behavior changes; 128 sidecar tests still pass.

Signed-off-by: Ajay Thorve <athorve@nvidia.com>
Codex 0.129's hook surface has no `SessionEnd`-equivalent event — its closest session-termination signal, `Stop`, fires per-conversation-turn rather than at session end. As a result, codex transparent runs through `nemo-flow-sidecar run -- codex` produced empty `.nemo-flow/atif/` directories: the existing `flush_observers` path (`end_agent` → `write_atif`) never fired.

Add a new `NormalizedEvent::TurnEnded` variant. When `classify` sees a `Stop` hook AND the primary classification is not already terminal, it emits both the existing `LlmHint` (preserving correlation behavior for subsequent LLM calls) AND a `TurnEnded` event. `Session::apply` routes `TurnEnded` to a new `snapshot_atif` method that writes the cumulative ATIF without closing the agent scope or shutting down the OpenInference subscriber. The next turn keeps adding events; the final Stop's snapshot is the complete trajectory.

The `is_terminal()` guard is critical: Cursor's adapter classifies `stop` as `AgentEnded`, so without the guard we'd emit `[AgentEnded, TurnEnded]` — the first call tears down the session and writes ATIF, the second creates an empty synthetic session and overwrites the freshly-written ATIF with an empty trajectory.

Safety verified by reading `AtifExporter`: `export()` takes `&self`, documented as "does not clear the buffered events." Multiple snapshot writes are cumulative supersets, so last-write-wins yields the complete trajectory.

E2E verified against codex 0.129:
- Real codex session writes a 1.6 MB ATIF on Ctrl+D exit (was 0 bytes before)
- Multi-turn captured (5 steps: 2 user, 2 agent, 1 system)
- `agent.name = "codex"` (from earlier provider-inference fix)
- Phoenix root remains `[AGENT] codex` with proper nested LLM/tool spans

Hermes is unaffected — it has no `Stop` hook event.

Partially closes NMF-91.
The remaining cross-process subagent unification gap in NMF-91 stays upstream-blocked: codex spawns subagents as separate codex processes with no `SubagentStart` hook, so each subagent's gateway-only synthetic session is still its own Phoenix trace and its own ATIF (or no ATIF when no Stop fires for it).

Tests:
- adapter: codex Stop emits TurnEnded
- adapter: claude Stop emits TurnEnded
- adapter: cursor stop does NOT also emit TurnEnded (regression: would wipe ATIF)
- session: TurnEnded snapshots ATIF without closing scope; cumulative across turns
- session: TurnEnded is a no-op when the session has no agent scope

Signed-off-by: Ajay Thorve <athorve@nvidia.com>
Defines the streaming counterpart to LlmResponseCodec, used with
llm_stream_call_execute. Each provider's StreamingCodec impl owns the
incremental state needed to assemble per-chunk events into a single
non-streaming-shape JSON payload, which the matching LlmResponseCodec
then decodes into AnnotatedLlmResponse.
Per Will's design suggestion in NMF-90:
trait StreamingCodec {
    fn collector(&self) -> LlmCollectorFn;
    fn finalizer(&self) -> LlmFinalizerFn;
}
Includes an SseEventDecoder utility that incrementally splits a
text/event-stream byte stream into parsed SSE events. Anthropic
Messages, OpenAI Responses, and OpenAI Chat Completions all emit one
JSON object per data: line, so the decoder is shared infrastructure
for the per-provider StreamingCodec impls landing in follow-up commits.
Refs NMF-90.
Signed-off-by: Ajay Thorve <athorve@nvidia.com>
Implements StreamingCodec for the Anthropic Messages SSE event sequence.

The collector observes message_start, content_block_start/delta/stop, message_delta, and message_stop events, accumulating per-block state (text, tool input partial_json, citations) keyed by content_block index. The finalizer produces a JSON object wire-compatible with RawAnthropicResponse — the same shape Anthropic returns for a non-streaming Messages request — so the assembled output can be fed back through AnthropicMessagesCodec::decode_response to produce the canonical AnnotatedLlmResponse. Streaming and non-streaming Anthropic requests now converge on the same observability output without per-route shape duplication.

Internal state lives behind Arc<Mutex<...>> so the &self-produced collector and finalizer closures share access. Each instance is single-use because LlmFinalizerFn consumes the finalize step.

Tests cover:
- text-only response with usage and stop_reason mapping
- tool_use input assembled from partial_json fragments
- web_search_tool_result blocks (full content delivered at start)
- text blocks with citations_delta accumulation
- truncated streams falling back to raw partial_json string

Refs NMF-90.

Signed-off-by: Ajay Thorve <athorve@nvidia.com>
Implements StreamingCodec for the OpenAI Responses SSE event sequence.

The Responses API streams either complete `response` snapshots (response.created, response.in_progress, response.completed, response.failed, response.incomplete) or final-state output items (response.output_item.done), so the codec:

1. Tracks the latest response snapshot — terminal events typically carry the complete state including `output`.
2. Tracks output items by output_index from output_item.done events, used as a fallback when the terminal snapshot lacks `output`.
3. Ignores per-token deltas (output_text.delta, function_call_arguments.delta) because their content is redelivered in the matching output_item.done.

The finalizer produces a JSON object wire-compatible with RawResponsesResponse — feed it back through OpenAIResponsesCodec::decode_response to produce the canonical AnnotatedLlmResponse.

Tests cover:
- terminal response.completed carries full state (common case)
- terminal completed lacks output → fall back to output_item.done items
- response.incomplete with reason maps to FinishReason::Length
- per-token deltas don't double-accumulate

Refs NMF-90.

Signed-off-by: Ajay Thorve <athorve@nvidia.com>
Implements StreamingCodec for the OpenAI Chat Completions SSE chunk sequence.

Each delta chunk may carry role (typically only the first), incremental content fragments, partial tool_calls whose function.arguments stream as a string, or a finish_reason terminator. Top-level fields (id, object, created, model) are repeated on every chunk; we capture them once. Final-chunk usage is preserved when emitted (only sent when stream_options.include_usage is set).

The finalizer produces a JSON object wire-compatible with a non-streaming Chat Completions response — feed it back through OpenAIChatCodec::decode_response to produce the canonical AnnotatedLlmResponse. The `object` field is normalized from `chat.completion.chunk` to `chat.completion` so the assembled JSON matches what a non-streaming response would carry.

Also: SseEventDecoder now drops the `data: [DONE]` end-of-stream sentinel that OpenAI Chat Completions emits. It's a wire-level marker, not a JSON payload, and would otherwise fail JSON parsing.

Tests cover:
- text-only response with usage and finish_reason mapping
- tool_call arguments assembled from string fragments
- content: null preserved when only tool calls streamed
- multiple choices (n > 1 requests) tracked separately
- intermediate null usage chunks don't overwrite real usage

Refs NMF-90.

Signed-off-by: Ajay Thorve <athorve@nvidia.com>
Match the project's existing test layout: every other codec file in crates/core/src/codec/ pulls its tests from crates/core/tests/unit/codec/<name>_tests.rs via #[path]. The streaming codec tests were inlined in src/codec/streaming.rs in the initial commit; move them out to follow the same convention so source files stay slim and all codec test files cluster under tests/unit/codec/.

No behavior change. 283 core lib tests still passing.

Refs NMF-90.

Signed-off-by: Ajay Thorve <athorve@nvidia.com>
Refactors the gateway proxy to use llm_call_execute and
llm_stream_call_execute so the runtime owns LLM start/end events,
codec annotation, and stream lifecycle. Per-route dispatch wires
Anthropic Messages, OpenAI Responses, and OpenAI Chat Completions to
their existing typed codecs and the new streaming codecs landed
earlier in this PR.
A new SessionManager::prepare_gateway_call helper resolves the right
session, opens the agent scope, and computes correlation metadata
without holding the session lock during the upstream HTTP work. The
buffered path returns the captured upstream bytes verbatim; the
streaming path re-encodes parsed events back into SSE frames for the
client (Option B). Upstream connection failures are still surfaced as
502 Bad Gateway via a side channel that captures the original
reqwest error before the runtime collapses it to FlowError::Internal.
Trade-off A vs B for streaming response forwarding:
A. Byte tee: split the raw upstream byte stream between the client
and a side capture that feeds the runtime collector. Bytes pass
through unchanged, but observability hooks see the literal wire
bytes (gzip/identity) rather than parsed events, and the
collector code has to duplicate per-provider SSE framing logic
just for capture.
B. Re-encode (chosen): parse SSE on the upstream side, feed
parsed JSON events to the runtime collector, then re-encode
events as SSE for the client. Adds one parse + one serialize
per chunk, but the runtime sees identical event shapes for
streaming and non-streaming requests, and chunk-level
observability matches the codec's annotated response.
The legacy SessionManager::start_llm / end_llm path is retained behind
cfg(test) so existing correlation unit tests continue to exercise
resolve_llm_owner. Streaming tool-hint extraction is intentionally
deferred because the runtime synthesizes the aggregate response inside
LlmStreamWrapper and does not surface it back to the gateway.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Ajay Thorve <athorve@nvidia.com>
Salonijain27
left a comment
Approved from a dependency point of view
Overview
Adds the Rust `nemo-flow-sidecar` binary and coding-agent integration surface for observing Claude Code, Codex, Cursor, and Hermes agent activity through hook ingestion, transparent sidecar execution, passthrough LLM gateway routes, ATIF export, and OpenInference export. The branch now covers the sidecar runtime, installer behavior, agent-specific hook bundles/docs, Hermes-specific shell hook support, best-effort LLM/tool correlation, token metrics extraction, and CI/release plumbing for publishing the sidecar crate and binary artifacts.

Details
- `nemo-flow-sidecar` workspace crate with CLI entry points for `serve`, `run`, `install`, and `hook-forward`.
- `NEMO_FLOW_SIDECAR_URL` handling for ephemeral sidecar runs.
- `integrations/coding-agents/`; Hermes is configured through the sidecar installer rather than a static bundle.

Validation recorded for this PR:
- `cargo test -p nemo-flow-sidecar`
- `PATH="/opt/homebrew/bin:$PATH" /opt/homebrew/bin/just docs`
- `cargo fmt --all`
- `cargo test -p nemo-flow-sidecar launcher::tests::run_starts_sidecar_injects_env_and_returns_agent_exit_code`
- `just test-rust`, `just docs`, and `cargo clippy -p nemo-flow-sidecar --all-targets -- -D warnings`.
- `lychee` skipped because it still fails on unrelated generated Rust API doc links in `docs/reference/api/rust/_generated/...`; cargo fmt, cargo clippy, and cargo check passed with PyO3 pointed at Python 3.12.
- `f5f9a385`, which includes the follow-up fix commit `23315c09` plus the current `main` merge; GitHub CI is rerunning for this head.

Where should the reviewer start?
Start with `crates/sidecar/src/session.rs` for lifecycle normalization and export behavior, then `crates/sidecar/src/gateway.rs` and `crates/sidecar/src/server.rs` for the HTTP/gateway surface. For installation and user-facing behavior, review `crates/sidecar/src/installer.rs`, `docs/integrate-frameworks/coding-agent-sidecar.md`, and `integrations/coding-agents/README.md`.

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)