feat: add coding-agent sidecar integrations #63

willkill07 wants to merge 33 commits into NVIDIA:main
Conversation
Signed-off-by: Will Killian <wkillian@nvidia.com>
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in your CodeRabbit settings or use review commands to manage reviews.
Walkthrough

Introduces a new nemo-flow-sidecar crate (server, adapters, gateway, session manager), an installer/launcher CLI, Python llm_call_end annotation/codec support, CI/publishing and Codecov updates, ATIF exporter improvements, extensive tests, docs, and packaged coding-agent integration bundles.

Changes: sidecar crate, API, CI, tests, and docs.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Agent as Coding Agent
    participant Sidecar as Sidecar Server
    participant Upstream as Provider API
    participant Sess as SessionManager
    Agent->>Sidecar: POST /hooks/{agent} (hook JSON)
    Sidecar->>Sess: apply_events(normalized)
    Agent->>Sidecar: POST /v1/chat/completions
    Sidecar->>Sess: start_llm(request, headers)
    Sidecar->>Upstream: forward request
    Upstream-->>Sidecar: response/stream
    Sidecar->>Sess: end_llm(response/meta)
    Sidecar-->>Agent: proxied response/vendor hook reply
```
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
Signed-off-by: Will Killian <wkillian@nvidia.com>
- Added Hermes as a first-class sidecar agent: `--agent hermes`, config support, command inference for `hermes` / `hermes-agent`, and `AgentKind::Hermes`.
- Added a dedicated Hermes hook endpoint: `/hooks/hermes` now routes Hermes shell-hook payloads into the sidecar session manager.
- Added a Hermes adapter: maps Hermes lifecycle, tool, and subagent shell hooks into normalized NeMo Flow sidecar events.
- Added Hermes installer support: `nemo-flow-sidecar install hermes` writes/merges `.hermes/config.yaml` hook config using YAML instead of JSON/TOML.
- Added dynamic sidecar URL handling for Hermes: installed Hermes hooks prefer the runtime `NEMO_FLOW_SIDECAR_URL`, so `nemo-flow-sidecar run --agent hermes` can use an ephemeral sidecar URL without reinstalling hooks.
- Preserved Hermes hook consent semantics: the runner exports the dynamic sidecar URL but does not auto-enable `HERMES_ACCEPT_HOOKS`.
- Improved ATIF LLM request fidelity: ATIF user steps now preserve the full unwrapped LLM request payload under `extra.llm_request`.
- Improved ATIF token metrics extraction: supports provider-native usage objects in addition to NeMo Flow `token_usage`, including cached-token variants from OpenAI/Anthropic-style payloads.
- Expanded shared adapter extraction: adds support for Hermes-friendly payload fields like `task_id`, `parent_session_id`, `extra.tool_call_id`, and `extra.result`.
- Added test coverage: new/updated tests cover Hermes adapter mapping, config parsing, installer generation/merge behavior, runtime launcher behavior, server hook response shape, and ATIF request/usage export.

Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Signed-off-by: Will Killian <wkillian@nvidia.com>
Force-pushed 9594050 to 841d583 (compare)
Signed-off-by: GSD Agent <bbednarski@nvidia.com>
Signed-off-by: Will Killian <wkillian@nvidia.com>
Force-pushed 173bca9 to 1d896bf (compare)
Signed-off-by: Will Killian <wkillian@nvidia.com>
Signed-off-by: Will Killian <2007799+willkill07@users.noreply.github.com>
…egrations' into wkk_coding-agent-sidecar-integrations

Conflicts:
- crates/sidecar/src/model.rs
- crates/sidecar/src/session.rs
- crates/sidecar/tests/coverage/installer_tests.rs
- crates/sidecar/tests/coverage/session_tests.rs
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Force-pushed bf50036 to 7879885 (compare)
…decar-integrations

Signed-off-by: Will Killian <wkillian@nvidia.com>

Conflicts:
- ATTRIBUTIONS-Rust.md
- Cargo.lock
Actionable comments posted: 15
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
codecov.yml (1)
46-67: ⚠️ Potential issue | 🔴 Critical | ⚖️ Poor tradeoff

Duplicate path configuration in coverage components.

The path `crates/sidecar/src` appears in both the `rust_runtime` component (line 51) and the new `sidecar_runtime` component (line 61). This duplication will cause the same source files to be tracked under two separate components, which can lead to confusing coverage reports and potentially inflated metrics. Choose one of the following:

- If the sidecar should be part of the unified Rust runtime component, remove the standalone `sidecar_runtime` component (lines 58-67) and keep only the path addition to `rust_runtime`.
- If the sidecar should be tracked separately, remove `crates/sidecar/src` from the `rust_runtime` paths (line 51) and keep only the `sidecar_runtime` component.

Proposed fix: track the sidecar separately.

```diff
 - component_id: rust_runtime
   name: Rust Runtime
   paths:
     - "crates/core/src"
     - "crates/adaptive/src"
-    - "crates/sidecar/src"
   statuses:
     - type: project
       target: 95%
       threshold: 0.5%
       base: auto
       if_ci_failed: error
 - component_id: sidecar_runtime
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@codecov.yml` around lines 46-67: the coverage config duplicates the path `crates/sidecar/src` in component_id `rust_runtime` and `sidecar_runtime`; to track the sidecar separately, remove `crates/sidecar/src` from the paths array of the `rust_runtime` component (the entry under component_id `rust_runtime`) and leave the `sidecar_runtime` component (component_id `sidecar_runtime`) intact so only `sidecar_runtime` owns that path.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/ci-path-filters.yml:
- Line 15: Remove the 'integrations/coding-agents/**' entry from the shared
section of the CI path filters: locate the 'shared' array and delete the
'integrations/coding-agents/**' line so documentation/config changes don't
trigger shared-component CI; leave the additions under the 'docs' and 'rust'
sections intact.
In `@crates/python/src/py_api/mod.rs`:
- Around line 644-650: The annotated-response decoding is happening before the
sanitized end-path and currently swallows codec errors with .ok(); move the
response codec decoding (py_llm_response_codec /
codec.decode_response(&response_json)) into the sanitized end-path (after
core_llm_api::llm_call_end or after you sanitize response_json) so annotations
are derived from the already-sanitized payload, and stop dropping
errors—propagate decode failures back to Python instead of using .ok() (e.g.,
return Err or ? from py_annotated_llm_response decoding) so callers see codec
decode errors.
In `@crates/sidecar/src/error.rs`:
- Around line 36-50: The match in into_response consumes self causing a
use-after-move when calling self.to_string(); fix by computing let message =
self.to_string() before the match and then match on &self (or match self by
reference) to get the StatusCode variants (StatusCode::BAD_REQUEST, BAD_GATEWAY,
INTERNAL_SERVER_ERROR) without moving; then build the body with
Json(json!({"error":{"message": message, "type":"nemo_flow_sidecar_error"}}))
and return the response.
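The pattern the comment describes can be sketched with plain std types; `SidecarError`, its variants, and the numeric statuses below are assumptions standing in for the crate's actual error enum and axum's `StatusCode`:

```rust
use std::fmt;

// Hypothetical stand-in for the crate's SidecarError.
#[derive(Debug)]
enum SidecarError {
    InvalidPayload(String),
    Upstream(String),
    Internal(String),
}

impl fmt::Display for SidecarError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SidecarError::InvalidPayload(m) => write!(f, "invalid payload: {m}"),
            SidecarError::Upstream(m) => write!(f, "upstream error: {m}"),
            SidecarError::Internal(m) => write!(f, "internal error: {m}"),
        }
    }
}

impl SidecarError {
    // Compute the message first, then match by reference, so `self`
    // is never moved before `to_string` runs.
    fn into_parts(self) -> (u16, String) {
        let message = self.to_string();
        let status = match &self {
            SidecarError::InvalidPayload(_) => 400,
            SidecarError::Upstream(_) => 502,
            SidecarError::Internal(_) => 500,
        };
        (status, message)
    }
}

fn main() {
    let (status, message) = SidecarError::InvalidPayload("bad json".into()).into_parts();
    assert_eq!(status, 400);
    assert!(message.contains("bad json"));
}
```

In the real handler the `(status, message)` pair would feed the `Json(json!({...}))` body; the ordering is the whole fix.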
In `@crates/sidecar/src/gateway.rs`:
- Around line 66-68: The current call axum::body::to_bytes(body, usize::MAX)
reads an unbounded request body; change it to use a fixed maximum (e.g., const
MAX_BODY_BYTES: usize = 10 * 1024 * 1024) and call axum::body::to_bytes(body,
MAX_BODY_BYTES) so body_bytes is limited, keeping the existing map_err to
SidecarError::InvalidPayload(error.to_string()) to translate limit/extract
errors; pick a limit in the 10–50 MiB range and name the constant (e.g.,
MAX_BODY_BYTES) so it’s easy to adjust.
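The bounded-read behavior can be illustrated with std I/O alone; `read_bounded` is a hypothetical helper mirroring what a capped `axum::body::to_bytes(body, MAX_BODY_BYTES)` call enforces:

```rust
use std::io::{self, Read};

const MAX_BODY_BYTES: u64 = 10 * 1024 * 1024; // 10 MiB cap, per the review's suggested range

// Read at most `limit` bytes; fail if the source exceeds it.
fn read_bounded<R: Read>(reader: R, limit: u64) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    // Take one extra byte so "over the limit" is distinguishable from "exactly at it".
    reader.take(limit + 1).read_to_end(&mut buf)?;
    if buf.len() as u64 > limit {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "body exceeds size limit"));
    }
    Ok(buf)
}

fn main() {
    let small = read_bounded(&b"hello"[..], MAX_BODY_BYTES).unwrap();
    assert_eq!(small, b"hello");
    assert!(read_bounded(&[0u8; 32][..], 16).is_err());
}
```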
In `@crates/sidecar/src/installer.rs`:
- Around line 289-296: The current use of
std::fs::read_to_string(...).unwrap_or_default() in the PlannedFile creation
(the call feeding merge_codex_config) silently treats any I/O error as an empty
config; change this to only treat NotFound as an empty string and return the
error for any other failure. Replace the unwrap_or_default behavior so that
before calling merge_codex_config you attempt to read the file and: if Ok(s) use
s, if Err(e) and e.kind() == std::io::ErrorKind::NotFound use String::new(),
otherwise propagate the error (convert to the function's error type) so the
installer fails closed; update the code paths around merge_codex_config and
PlannedFile accordingly.
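The "only NotFound is empty" rule above is a small, self-contained pattern; `read_existing_config` is a hypothetical name sketching what would feed `merge_codex_config`:

```rust
use std::fs;
use std::io;
use std::path::Path;

// Treat only a missing file as an empty existing config; surface every
// other I/O error (permissions, disk faults) so the installer fails closed.
fn read_existing_config(path: &Path) -> io::Result<String> {
    match fs::read_to_string(path) {
        Ok(contents) => Ok(contents),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(String::new()),
        Err(e) => Err(e),
    }
}

fn main() {
    let missing = Path::new("definitely-missing-config.toml");
    assert_eq!(read_existing_config(missing).unwrap(), "");
}
```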
In `@crates/sidecar/src/server.rs`:
- Around line 64-70: The shared HTTP client is created with Client::new() in
router (AppState.http) and lacks timeouts; replace it with a configured
reqwest::Client built via Client::builder() setting connect_timeout and overall
timeout (and optionally read timeout) before .build(), using values from
SidecarConfig (e.g., config.request_timeout or similar) so the shared client
does not hang indefinitely when proxying to upstream LLM providers; update the
router function to construct the client via
Client::builder().connect_timeout(...).timeout(...).build() and assign that to
AppState.http.
In `@crates/sidecar/src/session.rs`:
- Around line 147-154: apply_events is creating a new Session for any unknown
session_id (via sessions.entry(...).or_insert_with(|| Session::new(...))) which
causes terminal-only retries to recreate sessions after cleanup; change the
logic so you do not allocate a new Session when the incoming event is a
terminal/cleanup-only hook (i.e., detect terminal events via the event type or a
helper like event.is_terminal() or similar) — only insert Session::new when the
event is a genuine start/agent-scoped event, otherwise skip/continue and do not
call Session::new or session.apply; adjust the code paths around
sessions.entry/session.apply and any calls to ensure_agent_started so terminal
retries are ignored instead of recreating sessions.
- Around line 274-306: The LLM handle created in start_llm is never recorded in
the session's registry, so end_agent only draining self.llms can miss gateway
LLM events; after creating the ActiveLlm (the value constructed with stack,
handle, session_id, owner_subagent_id) persist it into self.llms (e.g.
insert/push keyed by session/owner or into the same collection end_agent drains)
and ensure the lifecycle is symmetric by removing it when the LLM completes;
update the start_llm function (and the other analogous LLM-start sites that call
llm_call/LlmCallParams) to register the ActiveLlm into self.llms immediately
after construction so end_agent will see and clean up in-flight gateway LLMs.
- Around line 1050-1060: The write_atif function uses an externally controlled
session_id directly in directory.join(format!("{session_id}.atif.json")),
enabling path traversal; sanitize/validate session_id in write_atif: ensure it
contains no path separators ('/' or '\\'), no components like "." or "..", not
empty, and contains only a safe whitelist of filename characters (e.g.,
alphanumerics, '-', '_'); reject or canonicalize any session_id that fails
validation, or fallback to an encoded/sanitized filename (e.g., percent-encode
or use sanitize_filename) before building path and writing; update write_atif to
perform this check/transform on session_id prior to calling directory.join to
guarantee the export cannot escape the configured directory.
- Around line 1147-1159: Filter OpenAI response items by their type before
treating them as tool calls: in the block that obtains response.get("output")
and iterates items (the code that calls push_tool_hint), check each item's
"type" field and only call push_tool_hint for tool-call types (e.g.,
"function_call" or whatever tool-specific type your OpenAI responses use),
mirroring the approach in collect_anthropic_tool_hints(); this prevents non-tool
messages that happen to have ids from generating hints and contaminating pending
hints.
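The path-traversal guard for `write_atif` can be sketched as a whitelist check before `directory.join`; `is_safe_session_id` is a hypothetical helper, not the crate's API:

```rust
// Accept only filename-safe session ids before building the export path,
// so `{session_id}.atif.json` cannot escape the configured directory.
fn is_safe_session_id(id: &str) -> bool {
    // The whitelist excludes '/', '\\', and '.' entirely, which also
    // rules out "." and ".." components.
    !id.is_empty()
        && id
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
}

fn main() {
    assert!(is_safe_session_id("sess-01_abc"));
    assert!(!is_safe_session_id("../../etc/passwd"));
    assert!(!is_safe_session_id("a/b"));
    assert!(!is_safe_session_id(""));
}
```

A rejected id would either abort the export or be percent-encoded, as the comment suggests.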
In `@crates/sidecar/tests/coverage/gateway_tests.rs`:
- Around line 269-326: The test streaming_llm_guard_closes_on_drop has a race
because it sleeps 100ms after dropping StreamingLlmGuard::new and may not wait
for async cleanup; replace the fixed sleep with a robust wait: poll (with a
small sleep/backoff) for the expected side-effect (e.g., existence/contents of
temp.path().join("drop-session.atif.json")) for up to a sane timeout (e.g.,
several seconds) and only then call sessions.apply_events and assert; this
avoids changing production code and makes the test deterministic under CI load.
Include a short comment why polling with a timeout is used.
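The poll-with-timeout idea translates to a small generic helper; `wait_for` is a hypothetical name, shown here with std threads rather than the test's tokio runtime:

```rust
use std::path::Path;
use std::thread;
use std::time::{Duration, Instant};

// Poll for a side effect instead of sleeping a fixed interval: under CI
// load a fixed 100 ms sleep can fire before async cleanup finishes, while
// a bounded poll stays deterministic without slowing the happy path.
fn wait_for<F: Fn() -> bool>(condition: F, timeout: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    while Instant::now() < deadline {
        if condition() {
            return true;
        }
        thread::sleep(Duration::from_millis(10));
    }
    condition()
}

fn main() {
    // Stand-in for waiting on `temp.path().join("drop-session.atif.json")`.
    let start = Instant::now();
    assert!(wait_for(
        || start.elapsed() > Duration::from_millis(50),
        Duration::from_secs(2)
    ));
    // A condition that never becomes true times out and returns false.
    assert!(!wait_for(
        || Path::new("never-created.atif.json").exists(),
        Duration::from_millis(50)
    ));
}
```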
In `@crates/sidecar/tests/coverage/server_tests.rs`:
- Around line 364-370: The spawned axum server task (created with tokio::spawn
in the block using TcpListener::bind and axum::serve) is left running for the
test lifetime; change the spawn to capture the returned tokio::task::JoinHandle
(e.g., let handle = tokio::spawn(...)) and ensure it is aborted/awaited at test
end (or wrap the handle in a small RAII/drop type that calls handle.abort() on
drop) or alternatively use a cancellation mechanism (like a CancellationToken
passed into the axum service) and trigger it before the test exits so the
axum::serve task is cleanly shut down; locate the spawn of axum::serve and
replace it with a cancellable/abortable handle for proper shutdown.
In `@integrations/coding-agents/codex/hooks/hooks.json`:
- Line 1: Add the repo-approved SPDX metadata to the JSON manifest by inserting
the standard SPDX field for JSON configs at the top of hooks.json (e.g., a
top-level "spdx" or "spdxLicense" metadata key containing
"SPDX-License-Identifier": "Apache-2.0" per the project's JSON SPDX convention);
update the manifest in integrations/coding-agents/codex/hooks/hooks.json to
include that SPDX entry (or follow the documented exception path if JSON SPDX
headers are disallowed) so the file explicitly records the Apache-2.0 license.
In `@integrations/coding-agents/cursor/.cursor/hooks.json`:
- Line 1: Add the project SPDX metadata to the top-level of the new Cursor hook
JSON by inserting the project-approved SPDX field (e.g., a top-level "spdx" or
"SPDX-License-Identifier" key) with value "Apache-2.0" so the file
integrations/coding-agents/cursor/.cursor/hooks.json contains explicit SPDX
licensing metadata; ensure the key is the first entry in the JSON object so
tools will detect it and follow the repository's SPDX convention for JSON config
files.
In `@python/tests/test_builtin_codecs.py`:
- Around line 188-269: Add a negative-path test that calls llm.call_end with a
malformed annotation/codec input and asserts observability doesn't silently
degrade: register a subscriber (like in
test_manual_call_end_response_codec_attaches_annotation), invoke llm.call to get
a handle, then call llm.call_end with either an invalid annotated_response
(e.g., a non-dict value) or a response_codec/OpenAIChatCodec() plus a malformed
response payload, and assert that an error event is emitted (kind == "error" and
category == "llm") and/or the end scope event still appears but its
annotated_response is None; use the existing patterns
(subscribers.register/deregister, llm.call, llm.call_end) and name the test to
reflect the failure case.
---
Outside diff comments:
In `@codecov.yml`:
- Around line 46-67: The coverage config duplicates the path
"crates/sidecar/src" in component_id rust_runtime and sidecar_runtime; to track
the sidecar separately, remove "crates/sidecar/src" from the paths array of the
rust_runtime component (the entry under component_id rust_runtime) and leave the
sidecar_runtime component (component_id sidecar_runtime) intact so only
sidecar_runtime owns that path.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: d3c184c5-8103-4b64-b847-abd89d5ffff8
⛔ Files ignored due to path filters (1)
`Cargo.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (59)
- .github/ci-path-filters.yml
- .github/workflows/ci.yaml
- .github/workflows/ci_rust.yml
- .gitlab-ci.yml
- ATTRIBUTIONS-Rust.md
- Cargo.toml
- README.md
- RELEASING.md
- codecov.yml
- crates/core/src/observability/atif.rs
- crates/core/tests/unit/atif_tests.rs
- crates/python/src/py_api/mod.rs
- crates/python/tests/coverage/py_api_coverage_tests.rs
- crates/sidecar/Cargo.toml
- crates/sidecar/src/adapters/claude_code.rs
- crates/sidecar/src/adapters/codex.rs
- crates/sidecar/src/adapters/cursor.rs
- crates/sidecar/src/adapters/hermes.rs
- crates/sidecar/src/adapters/mod.rs
- crates/sidecar/src/config.rs
- crates/sidecar/src/error.rs
- crates/sidecar/src/gateway.rs
- crates/sidecar/src/installer.rs
- crates/sidecar/src/launcher.rs
- crates/sidecar/src/main.rs
- crates/sidecar/src/model.rs
- crates/sidecar/src/server.rs
- crates/sidecar/src/session.rs
- crates/sidecar/tests/cli_tests.rs
- crates/sidecar/tests/coverage/adapters_tests.rs
- crates/sidecar/tests/coverage/config_tests.rs
- crates/sidecar/tests/coverage/gateway_tests.rs
- crates/sidecar/tests/coverage/installer_tests.rs
- crates/sidecar/tests/coverage/launcher_tests.rs
- crates/sidecar/tests/coverage/server_tests.rs
- crates/sidecar/tests/coverage/session_tests.rs
- docs/getting-started/installation.md
- docs/getting-started/rust.md
- docs/index.md
- docs/integrate-frameworks/about.md
- docs/integrate-frameworks/coding-agent-claude-code.md
- docs/integrate-frameworks/coding-agent-codex.md
- docs/integrate-frameworks/coding-agent-cursor.md
- docs/integrate-frameworks/coding-agent-hermes.md
- docs/integrate-frameworks/coding-agent-sidecar.md
- docs/reference/api/rust/index.md
- integrations/coding-agents/README.md
- integrations/coding-agents/claude-code/.claude-plugin/plugin.json
- integrations/coding-agents/claude-code/README.md
- integrations/coding-agents/claude-code/hooks/hooks.json
- integrations/coding-agents/codex/.codex-plugin/plugin.json
- integrations/coding-agents/codex/README.md
- integrations/coding-agents/codex/hooks/hooks.json
- integrations/coding-agents/cursor/.cursor/hooks.json
- integrations/coding-agents/cursor/README.md
- justfile
- python/nemo_flow/_native.pyi
- python/nemo_flow/llm.py
- python/tests/test_builtin_codecs.py
Signed-off-by: Will Killian <wkillian@nvidia.com>
Actionable comments posted: 10
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/sidecar/src/gateway.rs`:
- Around line 503-558: The proxy must also strip any header names listed in the
Connection header; update response_headers to read the "connection" header from
the provided HeaderMap, parse its comma-separated tokens (trim and lowercase),
and skip any header whose name matches one of those tokens in addition to
existing is_hop_by_hop checks; similarly, ensure request-path filters
(should_forward_request_header and should_record_header) consider
Connection-listed names as hop-by-hop (you may pass the request's HeaderMap or a
precomputed set of connection tokens into these checks), and perform comparisons
case-insensitively when converting tokens to HeaderName before excluding them.
- Around line 537-542: The should_record_header function currently allows
sensitive headers like cookie and generic API-key aliases to be recorded; update
should_record_header (and keep its dependence on should_forward_request_header)
to also exclude http::header::COOKIE and to reject common API-key header names
in a case-insensitive way (e.g., "api-key", "apikey", and any other known
aliases you want blocked in addition to "x-api-key" and "anthropic-api-key") so
those values never appear in LlmRequest.headers or downstream exports.
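Both gateway fixes above reduce to name filtering; the sketch below uses plain strings and a `HashSet` rather than `http::HeaderMap`, and the helper names are simplified stand-ins for the crate's `should_record_header` / `should_forward_request_header`:

```rust
use std::collections::HashSet;

// Parse the Connection header's comma-separated tokens into a lowercase
// set; any header named there is hop-by-hop and must be dropped.
fn connection_tokens(value: &str) -> HashSet<String> {
    value
        .split(',')
        .map(|t| t.trim().to_ascii_lowercase())
        .filter(|t| !t.is_empty())
        .collect()
}

const STATIC_HOP_BY_HOP: &[&str] = &[
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
];

// Sensitive names that should never be recorded, per the review.
const NEVER_RECORD: &[&str] = &[
    "cookie", "authorization", "x-api-key", "anthropic-api-key", "api-key", "apikey",
];

fn should_record_header(name: &str, conn: &HashSet<String>) -> bool {
    let lower = name.to_ascii_lowercase();
    !STATIC_HOP_BY_HOP.contains(&lower.as_str())
        && !NEVER_RECORD.contains(&lower.as_str())
        && !conn.contains(&lower)
}

fn main() {
    let conn = connection_tokens("Keep-Alive, X-Custom-Hop");
    assert!(!should_record_header("x-custom-hop", &conn)); // Connection-listed
    assert!(!should_record_header("Cookie", &conn));       // sensitive
    assert!(!should_record_header("API-Key", &conn));      // API-key alias
    assert!(should_record_header("content-type", &conn));
}
```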
In `@crates/sidecar/src/installer.rs`:
- Around line 655-663: The current write_planned_file function truncates the
target in place; change it to write atomically by creating a temp file in the
same directory (use file.path.parent()), write file.contents into that temp
file, flush and persist it, then std::fs::rename the temp over file.path; keep
the existing backup logic (backup_path(&file.path)?) before replacing, and on
success remove the backup if needed; also preserve permissions/metadata by
reading metadata from file.path (if it exists) and applying them to the temp
file before rename, and ensure the temp file is cleaned up on failure.
- Around line 372-440: The generated hook commands break on Windows because
shell_quote() uses POSIX single-quote escaping; update quoting to be
platform-aware: keep the current POSIX single-quote logic for non-Windows, and
on Windows implement proper double-quote escaping (wrap in "..." and escape
inner quotes and special cmd/powershell metacharacters) so arguments like
--sidecar-url, paths and JSON are usable in cmd.exe and PowerShell; change
shell_quote() accordingly and ensure hook_command (and helpers
push_optional_path/push_optional_gateway_mode) continue to call the new
shell_quote signature so all assembled args are quoted per-platform.
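The write-temp-then-rename step of the atomic-write suggestion can be shown with std alone; `write_atomically` is a hypothetical helper, and this sketch omits the backup and permission-preservation handling the comment also requires:

```rust
use std::fs;
use std::io::{self, Write};
use std::path::Path;

// Write to a sibling temp file, flush, then rename over the target so a
// crash mid-write never leaves a truncated config behind. On POSIX,
// rename is atomic when both paths are on the same filesystem.
fn write_atomically(target: &Path, contents: &str) -> io::Result<()> {
    let dir = target.parent().unwrap_or_else(|| Path::new("."));
    // Illustrative fixed temp name; a real installer would use a unique one.
    let tmp = dir.join(".nemo-flow-sidecar.tmp");
    {
        let mut f = fs::File::create(&tmp)?;
        f.write_all(contents.as_bytes())?;
        f.sync_all()?;
    }
    fs::rename(&tmp, target)
}

fn main() {
    let target = std::env::temp_dir().join("atomic-write-demo.toml");
    write_atomically(&target, "key = \"value\"\n").unwrap();
    assert_eq!(fs::read_to_string(&target).unwrap(), "key = \"value\"\n");
    let _ = fs::remove_file(&target);
}
```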
In `@crates/sidecar/src/launcher.rs`:
- Around line 474-479: The product name is mis-capitalized in the Codex provider
config string; update the display name inside codex_sidecar_provider_config by
changing "Nemo Flow OpenAI" to "NeMo Flow OpenAI" in the format string used to
build the provider entry (and search for other occurrences of the same literal
to keep naming consistent in user-facing config). Ensure the change is made in
the codex_sidecar_provider_config function's format(...) call that constructs
model_providers.nemo-flow-openai.
In `@crates/sidecar/src/session.rs`:
- Around line 175-190: The code currently falls back to a constant synthetic id
("gateway-gateway") when headers lack session_id and there isn't exactly one
hook session, causing shared state; change the fallback to generate a unique
per-request synthetic id (e.g., use a UUID or timestamp) when session_id is None
and single_active_session_id(&sessions) returns None, create the Session via
Session::new as now but mark it as synthetic (or track it locally), and after
calling session.start_llm(start).await ensure cleanup: if the session was
created as a synthetic gateway session and there is no hook-owned state left,
remove that entry from the sessions map (or schedule removal) once the ActiveLlm
ends so synthetic sessions don't persist and leak hints; key symbols: session_id
variable, single_active_session_id, Session::new, session.start_llm, and the
sessions (self.inner) map.
- Around line 147-157: The current guard "if event.is_terminal() &&
!sessions.contains_key(&session_id) { continue; }" drops terminal events like
LlmEnded and ToolEnded for new sessions so their post-only hooks never run;
change the condition to only skip teardown-only retry events instead of all
terminal events — e.g. narrow the check to skip only when the event is a
teardown-retry (or explicitly exclude LlmEnded and ToolEnded) so that terminal
Llm/Tool events still create a Session via Session::new(...) and reach
session.apply(...) (which triggers end_hook_llm / end_tool). Ensure you update
the conditional around event.is_terminal() and
sessions.contains_key(&session_id) to allow LlmEnded and ToolEnded to proceed.
In `@integrations/coding-agents/codex/hooks/hooks.json`:
- Around line 3-149: All hook entries currently invoke the nemo-flow-sidecar
binary directly (the "command" values like "nemo-flow-sidecar hook-forward
codex" used in hooks such as SessionStart, UserPromptSubmit, PreToolUse,
PostToolUse, PostToolUseFailure, AfterAgentResponse, AfterAgentThought,
SubagentStart/Stop, Notification, Stop, PreCompact, SessionEnd), which breaks
optionality when the binary is missing; change each hook to call a small
launcher/shim (e.g., nemo-flow-sidecar-launcher or nemo-flow-sidecar-shim) that
attempts to exec the real nemo-flow-sidecar but exits successfully (0) and
no-ops when the binary is not present so Codex hooks preserve native behavior.
Ensure every "command" string in the JSON is updated to the shim name and that
the shim implements the fail-open behavior.
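The fail-open shim's core logic is tiny; since the project ships Rust binaries, a Rust sketch is shown here (the shim name and arguments in the comment are the review's suggestion, and `run_fail_open` is a hypothetical function):

```rust
use std::process::Command;

// Try to run the real sidecar; if the binary is missing or not
// executable, exit 0 so the agent's native hooks keep working.
fn run_fail_open(binary: &str, args: &[&str]) -> i32 {
    match Command::new(binary).args(args).status() {
        Ok(status) => status.code().unwrap_or(0),
        // Spawn failure (binary not found): no-op success.
        Err(_) => 0,
    }
}

fn main() {
    // A binary that certainly does not exist must still "succeed".
    let code = run_fail_open(
        "nemo-flow-sidecar-that-is-not-installed",
        &["hook-forward", "codex"],
    );
    assert_eq!(code, 0);
}
```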
In `@integrations/coding-agents/codex/README.md`:
- Around line 100-101: Update the user-facing provider name to use the correct
product capitalization: change the name value under the
[model_providers.nemo-flow-openai] section (currently "Nemo Flow OpenAI") to
"NeMo Flow OpenAI" so the README/config matches the product branding.
In `@integrations/coding-agents/cursor/.cursor/hooks.json`:
- Around line 3-174: The hooks call the nemo-flow-sidecar binary unconditionally
(every "command": "nemo-flow-sidecar hook-forward cursor"), causing failures
when the binary is missing; create a fail-open launcher (e.g.,
nemo-flow-sidecar-shim) that checks for the real nemo-flow-sidecar and execs it
if present, otherwise exits successfully (and optionally logs a debug message),
then update all hook entries to invoke the shim (replace "nemo-flow-sidecar
hook-forward cursor" with "nemo-flow-sidecar-shim hook-forward cursor") so the
integration remains optional while preserving original behavior when NeMo Flow
is absent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: c1565467-6e9c-42e1-8045-8e55cf54fca9
📒 Files selected for processing (24)
- .github/ci-path-filters.yml
- codecov.yml
- crates/core/src/api/llm.rs
- crates/core/src/codec/traits.rs
- crates/python/src/py_api/mod.rs
- crates/sidecar/src/error.rs
- crates/sidecar/src/gateway.rs
- crates/sidecar/src/installer.rs
- crates/sidecar/src/launcher.rs
- crates/sidecar/src/model.rs
- crates/sidecar/src/server.rs
- crates/sidecar/src/session.rs
- crates/sidecar/tests/coverage/gateway_tests.rs
- crates/sidecar/tests/coverage/launcher_tests.rs
- crates/sidecar/tests/coverage/server_tests.rs
- crates/sidecar/tests/coverage/session_tests.rs
- docs/integrate-frameworks/coding-agent-codex.md
- integrations/coding-agents/README.md
- integrations/coding-agents/codex/.codex-plugin/plugin.json
- integrations/coding-agents/codex/README.md
- integrations/coding-agents/codex/hooks/hooks.json
- integrations/coding-agents/cursor/.cursor/hooks.json
- python/nemo_flow/llm.py
- python/tests/test_builtin_codecs.py
📜 Review details
🧰 Additional context used
📓 Path-based instructions (45)
**/{integrations,integration,*-integration}/**
📄 CodeRabbit inference engine (.agents/skills/contribute-integration/SKILL.md)
**/{integrations,integration,*-integration}/**: Keep NeMo Flow optional in framework integrations
Preserve the framework's original behavior when NeMo Flow is absent
Wrap tool and LLM paths at the correct framework boundary
Integration pattern must follow `docs/integrate-frameworks/adding-scopes.md`
Files:
`integrations/coding-agents/codex/hooks/hooks.json`, `integrations/coding-agents/codex/README.md`, `integrations/coding-agents/README.md`
**/*.{rs,py,js,ts,go,sh,toml,yaml,yml,json,md,rst}
📄 CodeRabbit inference engine (AGENTS.md)
Keep SPDX headers on source, docs, scripts, and configuration files. The project is Apache-2.0.
Files:
`integrations/coding-agents/codex/hooks/hooks.json`, `integrations/coding-agents/codex/README.md`, `python/nemo_flow/llm.py`, `integrations/coding-agents/README.md`, `crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`, `crates/sidecar/src/server.rs`, `docs/integrate-frameworks/coding-agent-codex.md`, `codecov.yml`, `crates/sidecar/src/model.rs`, `python/tests/test_builtin_codecs.py`, `crates/python/src/py_api/mod.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`, `crates/sidecar/src/installer.rs`, `crates/sidecar/tests/coverage/session_tests.rs`, `crates/sidecar/src/launcher.rs`, `crates/sidecar/src/error.rs`, `crates/sidecar/tests/coverage/gateway_tests.rs`, `crates/sidecar/tests/coverage/server_tests.rs`, `crates/sidecar/src/gateway.rs`, `crates/sidecar/src/session.rs`
**/*.{md,rst,html,txt}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-brand-terminology.md)
**/*.{md,rst,html,txt}: Always spell `NVIDIA` in all caps. Do not use `Nvidia`, `nvidia`, `nVidia`, `nVIDIA`, or `NV`.
Use `an NVIDIA` before a noun because the name starts with an 'en' sound.
Do not add a registered trademark symbol after `NVIDIA` when referring to the company.
Use trademark symbols with product names only when the document type or legal guidance requires them.
Verify official capitalization, spacing, and hyphenation for product names.
Precede NVIDIA product names with `NVIDIA` on first mention when it is natural and accurate.
Do not rewrite product names for grammar or title-case rules.
Preserve third-party product names according to the owner's spelling.
Include the company name and full model qualifier on first use when it helps identify the model.
Preserve the official capitalization and punctuation of model names.
Use shorter family names only after the full name is established.
Spell out a term on first use and put the acronym in parentheses unless the acronym is widely understood by the intended audience.
Use the acronym on later mentions after it has been defined.
For long documents, reintroduce the full term if readers might lose context.
Form plurals of acronyms with `s`, not an apostrophe, such as `GPUs`.
In headings, common acronyms can remain abbreviated. Spell out the term in the first or second sentence of the body.
Common terms such as `CPU`, `GPU`, `PC`, `API`, and `UI` usually do not need to be spelled out for developer audiences.
Files:
`integrations/coding-agents/codex/README.md`, `integrations/coding-agents/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{md,rst,html}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-brand-terminology.md)
Link the first mention of a product name when the destination helps the reader.
Files:
`integrations/coding-agents/codex/README.md`, `integrations/coding-agents/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{md,rst,txt}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-guide.md)
**/*.{md,rst,txt}: Spell `NVIDIA` in all caps. Do not use `Nvidia`, `nvidia`, or `NV`.
Format commands, code elements, expressions, package names, file names, and paths as inline code.
Use descriptive link text. Avoid raw URLs and weak anchors such as 'here' or 'read more.'
Use title case consistently for technical documentation headings.
Introduce code blocks, lists, tables, and images with complete sentences.
Write procedures as imperative steps. Keep steps parallel and split long procedures into smaller tasks.
Prefer active voice, present tense, short sentences, contractions, and plain English.
Use *can* for possibility and reserve *may* for permission.
Use *after* for temporal relationships instead of *once*.
Prefer *refer to* over *see* when the wording points readers to another resource.
Avoid culture-specific idioms, unnecessary Latinisms, jokes, and marketing exaggeration in technical documentation.
Spell out months in body text, avoid ordinal dates, and use clear time zones.
Spell out whole numbers from zero through nine unless they are technical values, parameters, versions, or UI values.
Use numerals for 10 or greater and include commas in thousands.
Do not add trademark symbols to learning-oriented documentation unless the source, platform, or legal guidance explicitly requires them.
Do not add trademark symbols to NeMo Flow learning documentation by default.
Do not rewrite API names, package names, command flags, or code literals for style reasons.
Files:
`integrations/coding-agents/codex/README.md`, `integrations/coding-agents/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{md,markdown,rst}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-technical-docs.md)
**/*.{md,markdown,rst}: Use title case consistently in technical documentation headings
Avoid quotation marks, ampersands, and exclamation marks in headings
Keep product, event, research, and whitepaper names in their official title case
Use title case for table headers
Do not force social-media sentence case into technical docs
Use monospace formatting for code elements, commands, parameters, package names, and expressions
Use monospace formatting for directories, file names, and paths
Use angle brackets inside monospace for variables inside paths, such as `/home/<username>/.login`
Use quotation marks for error messages and strings in documentation
Use bold formatting for UI buttons, menus, fields, and labels in documentation
Use angle brackets between UI labels for menu paths, such as File > Save As
Use italics for new terms on first use in documentation
Use italics for publication titles in documentation
Use plain text formatting for keyboard shortcuts in documentation
Prefer `[NVIDIA/NeMo](link)` format for GitHub repository references over generic phrases like 'the GitHub repo'
Introduce every code block with a complete sentence
Do not make a code block complete the grammar of the previous sentence
Do not continue a sentence after a code block
Use syntax highlighting when the format supports it for code blocks
Avoid the word 'snippet' unless the surrounding docs already use it as a term of art
Keep inline method, function, and class references consistent with nearby docs, omitting empty parentheses for prose readability when no call is shown
Use descriptive anchor text that matches the destination title when possible for links
Avoid raw URLs in running text in documentation
Avoid generic link anchors such as 'here,' 'this page,' and 'read more' in documentation
Include the acronym in link text if a linked term includes an acronym
Do not link long sentences or multiple sentences in documentation
Avoid links that pull readers away from a procedure unles...
Files:
`integrations/coding-agents/codex/README.md`, `integrations/coding-agents/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{html,md}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Include SPDX license headers in HTML and Markdown files using HTML comment syntax
Files:
`integrations/coding-agents/codex/README.md`, `integrations/coding-agents/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/README.md
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Update relevant crate or package README when that surface changed
Relevant package or crate README.md files must be updated when examples or binding guidance changes
Files:
`integrations/coding-agents/codex/README.md`, `integrations/coding-agents/README.md`
**/*.md
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Run Markdown link checking via `lychee` for `README.md`, `CONTRIBUTING.md`, and `docs/` through pre-commit hooks
Files:
`integrations/coding-agents/codex/README.md`, `integrations/coding-agents/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{md,markdown,py,sh,bash,js,ts,java,cpp,go,rust}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Keep package names, repo references, and build commands current in documentation
Files:
`integrations/coding-agents/codex/README.md`, `python/nemo_flow/llm.py`, `integrations/coding-agents/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`, `python/tests/test_builtin_codecs.py`
**/*.{md,markdown,py,sh,bash}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Keep stable user-facing wrappers at scripts/ root in docs and examples; only point at namespaced helper paths when documenting internal maintenance work
Files:
`integrations/coding-agents/codex/README.md`, `python/nemo_flow/llm.py`, `integrations/coding-agents/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`, `python/tests/test_builtin_codecs.py`
**/*.{md,markdown,py,sh,bash,js,ts,example}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Example commands must match current package names and paths
Files:
`integrations/coding-agents/codex/README.md`, `python/nemo_flow/llm.py`, `integrations/coding-agents/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`, `python/tests/test_builtin_codecs.py`
**/*.{md,txt,rst}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/SKILL.md)
**/*.{md,txt,rst}: Ensure commands, package names, file paths, and APIs in documentation are correct and not stale; flag incorrect or outdated information as blocking issues
Ensure examples and procedures in documentation will execute successfully with current APIs and commands
Use consistent user-facing terminology throughout documentation that matches current repo terminology
Capitalize NVIDIA correctly in all documentation and public-facing text
Format code, commands, paths, and filenames as inline code (monospace) in documentation
Use descriptive anchor text for links instead of bare URLs or weak labels like 'here' in documentation
Prefer active voice, present tense, short sentences, and plain English in documentation
Structure documentation procedures as imperative steps that are easy to scan and not too long for a single sequence
Prefer 'after' instead of 'once' for temporal references in documentation
Use 'can' instead of 'may' when describing possibility (rather than permission) in documentation
Avoid ambiguous numeric dates and ordinal dates in documentation body text
Files:
`integrations/coding-agents/codex/README.md`, `integrations/coding-agents/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
{README.md,docs/index.md,**/README.md}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/SKILL.md)
Update entry-point documentation (README.md, docs/index.md, package READMEs, binding-level source READMEs) whenever public behavior changes
Files:
`integrations/coding-agents/codex/README.md`, `integrations/coding-agents/README.md`
**/*.{py,js,ts,tsx,go,rs,md}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
Format changed files with the language-native formatter before the final lint/test pass
Files:
`integrations/coding-agents/codex/README.md`, `python/nemo_flow/llm.py`, `integrations/coding-agents/README.md`, `crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`, `crates/sidecar/src/server.rs`, `docs/integrate-frameworks/coding-agent-codex.md`, `crates/sidecar/src/model.rs`, `python/tests/test_builtin_codecs.py`, `crates/python/src/py_api/mod.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`, `crates/sidecar/src/installer.rs`, `crates/sidecar/tests/coverage/session_tests.rs`, `crates/sidecar/src/launcher.rs`, `crates/sidecar/src/error.rs`, `crates/sidecar/tests/coverage/gateway_tests.rs`, `crates/sidecar/tests/coverage/server_tests.rs`, `crates/sidecar/src/gateway.rs`, `crates/sidecar/src/session.rs`
**/*.py
📄 CodeRabbit inference engine (.agents/skills/test-python-binding/SKILL.md)
Format changed Python wrapper and test files with `uv run ruff format python`
**/*.py: Use Ruff with rule sets E, F, W, I for Python linting
Use Ruff formatter for Python code with line length 120 and double quotes
Use `ty` for Python type checking
Use snake_case naming convention for Python code
Include SPDX license headers in all Python source files using Python comment syntax
Files:
`python/nemo_flow/llm.py`, `python/tests/test_builtin_codecs.py`
**/*.{rs,py,go,js,ts,tsx}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Use `SONAR_IGNORE_START`/`SONAR_IGNORE_END` markers only for documented false positives that cannot be resolved in code; keep ignored blocks small, add explanatory comments, and require reviewer sign-off
Files:
`python/nemo_flow/llm.py`, `crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`, `crates/sidecar/src/server.rs`, `crates/sidecar/src/model.rs`, `python/tests/test_builtin_codecs.py`, `crates/python/src/py_api/mod.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`, `crates/sidecar/src/installer.rs`, `crates/sidecar/tests/coverage/session_tests.rs`, `crates/sidecar/src/launcher.rs`, `crates/sidecar/src/error.rs`, `crates/sidecar/tests/coverage/gateway_tests.rs`, `crates/sidecar/tests/coverage/server_tests.rs`, `crates/sidecar/src/gateway.rs`, `crates/sidecar/src/session.rs`
python/nemo_flow/**/*.py
📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)
Use `snake_case` naming convention for Python API implementations and wrappers
Files:
python/nemo_flow/llm.py
{docs/**,examples/**,crates/adaptive/**,python/nemo_flow/**,go/nemo_flow/**,**/{example,component}.{ts,tsx,js,rs,py,go}}
📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)
Any new adaptive component kind must have documentation, examples, and binding coverage across all supported languages
Files:
`python/nemo_flow/llm.py`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{js,ts,tsx,jsx,py,rs,go,java,c,cpp,h,cc,cxx,cs,rb,php,swift,kt}
📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)
Changed files must be formatted with the language-native formatter
Files:
`python/nemo_flow/llm.py`, `crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`, `crates/sidecar/src/server.rs`, `crates/sidecar/src/model.rs`, `python/tests/test_builtin_codecs.py`, `crates/python/src/py_api/mod.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`, `crates/sidecar/src/installer.rs`, `crates/sidecar/tests/coverage/session_tests.rs`, `crates/sidecar/src/launcher.rs`, `crates/sidecar/src/error.rs`, `crates/sidecar/tests/coverage/gateway_tests.rs`, `crates/sidecar/tests/coverage/server_tests.rs`, `crates/sidecar/src/gateway.rs`, `crates/sidecar/src/session.rs`
python/**/*.py
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
python/**/*.py: If Python language surface changed, always run Python test target even when Rust core did not change
For Python changes, use `uv run ruff format python` to format Python files
Files:
`python/nemo_flow/llm.py`, `python/tests/test_builtin_codecs.py`
**/*.{rs,py,js,ts,tsx,go}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
During iteration, prefer `uv run pre-commit run --files <changed files...>` for targeted validation
Files:
`python/nemo_flow/llm.py`, `crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`, `crates/sidecar/src/server.rs`, `crates/sidecar/src/model.rs`, `python/tests/test_builtin_codecs.py`, `crates/python/src/py_api/mod.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`, `crates/sidecar/src/installer.rs`, `crates/sidecar/tests/coverage/session_tests.rs`, `crates/sidecar/src/launcher.rs`, `crates/sidecar/src/error.rs`, `crates/sidecar/tests/coverage/gateway_tests.rs`, `crates/sidecar/tests/coverage/server_tests.rs`, `crates/sidecar/src/gateway.rs`, `crates/sidecar/src/session.rs`
python/nemo_flow/**/*
⚙️ CodeRabbit configuration file
python/nemo_flow/**/*: Review Python wrapper changes for typed API consistency, contextvars-based scope isolation, async behavior, and parity with the native extension.
Stubs and runtime implementations should stay aligned.
Files:
python/nemo_flow/llm.py
{.github/**/*.{yml,yaml},*.patch,scripts/**/*,*.sh,*.bat,Dockerfile*}
📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)
Update CI configuration, patch files, and build scripts with new functional identifiers after rename operations
Files:
.github/ci-path-filters.yml
{.github/**,.gitlab-ci.yml,.pre-commit-config.yaml,justfile,scripts/**}
⚙️ CodeRabbit configuration file
{.github/**,.gitlab-ci.yml,.pre-commit-config.yaml,justfile,scripts/**}: Review automation changes for reproducibility, pinned versions where appropriate, secret handling, and consistency with the documented validation matrix.
Pay attention to commands that need generated native artifacts, FFI libraries, or platform-specific environment variables.
Files:
.github/ci-path-filters.yml
**/*.rs
📄 CodeRabbit inference engine (.agents/skills/test-ffi-surface/SKILL.md)
**/*.rs: Run `cargo fmt --all` for FFI work as it is Rust work
Run `just test-rust` for FFI validation
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce warnings-as-errors linting
When Rust files changed as part of Python work, run `cargo fmt --all`, `just test-rust`, and `cargo clippy --workspace --all-targets -- -D warnings`
**/*.rs: Run `cargo fmt --all` for Rust code formatting
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce Rust linting with no warnings
Run `just test-rust` as the shared-runtime build/test wrapper for Rust changes
Use Rust snake_case naming convention for Rust code
**/*.rs: Any Rust change must run `just test-rust`
Any Rust change must run `cargo fmt --all`
Any Rust change must run `cargo clippy --workspace --all-targets -- -D warnings`
**/*.rs: Run `cargo fmt --all` for Rust code formatting when Node changes touch Rust files
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce strict linting when Rust files changed as part of Node work
**/*.rs: Always run `just test-rust` when any Rust code changes
Always run `cargo fmt --all` when any Rust code changes
Always run `cargo clippy --workspace --all-targets -- -D warnings` when any Rust code changes
Files:
`crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`, `crates/sidecar/src/server.rs`, `crates/sidecar/src/model.rs`, `crates/python/src/py_api/mod.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`, `crates/sidecar/src/installer.rs`, `crates/sidecar/tests/coverage/session_tests.rs`, `crates/sidecar/src/launcher.rs`, `crates/sidecar/src/error.rs`, `crates/sidecar/tests/coverage/gateway_tests.rs`, `crates/sidecar/tests/coverage/server_tests.rs`, `crates/sidecar/src/gateway.rs`, `crates/sidecar/src/session.rs`
**/*.{rs,go,js,ts,tsx}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Include SPDX license headers in all Rust, Go, JavaScript, and TypeScript source files using C-style comment syntax
Files:
`crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`, `crates/sidecar/src/server.rs`, `crates/sidecar/src/model.rs`, `crates/python/src/py_api/mod.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`, `crates/sidecar/src/installer.rs`, `crates/sidecar/tests/coverage/session_tests.rs`, `crates/sidecar/src/launcher.rs`, `crates/sidecar/src/error.rs`, `crates/sidecar/tests/coverage/gateway_tests.rs`, `crates/sidecar/tests/coverage/server_tests.rs`, `crates/sidecar/src/gateway.rs`, `crates/sidecar/src/session.rs`
crates/core/src/api/**/*.rs
📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)
crates/core/src/api/**/*.rs: Implement public API changes first in `crates/core/src/api/` and related core modules such as `crates/core/src/api/runtime/`, `crates/core/src/codec/`, or `crates/core/src/json.rs`
Use `snake_case` naming convention for Rust core API implementations
Files:
crates/core/src/api/llm.rs
{crates/core/src/api/tool.rs,crates/core/src/api/llm.rs}
📄 CodeRabbit inference engine (.agents/skills/add-middleware/SKILL.md)
Wire middleware chain execution into the appropriate pipeline stage in `crates/core/src/api/tool.rs` or `crates/core/src/api/llm.rs` by calling the chain method at the correct execution point
Files:
crates/core/src/api/llm.rs
crates/{core,adaptive}/**
📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)
Changes to `crates/core` or `crates/adaptive` must run the full language matrix
Files:
`crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`
{crates/core,crates/adaptive}/**/*.rs
📄 CodeRabbit inference engine (.agents/skills/test-wasm-binding/SKILL.md)
If the change touched shared runtime semantics in `crates/core` or `crates/adaptive`, also use `validate-change`
Files:
`crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`
crates/core/**/*.rs
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
When `crates/core` changes, run the full validation matrix across Rust, Python, Go, Node.js, and WebAssembly
Files:
`crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`
crates/{core,adaptive}/**/*.rs
⚙️ CodeRabbit configuration file
crates/{core,adaptive}/**/*.rs: Review the Rust runtime for async correctness, scope isolation, middleware ordering, and event lifecycle regressions.
Pay close attention to task-local/thread-local scope propagation, callback lifetimes, stream finalization, and root_uuid isolation.
Public API changes should preserve existing behavior unless tests and docs show the intended migration path.
Files:
`crates/core/src/api/llm.rs`, `crates/core/src/codec/traits.rs`
docs/**/*.md
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Run `./scripts/build-docs.sh` for documentation site changes
docs/**/*.md: Relevant getting-started or reference docs must be updated when examples change
Release-policy docs must point to GitHub Releases as the only release-history source of truth
docs/**/*.md: Use title case for headings in technical documentation
Introduce code blocks, tables, and lists with complete lead-in sentences in documentation
Files:
docs/integrate-frameworks/coding-agent-codex.md
{RELEASING.md,CHANGELOG.md,docs/**/*.md}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Keep release-process and release-notes guidance in repo-maintainer docs such as RELEASING.md, not as user-facing docs pages or CHANGELOG.md
Files:
docs/integrate-frameworks/coding-agent-codex.md
docs/integrate-frameworks/**/*.md
📄 CodeRabbit inference engine (.agents/skills/contribute-integration/SKILL.md)
Documentation must be updated if activation or usage of the integration changed
Files:
docs/integrate-frameworks/coding-agent-codex.md
{scripts/*.sh,docs/**/*.md}
📄 CodeRabbit inference engine (.agents/skills/contribute-integration/SKILL.md)
Use root `./scripts/*.sh` commands in docs and contributor guidance as documented, with implementations under `scripts/third-party/`
Files:
docs/integrate-frameworks/coding-agent-codex.md
{README*,CHANGELOG*,docs/**/*.{md,rst,txt},examples/**/*,*.md}
📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)
Update documentation, examples, and getting-started guides with new package/module/crate names after rename operations
Files:
docs/integrate-frameworks/coding-agent-codex.md
{README.md,docs/**/*.md,examples/**/*.{js,ts,py,go,rs}}
📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)
Keep documentation and examples synchronized with current install, import, and build commands
Files:
docs/integrate-frameworks/coding-agent-codex.md
{README.md,CONTRIBUTING.md,docs/**/*.md}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
For docs-only changes, run targeted checks only if commands, package names, or examples changed. Use `just docs` for docs-site builds and `just docs-linkcheck` when links changed
Files:
docs/integrate-frameworks/coding-agent-codex.md
{docs/**,README.md,CONTRIBUTING.md,RELEASING.md,SECURITY.md}
⚙️ CodeRabbit configuration file
{docs/**,README.md,CONTRIBUTING.md,RELEASING.md,SECURITY.md}: Review documentation for technical accuracy against the current API, command correctness, and consistency across language bindings.
Flag stale examples, missing SPDX headers where required, and instructions that no longer match CI or pre-commit behavior.
Files:
docs/integrate-frameworks/coding-agent-codex.md
**/test_*.py
📄 CodeRabbit inference engine (.agents/skills/test-python-binding/SKILL.md)
Run focused `pytest` first when you know the affected area using `pytest -k "<pattern>"`
Files:
python/tests/test_builtin_codecs.py
{crates/**/tests/**,python/tests/**,go/nemo_flow/**/*_test.go}
⚙️ CodeRabbit configuration file
{crates/**/tests/**,python/tests/**,go/nemo_flow/**/*_test.go}: Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant.
Prefer assertions on lifecycle events, scope stacks, middleware ordering, and binding parity over shallow smoke tests.
Files:
`python/tests/test_builtin_codecs.py`, `crates/sidecar/tests/coverage/launcher_tests.rs`, `crates/sidecar/tests/coverage/session_tests.rs`, `crates/sidecar/tests/coverage/gateway_tests.rs`, `crates/sidecar/tests/coverage/server_tests.rs`
crates/python/**/*.rs
📄 CodeRabbit inference engine (.agents/skills/test-python-binding/SKILL.md)
When the native Rust bridge changed, add the Rust crate tests for `nemo-flow-python`
Files:
crates/python/src/py_api/mod.rs
crates/{python,ffi,node,wasm}/**/*
⚙️ CodeRabbit configuration file
crates/{python,ffi,node,wasm}/**/*: Treat binding changes as public API changes. Check for parity with the other language bindings, FFI ownership/lifetime safety,
callback error propagation, stable type conversion, and consistent async/stream semantics.
Flag changes that update one binding without corresponding tests or documentation for the same surface elsewhere.
Files:
crates/python/src/py_api/mod.rs
🪛 LanguageTool
integrations/coding-agents/README.md
[style] ~132-~132: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ay-mode hook-only|passthrough|required` records the expected gateway behavior in sess...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~134-~134: To form a complete sentence, be sure to include a subject.
Context: ... in session metadata. - --fail-closed can be added to generated hook commands whe...
(MISSING_IT_THERE)
🪛 Ruff (0.15.12)
python/tests/test_builtin_codecs.py
[warning] 194-194: Missing return type annotation for private function capture
Add return type annotation: None
(ANN202)
[warning] 241-241: Missing return type annotation for private function capture
Add return type annotation: None
(ANN202)
[warning] 277-277: Missing return type annotation for private function capture
Add return type annotation: None
(ANN202)
[warning] 280-280: Missing return type annotation for private function sanitize_response
(ANN202)
[warning] 280-280: Unused function argument: response
(ARG001)
[warning] 322-322: Missing return type annotation for private function capture
Add return type annotation: None
(ANN202)
🔇 Additional comments (21)
codecov.yml (2)
57-66: Sidecar runtime coverage component is correctly scoped and gated. This matches the existing component policy shape and sets a clear quality gate for `crates/sidecar/src`.
125-125: Explicit sidecar test exclusion is fine for component-focused runtime coverage. This keeps component coverage focused on production sources under `crates/sidecar/src`.
crates/python/src/py_api/mod.rs (2)
632-659: Past review concern resolved. The implementation now delegates `response_codec` to `core_llm_api::llm_call_end`, which performs decoding from the sanitized end-event payload. Decode failures are propagated back to Python as `RuntimeError` after the end event is emitted.
51-98: LGTM! Helper functions `py_llm_response_codec` and `py_annotated_llm_response` cleanly handle:
- Fast-path extraction for built-in codecs (avoiding Python dispatch overhead).
- Fallback wrapper for custom Python codecs.
- JSON deserialization with proper error propagation.
python/tests/test_builtin_codecs.py (3)
318-342: Good error-path coverage. This test validates the documented behavior that codec decode failures:
- Raise `RuntimeError` to the caller.
- Still emit exactly one end event.
- Set `annotated_response` to `None` on the emitted event.
This addresses the previous review request for failure-path assertions.
273-316: LGTM! Validates that `response_codec` decoding operates on the sanitized payload rather than the raw input. The unused `response` parameter in `sanitize_response` is required by the guardrail callback signature.
190-271: LGTM! Tests properly validate:
- `response_codec` produces decoded annotation with usage/response_text.
- Explicit `annotated_response` mapping is accepted and preserved.
Both use proper test isolation with subscriber registration/deregistration.
crates/core/src/codec/traits.rs (1)
65-77: LGTM! Documentation accurately describes the fallible decode semantics and aligns with the `llm_call_end` implementation that emits the end event even when decoding fails.
crates/core/src/api/llm.rs (1)
388-449: LGTM! The implementation correctly:
- Sanitizes the response before codec decoding (line 413).
- Derives annotation from the sanitized payload when `annotated_response` is not provided (lines 419-431).
- Emits the end event regardless of decode outcome.
- Returns decode errors after event emission for caller visibility.
This addresses the previous review concern about pre-sanitized annotation content.
python/nemo_flow/llm.py (1)
134-180: LGTM! The `call_end()` API extension:
- Properly types `annotated_response` as a union of `AnnotatedLLMResponse | Mapping[str, Json] | None`.
- Documents that codec failures raise after event emission.
- Cleanly forwards to native implementation.
integrations/coding-agents/README.md (1)
1-167: LGTM! Documentation is well-structured with proper SPDX header, consistent heading case, code block formatting, and clear procedural guidance. The static analysis hints about successive sentences are false positives for a CLI option list where each item naturally starts with a backtick-quoted flag.
integrations/coding-agents/codex/.codex-plugin/plugin.json (1)
1-32: LGTM! Plugin manifest is well-formed with correct metadata, license, repository references, and hook wiring path.
crates/sidecar/src/server.rs (1)
70-98: LGTM! HTTP client now properly configures connect, request, and read timeouts. Route setup correctly handles both prefixed (`/v1/...`) and non-prefixed paths for OpenAI compatibility. The `expect()` on the builder is acceptable since the configuration is compile-time static.
crates/sidecar/tests/coverage/server_tests.rs (2)
18-33: LGTM!
`TestServer` now properly implements RAII cleanup via `Drop::drop` that aborts the spawned task, addressing the prior review concern about leaked server tasks.
47-471: Comprehensive test coverage. Tests cover all hook endpoints (Codex, Claude Code, Cursor, Hermes), gateway passthrough behavior (JSON forwarding, streaming preservation, header handling), error cases (unreachable upstream, unsupported paths), and models endpoint forwarding.
crates/sidecar/tests/coverage/launcher_tests.rs (2)
8-11: Good use of OnceLock for test serialization. The `current_dir_lock()` pattern correctly serializes tests that mutate the process-global current directory, preventing races between concurrent test execution.
499-524: Integration test properly validates sidecar URL injection. The `run_starts_sidecar_injects_env_and_returns_agent_exit_code` test correctly verifies that the dynamic sidecar URL is injected into the child process environment and that exit codes are preserved.
crates/sidecar/tests/coverage/gateway_tests.rs (2)
321-389: Streaming guard test now uses robust polling. The test correctly uses `wait_for_session_llms_empty` with a 5-second timeout instead of a fixed 100ms sleep, addressing the prior review concern about flaky failures on loaded CI systems.
14-31: Polling helper addresses prior race condition with timeout-based retries. The `wait_for_file_contains` and `wait_for_session_llms_empty` helpers replace fixed sleeps with timeout-based polling (25ms intervals, 5s timeout), making the streaming guard drop test robust under CI load. The let-chains syntax in lines 21–23 is compatible with the project's Rust edition 2024 and requires no feature flags.
crates/sidecar/src/launcher.rs (2)
29-120: Well-structured transparent run orchestration. The separation of `TransparentRun::new` (config resolution, binding), `print_if_requested` (dry-run inspection), and `execute` (live execution) provides clean separation of concerns. The `execute_live_run` function correctly orders: start server → wait health → spawn child → restore state → stop server, preserving child exit status.
458-472: Health polling is reasonable but consider timeout alignment. The health check polls 50 times with 20ms sleeps (~1 second total). The `Client::new()` here has no explicit timeout, relying on system defaults. For consistency with the server's configured timeouts, consider using a short connect timeout on this health-check client. This is minor since health checks are typically fast and the retry loop provides implicit timeout behavior.
```rust
fn response_headers(headers: &HeaderMap) -> HeaderMap {
    let mut output = HeaderMap::new();
    for (name, value) in headers {
        if !is_hop_by_hop(name) {
            output.append(name.clone(), value.clone());
        }
    }
    output
}

// Reconstructs an Axum response from upstream status, filtered headers, and the selected body. All
// builder errors are converted into sidecar HTTP errors rather than panics.
fn build_response(
    status: StatusCode,
    headers: HeaderMap,
    body: Body,
) -> Result<Response<Body>, SidecarError> {
    let mut builder = Response::builder().status(status);
    for (name, value) in &headers {
        builder = builder.header(name, value);
    }
    Ok(builder.body(body)?)
}

// Allows provider request headers through unless they are transport-owned or must be recalculated
// for the forwarded body. Host and content length are intentionally excluded because reqwest sets
// them for the upstream connection.
fn should_forward_request_header(name: &HeaderName) -> bool {
    !is_hop_by_hop(name) && name != http::header::HOST && name != http::header::CONTENT_LENGTH
}

// Allows headers into observability metadata only after removing credentials and provider API keys.
// The forwarding filter runs first so hop-by-hop transport headers are also excluded from recorded
// LLM request attributes.
fn should_record_header(name: &HeaderName) -> bool {
    should_forward_request_header(name)
        && name != http::header::AUTHORIZATION
        && name.as_str() != "x-api-key"
        && name.as_str() != "anthropic-api-key"
}

// Identifies headers that describe a single transport hop and therefore must not be proxied across
// the client-sidecar-upstream boundary.
fn is_hop_by_hop(name: &HeaderName) -> bool {
    matches!(
        name.as_str(),
        "connection"
            | "keep-alive"
            | "proxy-authenticate"
            | "proxy-authorization"
            | "te"
            | "trailer"
            | "transfer-encoding"
            | "upgrade"
    )
}
```
Strip headers named by Connection too.
The proxy only removes the fixed hop-by-hop names. If either side sends `Connection: x-foo`, `x-foo` is still forwarded here, which violates proxy semantics and can leak per-connection metadata or break upstream behavior. Parse the `Connection` tokens from the current header map and exclude those names in both request and response paths.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/src/gateway.rs` around lines 503 - 558, The proxy must also
strip any header names listed in the Connection header; update response_headers
to read the "connection" header from the provided HeaderMap, parse its
comma-separated tokens (trim and lowercase), and skip any header whose name
matches one of those tokens in addition to existing is_hop_by_hop checks;
similarly, ensure request-path filters (should_forward_request_header and
should_record_header) consider Connection-listed names as hop-by-hop (you may
pass the request's HeaderMap or a precomputed set of connection tokens into
these checks), and perform comparisons case-insensitively when converting tokens
to HeaderName before excluding them.
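A minimal std-only sketch of the fix this prompt describes. The helper names `connection_tokens` and `is_hop_by_hop_dynamic` are hypothetical and operate on plain strings rather than the crate's `HeaderName` type, but the token parsing and the combined static-plus-dynamic check are the suggested behavior:

```rust
use std::collections::HashSet;

// Hypothetical helper: collect the comma-separated tokens named by a
// `Connection` header value, trimmed and lowercased, so callers can
// strip those headers in addition to the fixed hop-by-hop set.
fn connection_tokens(value: &str) -> HashSet<String> {
    value
        .split(',')
        .map(|t| t.trim().to_ascii_lowercase())
        .filter(|t| !t.is_empty())
        .collect()
}

// Hypothetical filter combining the fixed hop-by-hop list with the
// dynamic Connection-named set; names are compared case-insensitively.
fn is_hop_by_hop_dynamic(name: &str, connection: &HashSet<String>) -> bool {
    let lower = name.to_ascii_lowercase();
    matches!(
        lower.as_str(),
        "connection"
            | "keep-alive"
            | "proxy-authenticate"
            | "proxy-authorization"
            | "te"
            | "trailer"
            | "transfer-encoding"
            | "upgrade"
    ) || connection.contains(&lower)
}
```

In the gateway, the token set would be computed once per header map and threaded through both the forwarding and recording filters.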
```rust
fn should_record_header(name: &HeaderName) -> bool {
    should_forward_request_header(name)
        && name != http::header::AUTHORIZATION
        && name.as_str() != "x-api-key"
        && name.as_str() != "anthropic-api-key"
}
```
Do not record cookie or generic API-key aliases.
`should_record_header` still allows secrets like `cookie` and `api-key`, so those values can end up in `LlmRequest.headers` and downstream exports. They should be filtered alongside `Authorization`, `x-api-key`, and `anthropic-api-key`.
Proposed fix
fn should_record_header(name: &HeaderName) -> bool {
should_forward_request_header(name)
&& name != http::header::AUTHORIZATION
+ && name != http::header::COOKIE
&& name.as_str() != "x-api-key"
+ && name.as_str() != "api-key"
&& name.as_str() != "anthropic-api-key"
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| fn should_record_header(name: &HeaderName) -> bool { | |
| should_forward_request_header(name) | |
| && name != http::header::AUTHORIZATION | |
| && name.as_str() != "x-api-key" | |
| && name.as_str() != "anthropic-api-key" | |
| } | |
| fn should_record_header(name: &HeaderName) -> bool { | |
| should_forward_request_header(name) | |
| && name != http::header::AUTHORIZATION | |
| && name != http::header::COOKIE | |
| && name.as_str() != "x-api-key" | |
| && name.as_str() != "api-key" | |
| && name.as_str() != "anthropic-api-key" | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/src/gateway.rs` around lines 537 - 542, The
should_record_header function currently allows sensitive headers like cookie and
generic API-key aliases to be recorded; update should_record_header (and keep
its dependence on should_forward_request_header) to also exclude
http::header::COOKIE and to reject common API-key header names in a
case-insensitive way (e.g., "api-key", "apikey", and any other known aliases you
want blocked in addition to "x-api-key" and "anthropic-api-key") so those values
never appear in LlmRequest.headers or downstream exports.
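One way to keep the exclusion list maintainable is a single case-insensitive deny list, sketched below; this is an assumption about shape, and the real `should_record_header` would still gate on `should_forward_request_header` first:

```rust
// Sketch of a case-insensitive deny list for recorded headers; extend the
// array as new secret-bearing header aliases are identified.
fn is_secret_header(name: &str) -> bool {
    const DENY: &[&str] = &[
        "authorization",
        "cookie",
        "x-api-key",
        "api-key",
        "anthropic-api-key",
    ];
    DENY.contains(&name.to_ascii_lowercase().as_str())
}
```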
| fn hook_command(command: &InstallCommand, agent: CodingAgent) -> String { | ||
| let mut args = vec![ | ||
| "nemo-flow-sidecar".to_string(), | ||
| "hook-forward".to_string(), | ||
| agent.as_arg().to_string(), | ||
| "--sidecar-url".to_string(), | ||
| command.sidecar_url.clone(), | ||
| ]; | ||
| push_optional_path(&mut args, "--atif-dir", command.atif_dir.as_deref()); | ||
| push_optional( | ||
| &mut args, | ||
| "--openinference-endpoint", | ||
| command.openinference_endpoint.as_deref(), | ||
| ); | ||
| push_optional(&mut args, "--profile", command.profile.as_deref()); | ||
| push_optional( | ||
| &mut args, | ||
| "--session-metadata", | ||
| command.session_metadata.as_deref(), | ||
| ); | ||
| push_optional( | ||
| &mut args, | ||
| "--plugin-config", | ||
| command.plugin_config.as_deref(), | ||
| ); | ||
| push_optional_gateway_mode(&mut args, command.gateway_mode); | ||
| args.into_iter() | ||
| .map(|arg| shell_quote(&arg)) | ||
| .collect::<Vec<_>>() | ||
| .join(" ") | ||
| } | ||
|
|
||
| // Appends a flag/value pair only when a string option is present, preserving omission semantics in | ||
| // generated hook commands instead of serializing empty values. | ||
| fn push_optional(args: &mut Vec<String>, flag: &str, value: Option<&str>) { | ||
| if let Some(value) = value { | ||
| args.push(flag.to_string()); | ||
| args.push(value.to_string()); | ||
| } | ||
| } | ||
|
|
||
| // Appends optional path flags using display formatting because installed commands are read by a | ||
| // shell, not by Rust path parsers. | ||
| fn push_optional_path(args: &mut Vec<String>, flag: &str, value: Option<&Path>) { | ||
| if let Some(value) = value { | ||
| args.push(flag.to_string()); | ||
| args.push(value.display().to_string()); | ||
| } | ||
| } | ||
|
|
||
| // Serializes the gateway-mode enum into the generated hook-forward command only when explicitly | ||
| // configured, leaving default runtime behavior under the sidecar's normal config resolution. | ||
| fn push_optional_gateway_mode(args: &mut Vec<String>, gateway_mode: Option<GatewayMode>) { | ||
| if let Some(gateway_mode) = gateway_mode { | ||
| args.push("--gateway-mode".to_string()); | ||
| args.push(gateway_mode.as_arg().to_string()); | ||
| } | ||
| } | ||
|
|
||
| // Quotes a shell argument only when necessary. The safe character set is intentionally small so | ||
| // paths and URLs remain readable while whitespace, quotes, and shell metacharacters are protected. | ||
| fn shell_quote(value: &str) -> String { | ||
| if value | ||
| .chars() | ||
| .all(|character| character.is_ascii_alphanumeric() || "-_./:=,".contains(character)) | ||
| { | ||
| value.to_string() | ||
| } else { | ||
| format!("'{}'", value.replace('\'', "'\\''")) |
🧩 Analysis chain
🌐 Web query:
For Claude Code, Codex CLI, Cursor, and Hermes on Windows: are hook commands executed through a POSIX shell, cmd.exe, or PowerShell, and what quoting/escaping syntax is required for arguments containing spaces or quotes?
💡 Result:
Claude Code hooks on Windows: Executed through POSIX shell (bash.exe by default, spawning Git Bash), or PowerShell if specified via "shell": "powershell" field. [1][2] Default bash quoting/escaping: single quotes for strings with spaces, escape internal single quotes with ''' or close/reopen. PowerShell example uses powershell.exe -Command "..." with internal single quotes. [3] Codex CLI on Windows: Primarily PowerShell (native support, sandbox), but issues show cmd.exe /c wrappers and bat shims causing quoting issues for spaces/multiline; use PowerShell quoting (double outer, single inner) or CMD escaping (^" or doubled "). [4][5][6][7] Cursor hooks on Windows: Command hooks run as shell scripts (typically bash via Git Bash or cmd/PowerShell via wrappers like run-hook.cmd to avoid file-open issues); issues report PowerShell injection with Git Bash terminal and execution failures without.cmd wrappers. Bash/POSIX quoting recommended, or explicit.cmd for cmd.exe/PowerShell. [8][9][10] Hermes Agent on Windows: Native Windows not supported; requires WSL2 (POSIX shell). No hook commands mentioned. [11][12]
Citations:
- 1: [Windows] Statusline and hook commands spawn bash.exe without CREATE_NO_WINDOW, causing visible console flash every ~10 seconds anthropics/claude-code#51867
- 2: https://code.claude.com/docs/en/hooks
- 3: https://code.claude.com/docs/en/hooks-guide?8adb0641_page=4&cc61befa_page=2&d7430fcd_page=5&r=0
- 4: Agent intermittently emits PowerShell-only syntax and fragile quoting openai/codex#9581
- 5: Windows/pwsh: apply_patch.bat breaks multiline patch arguments; direct codex.exe --codex-run-as-apply-patch succeeds with the same patch openai/codex#13840
- 6: https://help.openai.com/en/articles/11381614-codex-codex-andsign-in-with-chatgpt
- 7: feat: configurable Windows agent shell (powershell/git-bash) openai/codex#16717
- 8: https://cursor.com/docs/hooks
- 9: Cursor on Windows opens hooks/session-start file instead of executing SessionStart hook obra/superpowers#1449
- 10: Cursor on Windows: SessionStart hook fails when PATH bash is WSL; use run-hook.cmd in hooks-cursor.json obra/superpowers#912
- 11: https://dev.to/rosgluk/hermes-agent-cli-cheat-sheet-commands-flags-and-slash-shortcuts-3pcb
- 12: https://nousresearch-hermes-agent.mintlify.app/introduction
🏁 Script executed:
# Check for Windows-specific handling in installer.rs
grep -n "windows\|Windows\|cmd\|powershell\|platform\|cfg" crates/sidecar/src/installer.rs | head -20
Repository: NVIDIA/NeMo-Flow
Length of output: 174
🏁 Script executed:
# Search for platform-specific code patterns in the codebase
rg -i "windows|cmd\.exe|powershell" --type rs -A 2 -B 2 | head -50
Repository: NVIDIA/NeMo-Flow
Length of output: 87
🏁 Script executed:
# Check if hook_command is used in a context that suggests Windows deployment
rg "hook_command\|hook-forward" --type rs -A 3 -B 3 | head -50
Repository: NVIDIA/NeMo-Flow
Length of output: 87
🏁 Script executed:
# Look for any existing quoting/escaping utilities
rg "quote\|escape" crates/sidecar/src/ --type rs | head -20
Repository: NVIDIA/NeMo-Flow
Length of output: 87
🏁 Script executed:
# Search for Windows/platform-specific code in Rust files
rg -t rust "windows|cfg|platform" crates/sidecar/ -A 2 -B 2 | head -60
Repository: NVIDIA/NeMo-Flow
Length of output: 3170
🏁 Script executed:
# Look for hook_command usage
rg -t rust "hook_command|hook-forward" --max-count 20 -A 3 -B 1
Repository: NVIDIA/NeMo-Flow
Length of output: 13107
🏁 Script executed:
# Check the full context of shell_quote function and its callers
cat crates/sidecar/src/installer.rs | sed -n '420,440p'
Repository: NVIDIA/NeMo-Flow
Length of output: 963
🏁 Script executed:
# Search for any argv or command array building patterns
rg -t rust "args|argv|Vec\<String\>" crates/sidecar/src/installer.rs -B 2 -A 2 | head -40
Repository: NVIDIA/NeMo-Flow
Length of output: 163
Windows hook commands with special characters will fail due to POSIX-only quoting.
The shell_quote function uses POSIX syntax ('...' and '\''), but generated hooks run through different shells on Windows: Git Bash (POSIX), PowerShell, and cmd.exe. Cmd.exe doesn't recognize single quotes and requires ^" or "" escaping; PowerShell uses different quote handling. Any hook command with spaces or quotes in URLs, paths, or JSON—including the --sidecar-url itself—becomes non-executable on Windows systems running cmd.exe or PowerShell.
Since the installer already supports cross-platform home resolution and the codebase includes Windows-specific test paths, platform-aware quoting for shell arguments is needed. Use conditional quoting based on the target environment, or prefer argv-style configuration where the target format permits.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/src/installer.rs` around lines 372 - 440, The generated hook
commands break on Windows because shell_quote() uses POSIX single-quote
escaping; update quoting to be platform-aware: keep the current POSIX
single-quote logic for non-Windows, and on Windows implement proper double-quote
escaping (wrap in "..." and escape inner quotes and special cmd/powershell
metacharacters) so arguments like --sidecar-url, paths and JSON are usable in
cmd.exe and PowerShell; change shell_quote() accordingly and ensure hook_command
(and helpers push_optional_path/push_optional_gateway_mode) continue to call the
new shell_quote signature so all assembled args are quoted per-platform.
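A platform-aware `shell_quote` might look like the sketch below. The POSIX branch mirrors the existing logic; the Windows branch is an assumption and covers only the common convention of wrapping in double quotes and doubling embedded quotes, which cmd.exe accepts but which does not handle every PowerShell edge case:

```rust
// Sketch of platform-aware quoting. The safe-character fast path matches the
// existing shell_quote; the Windows branch is a simplified assumption.
fn quote_posix(value: &str) -> String {
    format!("'{}'", value.replace('\'', "'\\''"))
}

fn quote_windows(value: &str) -> String {
    format!("\"{}\"", value.replace('"', "\"\""))
}

fn shell_quote(value: &str, windows: bool) -> String {
    let safe = value
        .chars()
        .all(|c| c.is_ascii_alphanumeric() || "-_./:=,".contains(c));
    if safe {
        value.to_string()
    } else if windows {
        quote_windows(value)
    } else {
        quote_posix(value)
    }
}
```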
| fn write_planned_file(file: &PlannedFile) -> Result<(), SidecarError> { | ||
| if let Some(parent) = file.path.parent() { | ||
| std::fs::create_dir_all(parent)?; | ||
| } | ||
| if file.path.exists() { | ||
| std::fs::copy(&file.path, backup_path(&file.path)?)?; | ||
| } | ||
| std::fs::write(&file.path, &file.contents)?; | ||
| Ok(()) |
Write planned files atomically.
This backs up the old file, but std::fs::write still truncates the live config in place. A crash or ENOSPC after truncation leaves the user's hook config corrupted until they manually restore the backup. Write to a temp file in the same directory and rename it over the target after the write succeeds.
Proposed fix
fn write_planned_file(file: &PlannedFile) -> Result<(), SidecarError> {
if let Some(parent) = file.path.parent() {
std::fs::create_dir_all(parent)?;
}
if file.path.exists() {
std::fs::copy(&file.path, backup_path(&file.path)?)?;
}
- std::fs::write(&file.path, &file.contents)?;
+ let tmp_path = file.path.with_extension("tmp");
+ std::fs::write(&tmp_path, &file.contents)?;
+ std::fs::rename(&tmp_path, &file.path)?;
Ok(())
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| fn write_planned_file(file: &PlannedFile) -> Result<(), SidecarError> { | |
| if let Some(parent) = file.path.parent() { | |
| std::fs::create_dir_all(parent)?; | |
| } | |
| if file.path.exists() { | |
| std::fs::copy(&file.path, backup_path(&file.path)?)?; | |
| } | |
| std::fs::write(&file.path, &file.contents)?; | |
| Ok(()) | |
| fn write_planned_file(file: &PlannedFile) -> Result<(), SidecarError> { | |
| if let Some(parent) = file.path.parent() { | |
| std::fs::create_dir_all(parent)?; | |
| } | |
| if file.path.exists() { | |
| std::fs::copy(&file.path, backup_path(&file.path)?)?; | |
| } | |
| let tmp_path = file.path.with_extension("tmp"); | |
| std::fs::write(&tmp_path, &file.contents)?; | |
| std::fs::rename(&tmp_path, &file.path)?; | |
| Ok(()) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/src/installer.rs` around lines 655 - 663, The current
write_planned_file function truncates the target in place; change it to write
atomically by creating a temp file in the same directory (use
file.path.parent()), write file.contents into that temp file, flush and persist
it, then std::fs::rename the temp over file.path; keep the existing backup logic
(backup_path(&file.path)?) before replacing, and on success remove the backup if
needed; also preserve permissions/metadata by reading metadata from file.path
(if it exists) and applying them to the temp file before rename, and ensure the
temp file is cleaned up on failure.
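The temp-file-plus-rename pattern can be sketched as below. The `with_extension("tmp")` choice mirrors the proposed fix and could clobber a sibling `.tmp` file; a randomized suffix would be safer in practice:

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

// Sketch of atomic replacement: write the new contents to a sibling temp
// file, flush to disk, then rename over the target in one step.
fn write_atomic(path: &Path, contents: &[u8]) -> std::io::Result<()> {
    let tmp = path.with_extension("tmp");
    {
        let mut file = fs::File::create(&tmp)?;
        file.write_all(contents)?;
        file.sync_all()?; // flush before the rename so a crash cannot truncate
    }
    fs::rename(&tmp, path) // atomic replacement within one filesystem
}
```

A crash mid-write now leaves either the old file or the new one in place, never a truncated hybrid.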
| pub(crate) async fn start_llm( | ||
| &self, | ||
| headers: &HeaderMap, | ||
| start: LlmGatewayStart, | ||
| ) -> Result<ActiveLlm, SidecarError> { | ||
| let mut sessions = self.inner.lock().await; | ||
| let config = self.default_config.session_config_from_headers(headers); | ||
| let session_id = start | ||
| .session_id | ||
| .clone() | ||
| .or_else(|| single_active_session_id(&sessions)) | ||
| .unwrap_or_else(|| format!("{}-gateway", AgentKind::Gateway.as_str())); | ||
| let session = sessions | ||
| .entry(session_id.clone()) | ||
| .or_insert_with(|| Session::new(session_id, AgentKind::Gateway, config)); | ||
| session.start_llm(start).await |
Do Not Reuse One Synthetic Session for All Headerless Gateway Traffic
When session_id is absent and there is not exactly one active hook session, every request falls into the same gateway-gateway bucket. That collapses unrelated traffic into shared correlation state and keeps that synthetic session alive after end_llm, so later requests can inherit stale hints. This fallback needs a per-request synthetic id plus cleanup after the LLM ends when no hook-owned session state exists.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/src/session.rs` around lines 175 - 190, The code currently
falls back to a constant synthetic id ("gateway-gateway") when headers lack
session_id and there isn't exactly one hook session, causing shared state;
change the fallback to generate a unique per-request synthetic id (e.g., use a
UUID or timestamp) when session_id is None and
single_active_session_id(&sessions) returns None, create the Session via
Session::new as now but mark it as synthetic (or track it locally), and after
calling session.start_llm(start).await ensure cleanup: if the session was
created as a synthetic gateway session and there is no hook-owned state left,
remove that entry from the sessions map (or schedule removal) once the ActiveLlm
ends so synthetic sessions don't persist and leak hints; key symbols: session_id
variable, single_active_session_id, Session::new, session.start_llm, and the
sessions (self.inner) map.
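A per-request synthetic id can be generated without pulling in a `uuid` crate; the `gateway-` prefix and counter scheme below are assumptions, not the session manager's actual naming:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Sketch of unique synthetic session ids: a process-wide counter guarantees
// that headerless gateway requests never collapse into one shared bucket.
static SYNTHETIC_COUNTER: AtomicU64 = AtomicU64::new(0);

fn synthetic_session_id() -> String {
    let n = SYNTHETIC_COUNTER.fetch_add(1, Ordering::Relaxed);
    format!("gateway-{}", n)
}
```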
| "hooks": { | ||
| "SessionStart": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "UserPromptSubmit": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "PreToolUse": [ | ||
| { | ||
| "matcher": "*", | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "PostToolUse": [ | ||
| { | ||
| "matcher": "*", | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "PostToolUseFailure": [ | ||
| { | ||
| "matcher": "*", | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "AfterAgentResponse": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "AfterAgentThought": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "SubagentStart": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "SubagentStop": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "Notification": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "Stop": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "PreCompact": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "SessionEnd": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward codex", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ] |
Fail Open When the Sidecar Binary Is Missing
These hook entries call nemo-flow-sidecar directly, so a machine without that binary now turns ordinary Codex hooks into command failures. That breaks the integration’s optionality contract instead of preserving Codex’s native behavior. Please route these through a shim/launcher that exits successfully when NeMo Flow is unavailable.
As per coding guidelines, **/{integrations,integration,*-integration}/**: Preserve the framework's original behavior when NeMo Flow is absent.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@integrations/coding-agents/codex/hooks/hooks.json` around lines 3 - 149, All
hook entries currently invoke the nemo-flow-sidecar binary directly (the
"command" values like "nemo-flow-sidecar hook-forward codex" used in hooks such
as SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, PostToolUseFailure,
AfterAgentResponse, AfterAgentThought, SubagentStart/Stop, Notification, Stop,
PreCompact, SessionEnd), which breaks optionality when the binary is missing;
change each hook to call a small launcher/shim (e.g., nemo-flow-sidecar-launcher
or nemo-flow-sidecar-shim) that attempts to exec the real nemo-flow-sidecar but
exits successfully (0) and no-ops when the binary is not present so Codex hooks
preserve native behavior. Ensure every "command" string in the JSON is updated
to the shim name and that the shim implements the fail-open behavior.
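The fail-open contract for such a hypothetical shim reduces to one mapping, sketched here: a spawn error (binary missing) maps to exit 0 so the host agent's hooks remain no-ops, while a successful spawn propagates the sidecar's own status:

```rust
// Sketch of the fail-open decision for a hypothetical launcher shim.
fn fail_open_exit_code(spawn_result: Result<Option<i32>, std::io::Error>) -> i32 {
    match spawn_result {
        Ok(code) => code.unwrap_or(0), // propagate the sidecar's own status
        Err(_) => 0,                   // sidecar absent: succeed silently
    }
}
```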
| "hooks": { | ||
| "sessionStart": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "beforeSubmitPrompt": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "preToolUse": [ | ||
| { | ||
| "matcher": "*", | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "beforeShellExecution": [ | ||
| { | ||
| "matcher": "*", | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "beforeMCPExecution": [ | ||
| { | ||
| "matcher": "*", | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "postToolUse": [ | ||
| { | ||
| "matcher": "*", | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "afterShellExecution": [ | ||
| { | ||
| "matcher": "*", | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "afterMCPExecution": [ | ||
| { | ||
| "matcher": "*", | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "subagentStart": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "subagentStop": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "afterAgentResponse": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "afterAgentThought": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "preCompact": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "stop": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "sessionEnd": [ | ||
| { | ||
| "hooks": [ | ||
| { | ||
| "type": "command", | ||
| "command": "nemo-flow-sidecar hook-forward cursor", | ||
| "timeout": 30 | ||
| } | ||
| ] | ||
| } | ||
| ] |
Keep the Cursor Hook Bundle Optional at Runtime
These entries invoke nemo-flow-sidecar unconditionally. If the binary is missing on a synced machine or falls off PATH, Cursor hook execution now fails instead of falling back to Cursor’s normal behavior. Please run this through a fail-open shim/launcher so the integration remains optional.
As per coding guidelines, **/{integrations,integration,*-integration}/**: Preserve the framework's original behavior when NeMo Flow is absent.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@integrations/coding-agents/cursor/.cursor/hooks.json` around lines 3 - 174,
The hooks call the nemo-flow-sidecar binary unconditionally (every "command":
"nemo-flow-sidecar hook-forward cursor"), causing failures when the binary is
missing; create a fail-open launcher (e.g., nemo-flow-sidecar-shim) that checks
for the real nemo-flow-sidecar and execs it if present, otherwise exits
successfully (and optionally logs a debug message), then update all hook entries
to invoke the shim (replace "nemo-flow-sidecar hook-forward cursor" with
"nemo-flow-sidecar-shim hook-forward cursor") so the integration remains
optional while preserving original behavior when NeMo Flow is absent.
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/sidecar/src/launcher.rs`:
- Around line 168-179: The current configured_command function uses
command.split_whitespace(), which drops quotes and corrupts quoted paths/args
(e.g., "C:\Program Files\..."), so replace the whitespace-only split with a
quote-aware parser: call a shell-style splitter (e.g., shell_words::split or
shlex::split) on the command string inside configured_command (refer to the
configured_command function and CodingAgent/AgentConfigs lookup), handle the
Result (return None or log/error on parse failure), and return the parsed
Vec<String> only when non-empty; this preserves quoted executables and arguments
on Windows and wrapper commands.
- Around line 482-496: The current insert_after_agent implementation can place
agent flags after argv[0] when the agent was specified via an explicit --agent
flag but no argv element matches CodingAgent::infer, causing wrappers to be
configured instead of the agent; update insert_after_agent (referencing argv,
agent, and CodingAgent::infer) to first search argv for an explicit --agent
occurrence (both "--agent" followed by a value and "--agent=VALUE" forms) and,
if found, insert the generated args immediately after the agent value token (or
the same token for --agent=VALUE); only if no explicit --agent token is present
fall back to finding the last infer-able agent index (as currently done) and
finally fall back to index 0.
- Around line 108-111: The health probe timeout currently discards any actual
server error by ignoring the result of running_server.stop().await; change the
error handling so that when wait_for_health(sidecar_url).await returns
Err(health_err), you call running_server.stop().await and if that returns
Err(server_err) return server_err (or otherwise wrap/preserve both, preferring
server_err) instead of always returning health_err; reference
RunningSidecar::start, wait_for_health, and running_server.stop().await to
locate where to capture and propagate the server-side error.
In `@crates/sidecar/tests/coverage/launcher_tests.rs`:
- Around line 223-252: The test mutates process cwd and currently restores it
with a trailing std::env::set_current_dir(previous) which fails to run on panic;
introduce a small RAII guard (e.g., CwdGuard) that saves the previous cwd on
creation and restores it in Drop, then replace the manual previous/save/restore
pattern in tests like cursor_patching_can_be_disabled (and the other listed
tests) by creating a CwdGuard after setting temp dir (use current_dir_lock() as
before) so the cwd is always restored even if the test panics or returns early;
ensure the guard is used in the scope where PreparedRun::new and assertions run.
In `@docs/integrate-frameworks/coding-agent-codex.md`:
- Around line 89-90: The provider display name currently uses "Nemo Flow"
casing; update the value for the TOML table [model_providers.nemo-flow-openai]
so the name key uses the official product capitalization "NeMo Flow" (e.g.,
change name = "Nemo Flow OpenAI" to name = "NeMo Flow OpenAI") to ensure
consistent product naming across docs.
- Around line 111-114: The docs currently reference deprecated keys
features.codex_hooks and codex_hooks; update the text so the transparent wrapper
sets the CLI override features.hooks=true (instead of features.codex_hooks) and
the persistent install writes .codex/config.toml with hooks = true (instead of
codex_hooks = true), while still describing that generated hook entries are
merged into .codex/hooks.json; update both occurrences to use the canonical
features.hooks and hooks configuration names and leave the hooks.json behavior
unchanged.
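The quote-aware splitting the launcher finding asks for is what `shell_words::split` or `shlex::split` provides; a minimal illustration, handling double quotes only, shows why whitespace-only splitting corrupts paths like `C:\Program Files\...`:

```rust
// Minimal sketch of quote-aware command splitting. Real code should use a
// crate such as shell_words or shlex; this handles double quotes only.
fn split_command(command: &str) -> Vec<String> {
    let mut words = Vec::new();
    let mut current = String::new();
    let mut in_quotes = false;
    for ch in command.chars() {
        match ch {
            '"' => in_quotes = !in_quotes,
            c if c.is_whitespace() && !in_quotes => {
                if !current.is_empty() {
                    words.push(std::mem::take(&mut current));
                }
            }
            c => current.push(c),
        }
    }
    if !current.is_empty() {
        words.push(current);
    }
    words
}
```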
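The explicit `--agent` search for the `insert_after_agent` finding can be sketched as an index calculation; the function name and return convention here are assumptions:

```rust
// Sketch: find the argv index at which generated flags should be spliced so
// they land after the explicit --agent selection, not after argv[0].
fn insert_index_after_agent(argv: &[String]) -> Option<usize> {
    argv.iter().enumerate().find_map(|(i, arg)| {
        if arg == "--agent" {
            Some((i + 2).min(argv.len())) // after the separate value token
        } else if arg.starts_with("--agent=") {
            Some(i + 1) // after the fused --agent=VALUE token
        } else {
            None
        }
    })
}
```

Only when this returns `None` would the launcher fall back to `CodingAgent::infer` over the argv tokens.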
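The RAII guard suggested for the launcher tests is a standard pattern; a sketch of a `CwdGuard` that restores the previous working directory in `Drop`, so it survives panics and early returns:

```rust
use std::path::{Path, PathBuf};

// RAII guard sketch: the saved working directory is restored on Drop,
// which runs during unwinding when a test assertion panics.
struct CwdGuard {
    previous: PathBuf,
}

impl CwdGuard {
    fn change_to(dir: &Path) -> std::io::Result<CwdGuard> {
        let previous = std::env::current_dir()?;
        std::env::set_current_dir(dir)?;
        Ok(CwdGuard { previous })
    }
}

impl Drop for CwdGuard {
    fn drop(&mut self) {
        // Best effort: ignore failure, since drop cannot return an error.
        let _ = std::env::set_current_dir(&self.previous);
    }
}
```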
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 7d4ca10f-58e2-4750-9c45-2c77d6195b69
📒 Files selected for processing (4)
crates/sidecar/src/launcher.rs
crates/sidecar/tests/coverage/launcher_tests.rs
docs/integrate-frameworks/coding-agent-codex.md
integrations/coding-agents/codex/README.md
📜 Review details
🧰 Additional context used
📓 Path-based instructions (30)
**/*.{md,rst,html,txt}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-brand-terminology.md)
**/*.{md,rst,html,txt}: Always spell `NVIDIA` in all caps. Do not use `Nvidia`, `nvidia`, `nVidia`, `nVIDIA`, or `NV`.
Use `an NVIDIA` before a noun because the name starts with an 'en' sound.
Do not add a registered trademark symbol after `NVIDIA` when referring to the company.
Use trademark symbols with product names only when the document type or legal guidance requires them.
Verify official capitalization, spacing, and hyphenation for product names.
Precede NVIDIA product names with `NVIDIA` on first mention when it is natural and accurate.
Do not rewrite product names for grammar or title-case rules.
Preserve third-party product names according to the owner's spelling.
Include the company name and full model qualifier on first use when it helps identify the model.
Preserve the official capitalization and punctuation of model names.
Use shorter family names only after the full name is established.
Spell out a term on first use and put the acronym in parentheses unless the acronym is widely understood by the intended audience.
Use the acronym on later mentions after it has been defined.
For long documents, reintroduce the full term if readers might lose context.
Form plurals of acronyms with `s`, not an apostrophe, such as `GPUs`.
In headings, common acronyms can remain abbreviated. Spell out the term in the first or second sentence of the body.
Common terms such as `CPU`, `GPU`, `PC`, `API`, and `UI` usually do not need to be spelled out for developer audiences.
Files:
integrations/coding-agents/codex/README.mddocs/integrate-frameworks/coding-agent-codex.md
**/*.{md,rst,html}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-brand-terminology.md)
Link the first mention of a product name when the destination helps the reader.
Files:
integrations/coding-agents/codex/README.mddocs/integrate-frameworks/coding-agent-codex.md
**/*.{md,rst,txt}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-guide.md)
**/*.{md,rst,txt}: Spell `NVIDIA` in all caps. Do not use `Nvidia`, `nvidia`, or `NV`.
Format commands, code elements, expressions, package names, file names, and paths as inline code.
Use descriptive link text. Avoid raw URLs and weak anchors such as 'here' or 'read more.'
Use title case consistently for technical documentation headings.
Introduce code blocks, lists, tables, and images with complete sentences.
Write procedures as imperative steps. Keep steps parallel and split long procedures into smaller tasks.
Prefer active voice, present tense, short sentences, contractions, and plain English.
Use `can` for possibility and reserve `may` for permission.
Use `after` for temporal relationships instead of `once`.
Prefer `refer to` over `see` when the wording points readers to another resource.
Avoid culture-specific idioms, unnecessary Latinisms, jokes, and marketing exaggeration in technical documentation.
Spell out months in body text, avoid ordinal dates, and use clear time zones.
Spell out whole numbers from zero through nine unless they are technical values, parameters, versions, or UI values.
Use numerals for 10 or greater and include commas in thousands.
Do not add trademark symbols to learning-oriented documentation unless the source, platform, or legal guidance explicitly requires them.
Do not add trademark symbols to NeMo Flow learning documentation by default.
Do not rewrite API names, package names, command flags, or code literals for style reasons.
Files:
`integrations/coding-agents/codex/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{md,markdown,rst}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-technical-docs.md)
**/*.{md,markdown,rst}: Use title case consistently in technical documentation headings
Avoid quotation marks, ampersands, and exclamation marks in headings
Keep product, event, research, and whitepaper names in their official title case
Use title case for table headers
Do not force social-media sentence case into technical docs
Use monospace formatting for code elements, commands, parameters, package names, and expressions
Use monospace formatting for directories, file names, and paths
Use angle brackets inside monospace for variables inside paths, such as `/home/<username>/.login`
Use quotation marks for error messages and strings in documentation
Use bold formatting for UI buttons, menus, fields, and labels in documentation
Use angle brackets between UI labels for menu paths, such as File > Save As
Use italics for new terms on first use in documentation
Use italics for publication titles in documentation
Use plain text formatting for keyboard shortcuts in documentation
Prefer `[NVIDIA/NeMo](link)` format for GitHub repository references over generic phrases like 'the GitHub repo'
Introduce every code block with a complete sentence
Do not make a code block complete the grammar of the previous sentence
Do not continue a sentence after a code block
Use syntax highlighting when the format supports it for code blocks
Avoid the word 'snippet' unless the surrounding docs already use it as a term of art
Keep inline method, function, and class references consistent with nearby docs, omitting empty parentheses for prose readability when no call is shown
Use descriptive anchor text that matches the destination title when possible for links
Avoid raw URLs in running text in documentation
Avoid generic link anchors such as 'here,' 'this page,' and 'read more' in documentation
Include the acronym in link text if a linked term includes an acronym
Do not link long sentences or multiple sentences in documentation
Avoid links that pull readers away from a procedure unles...
Files:
`integrations/coding-agents/codex/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{html,md}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Include SPDX license headers in HTML and Markdown files using HTML comment syntax
Files:
`integrations/coding-agents/codex/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/README.md
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Update relevant crate or package README when that surface changed
Relevant package or crate README.md files must be updated when examples or binding guidance changes
Files:
integrations/coding-agents/codex/README.md
**/*.md
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Run Markdown link checking via `lychee` for `README.md`, `CONTRIBUTING.md`, and `docs/` through pre-commit hooks
Files:
`integrations/coding-agents/codex/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{md,markdown,py,sh,bash,js,ts,java,cpp,go,rust}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Keep package names, repo references, and build commands current in documentation
Files:
`integrations/coding-agents/codex/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{md,markdown,py,sh,bash}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Keep stable user-facing wrappers at scripts/ root in docs and examples; only point at namespaced helper paths when documenting internal maintenance work
Files:
`integrations/coding-agents/codex/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/*.{md,markdown,py,sh,bash,js,ts,example}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Example commands must match current package names and paths
Files:
`integrations/coding-agents/codex/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
**/{integrations,integration,*-integration}/**
📄 CodeRabbit inference engine (.agents/skills/contribute-integration/SKILL.md)
**/{integrations,integration,*-integration}/**: Keep NeMo Flow optional in framework integrations
Preserve the framework's original behavior when NeMo Flow is absent
Wrap tool and LLM paths at the correct framework boundary
Integration pattern must follow `docs/integrate-frameworks/adding-scopes.md`
Files:
integrations/coding-agents/codex/README.md
**/*.{md,txt,rst}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/SKILL.md)
**/*.{md,txt,rst}: Ensure commands, package names, file paths, and APIs in documentation are correct and not stale; flag incorrect or outdated information as blocking issues
Ensure examples and procedures in documentation will execute successfully with current APIs and commands
Use consistent user-facing terminology throughout documentation that matches current repo terminology
Capitalize NVIDIA correctly in all documentation and public-facing text
Format code, commands, paths, and filenames as inline code (monospace) in documentation
Use descriptive anchor text for links instead of bare URLs or weak labels like 'here' in documentation
Prefer active voice, present tense, short sentences, and plain English in documentation
Structure documentation procedures as imperative steps that are easy to scan and not too long for a single sequence
Prefer 'after' instead of 'once' for temporal references in documentation
Use 'can' instead of 'may' when describing possibility (rather than permission) in documentation
Avoid ambiguous numeric dates and ordinal dates in documentation body text
Files:
`integrations/coding-agents/codex/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`
{README.md,docs/index.md,**/README.md}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/SKILL.md)
Update entry-point documentation (README.md, docs/index.md, package READMEs, binding-level source READMEs) whenever public behavior changes
Files:
integrations/coding-agents/codex/README.md
**/*.{rs,py,js,ts,go,sh,toml,yaml,yml,json,md,rst}
📄 CodeRabbit inference engine (AGENTS.md)
Keep SPDX headers on source, docs, scripts, and configuration files. The project is Apache-2.0.
Files:
`integrations/coding-agents/codex/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`, `crates/sidecar/src/launcher.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`
**/*.{py,js,ts,tsx,go,rs,md}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
Format changed files with the language-native formatter before the final lint/test pass
Files:
`integrations/coding-agents/codex/README.md`, `docs/integrate-frameworks/coding-agent-codex.md`, `crates/sidecar/src/launcher.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`
docs/**/*.md
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Run `./scripts/build-docs.sh` for documentation site changes
docs/**/*.md: Relevant getting-started or reference docs must be updated when examples change
Release-policy docs must point to GitHub Releases as the only release-history source of truth
docs/**/*.md: Use title case for headings in technical documentation
Introduce code blocks, tables, and lists with complete lead-in sentences in documentation
Files:
docs/integrate-frameworks/coding-agent-codex.md
{RELEASING.md,CHANGELOG.md,docs/**/*.md}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Keep release-process and release-notes guidance in repo-maintainer docs such as RELEASING.md, not as user-facing docs pages or CHANGELOG.md
Files:
docs/integrate-frameworks/coding-agent-codex.md
docs/integrate-frameworks/**/*.md
📄 CodeRabbit inference engine (.agents/skills/contribute-integration/SKILL.md)
Documentation must be updated if activation or usage of the integration changed
Files:
docs/integrate-frameworks/coding-agent-codex.md
{scripts/*.sh,docs/**/*.md}
📄 CodeRabbit inference engine (.agents/skills/contribute-integration/SKILL.md)
Use root `./scripts/*.sh` commands in docs and contributor guidance as documented, with implementations under `scripts/third-party/`
Files:
docs/integrate-frameworks/coding-agent-codex.md
{docs/**,examples/**,crates/adaptive/**,python/nemo_flow/**,go/nemo_flow/**,**/{example,component}.{ts,tsx,js,rs,py,go}}
📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)
Any new adaptive component kind must have documentation, examples, and binding coverage across all supported languages
Files:
docs/integrate-frameworks/coding-agent-codex.md
{README*,CHANGELOG*,docs/**/*.{md,rst,txt},examples/**/*,*.md}
📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)
Update documentation, examples, and getting-started guides with new package/module/crate names after rename operations
Files:
docs/integrate-frameworks/coding-agent-codex.md
{README.md,docs/**/*.md,examples/**/*.{js,ts,py,go,rs}}
📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)
Keep documentation and examples synchronized with current install, import, and build commands
Files:
docs/integrate-frameworks/coding-agent-codex.md
{README.md,CONTRIBUTING.md,docs/**/*.md}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
For docs-only changes, run targeted checks only if commands, package names, or examples changed. Use `just docs` for docs-site builds and `just docs-linkcheck` when links changed
Files:
docs/integrate-frameworks/coding-agent-codex.md
{docs/**,README.md,CONTRIBUTING.md,RELEASING.md,SECURITY.md}
⚙️ CodeRabbit configuration file
{docs/**,README.md,CONTRIBUTING.md,RELEASING.md,SECURITY.md}: Review documentation for technical accuracy against the current API, command correctness, and consistency across language bindings.
Flag stale examples, missing SPDX headers where required, and instructions that no longer match CI or pre-commit behavior.
Files:
docs/integrate-frameworks/coding-agent-codex.md
**/*.rs
📄 CodeRabbit inference engine (.agents/skills/test-ffi-surface/SKILL.md)
**/*.rs: Run `cargo fmt --all` for FFI work as it is Rust work
Run `just test-rust` for FFI validation
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce warnings-as-errors linting
When Rust files changed as part of Python work, run `cargo fmt --all`, `just test-rust`, and `cargo clippy --workspace --all-targets -- -D warnings`
**/*.rs: Run `cargo fmt --all` for Rust code formatting
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce Rust linting with no warnings
Run `just test-rust` as the shared-runtime build/test wrapper for Rust changes
Use Rust snake_case naming convention for Rust code
**/*.rs: Any Rust change must run `just test-rust`
Any Rust change must run `cargo fmt --all`
Any Rust change must run `cargo clippy --workspace --all-targets -- -D warnings`
**/*.rs: Run `cargo fmt --all` for Rust code formatting when Node changes touch Rust files
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce strict linting when Rust files changed as part of Node work
**/*.rs: Always run `just test-rust` when any Rust code changes
Always run `cargo fmt --all` when any Rust code changes
Always run `cargo clippy --workspace --all-targets -- -D warnings` when any Rust code changes
Files:
`crates/sidecar/src/launcher.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`
**/*.{rs,go,js,ts,tsx}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Include SPDX license headers in all Rust, Go, JavaScript, and TypeScript source files using C-style comment syntax
Files:
`crates/sidecar/src/launcher.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`
**/*.{rs,py,go,js,ts,tsx}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Use `SONAR_IGNORE_START`/`SONAR_IGNORE_END` markers only for documented false positives that cannot be resolved in code; keep ignored blocks small, add explanatory comments, and require reviewer sign-off
Files:
`crates/sidecar/src/launcher.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`
**/*.{js,ts,tsx,jsx,py,rs,go,java,c,cpp,h,cc,cxx,cs,rb,php,swift,kt}
📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)
Changed files must be formatted with the language-native formatter
Files:
`crates/sidecar/src/launcher.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`
**/*.{rs,py,js,ts,tsx,go}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
During iteration, prefer `uv run pre-commit run --files <changed files...>` for targeted validation
Files:
`crates/sidecar/src/launcher.rs`, `crates/sidecar/tests/coverage/launcher_tests.rs`
{crates/**/tests/**,python/tests/**,go/nemo_flow/**/*_test.go}
⚙️ CodeRabbit configuration file
{crates/**/tests/**,python/tests/**,go/nemo_flow/**/*_test.go}: Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant.
Prefer assertions on lifecycle events, scope stacks, middleware ordering, and binding parity over shallow smoke tests.
Files:
crates/sidecar/tests/coverage/launcher_tests.rs
🔇 Additional comments (2)
integrations/coding-agents/codex/README.md (2)
103-104: Use Official Product Capitalization in Provider Name
The user-facing provider label should use `NeMo Flow`, not `Nemo Flow`.
Suggested doc fix:
```diff
-name = "Nemo Flow OpenAI"
+name = "NeMo Flow OpenAI"
```
As per coding guidelines, "Verify official capitalization, spacing, and hyphenation for product names."
33-35: ⚡ Quick win
Documentation is correct as written.
The configuration key documented in lines 34–35 is `codex_hooks = true` under the `[features]` table in `.codex/config.toml`, which matches the canonical syntax for codex-cli >= 0.129.0. The suggested change to `features.hooks = true` is incorrect; that key does not exist.
```rust
let running_server = RunningSidecar::start(listener, sidecar_config);
if let Err(error) = wait_for_health(sidecar_url).await {
    let _ = running_server.stop().await;
    return Err(error);
```
There was a problem hiding this comment.
Surface the real sidecar startup failure.
If `serve_listener` exits immediately, the health probe times out and `let _ = running_server.stop().await;` discards the actual server error. That turns startup/config failures into a misleading readiness timeout.
Suggested fix:
```diff
 let running_server = RunningSidecar::start(listener, sidecar_config);
 if let Err(error) = wait_for_health(sidecar_url).await {
-    let _ = running_server.stop().await;
-    return Err(error);
+    return match running_server.stop().await {
+        Ok(()) => Err(error),
+        Err(server_error) => Err(SidecarError::Launch(format!(
+            "{server_error}; health check also failed: {error}"
+        ))),
+    };
 }
```
🤖 Prompt for AI Agents
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/src/launcher.rs` around lines 108 - 111, The health probe
timeout currently discards any actual server error by ignoring the result of
running_server.stop().await; change the error handling so that when
wait_for_health(sidecar_url).await returns Err(health_err), you call
running_server.stop().await and if that returns Err(server_err) return
server_err (or otherwise wrap/preserve both, preferring server_err) instead of
always returning health_err; reference RunningSidecar::start, wait_for_health,
and running_server.stop().await to locate where to capture and propagate the
server-side error.
```rust
// Splits a configured command string into argv words for run mode. This intentionally uses simple
// whitespace splitting because config command values are a convenience fallback; complex shell
// commands should be passed after `--` by the caller.
fn configured_command(agent: CodingAgent, agents: &AgentConfigs) -> Option<Vec<String>> {
    let command = match agent {
        CodingAgent::ClaudeCode => agents.claude_code.command.as_ref(),
        CodingAgent::Codex => agents.codex.command.as_ref(),
        CodingAgent::Cursor => agents.cursor.command.as_ref(),
        CodingAgent::Hermes => agents.hermes.command.as_ref(),
    }?;
    let argv: Vec<_> = command.split_whitespace().map(ToOwned::to_owned).collect();
    (!argv.is_empty()).then_some(argv)
}
```
There was a problem hiding this comment.
`split_whitespace()` breaks configured commands.
Quoted executables and arguments are corrupted here, so a configured command like `"C:\Program Files\Codex\codex.exe" --full-auto` becomes invalid argv. That makes the configured-command fallback unreliable on Windows and for wrapper commands.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/src/launcher.rs` around lines 168 - 179, The current
configured_command function uses command.split_whitespace(), which drops quotes
and corrupts quoted paths/args (e.g., "C:\Program Files\..."), so replace the
whitespace-only split with a quote-aware parser: call a shell-style splitter
(e.g., shell_words::split or shlex::split) on the command string inside
configured_command (refer to the configured_command function and
CodingAgent/AgentConfigs lookup), handle the Result (return None or log/error on
parse failure), and return the parsed Vec<String> only when non-empty; this
preserves quoted executables and arguments on Windows and wrapper commands.
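The fix above points at a shell-style splitter such as `shell_words::split` or `shlex::split`. As a dependency-free illustration of the behavior those crates provide, the following sketch handles double quotes only; the function name `split_command` and its unbalanced-quote handling are assumptions for illustration, not the crate API:

```rust
// Minimal quote-aware splitter: keeps whitespace inside double quotes
// together, flushes a token on unquoted whitespace, and rejects input
// with an unbalanced quote instead of guessing.
fn split_command(input: &str) -> Option<Vec<String>> {
    let mut argv = Vec::new();
    let mut current = String::new();
    let mut in_quotes = false;
    for ch in input.chars() {
        match ch {
            '"' => in_quotes = !in_quotes,
            c if c.is_whitespace() && !in_quotes => {
                if !current.is_empty() {
                    argv.push(std::mem::take(&mut current));
                }
            }
            c => current.push(c),
        }
    }
    if in_quotes {
        return None; // unbalanced quote: treat as a parse failure
    }
    if !current.is_empty() {
        argv.push(current);
    }
    (!argv.is_empty()).then_some(argv)
}

fn main() {
    let argv = split_command(r#""C:\Program Files\Codex\codex.exe" --full-auto"#).unwrap();
    assert_eq!(argv, vec![r"C:\Program Files\Codex\codex.exe", "--full-auto"]);
}
```

A real fix would likely reach for `shell_words` rather than hand-rolling this, but the shape of the return value matches the existing `Option<Vec<String>>` contract.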
```rust
// Inserts generated agent flags immediately after the last argv element that looks like the agent
// executable. Falling back to index 0 keeps wrapper commands usable by inserting after the first
// word when the agent cannot be found later in argv.
fn insert_after_agent(
    argv: &mut Vec<String>,
    agent: CodingAgent,
    args: impl IntoIterator<Item = String>,
) {
    let index = argv
        .iter()
        .enumerate()
        .filter_map(|(index, arg)| (CodingAgent::infer(arg) == Some(agent)).then_some(index))
        .next_back()
        .unwrap_or(0);
    argv.splice(index + 1..index + 1, args);
}
```
There was a problem hiding this comment.
Opaque wrappers do not actually receive agent flags correctly.
When `--agent` is explicit but no argv element is recognizable as that agent, this inserts Codex/Claude flags after `argv[0]`. That means `run --agent codex -- my-wrapper ...` configures the wrapper process, not Codex, unless the wrapper forwards arbitrary argv unchanged. The contract in the preceding comment is stronger than what this implementation can guarantee.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/src/launcher.rs` around lines 482 - 496, The current
insert_after_agent implementation can place agent flags after argv[0] when the
agent was specified via an explicit --agent flag but no argv element matches
CodingAgent::infer, causing wrappers to be configured instead of the agent;
update insert_after_agent (referencing argv, agent, and CodingAgent::infer) to
first search argv for an explicit --agent occurrence (both "--agent" followed by
a value and "--agent=VALUE" forms) and, if found, insert the generated args
immediately after the agent value token (or the same token for --agent=VALUE);
only if no explicit --agent token is present fall back to finding the last
infer-able agent index (as currently done) and finally fall back to index 0.
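The precedence described in the prompt can be sketched as a standalone function. Here `infer_agent` is a hypothetical stand-in for `CodingAgent::infer`, and the simplified inference table exists only to make the example self-contained:

```rust
// Hypothetical stand-in for CodingAgent::infer: maps an argv word to an agent id.
fn infer_agent(arg: &str) -> Option<&'static str> {
    match arg {
        "codex" | "codex.exe" => Some("codex"),
        "claude" => Some("claude-code"),
        _ => None,
    }
}

// Returns the index at which generated flags should be spliced in:
// 1. right after the value of an explicit `--agent <value>` or `--agent=<value>`,
// 2. else after the last argv word that infers to the agent,
// 3. else after the first word (the current fallback).
fn insertion_index(argv: &[String], agent: &str) -> usize {
    for (i, arg) in argv.iter().enumerate() {
        if arg == "--agent" && argv.get(i + 1).map(String::as_str) == Some(agent) {
            return i + 2;
        }
        if arg.strip_prefix("--agent=") == Some(agent) {
            return i + 1;
        }
    }
    argv.iter()
        .rposition(|arg| infer_agent(arg) == Some(agent))
        .map_or(1, |i| i + 1)
}

fn main() {
    let argv: Vec<String> = ["my-wrapper", "--agent", "codex", "--", "run"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    // Flags land after the explicit `--agent codex` pair, not after the wrapper.
    assert_eq!(insertion_index(&argv, "codex"), 3);
}
```

Whether flags placed there actually reach the agent still depends on the wrapper forwarding argv, so this narrows the gap rather than closing it.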
```rust
#[test]
fn cursor_patching_can_be_disabled() {
    let _guard = current_dir_lock().lock().unwrap();
    let temp = tempfile::tempdir().unwrap();
    let previous = std::env::current_dir().unwrap();
    std::env::set_current_dir(temp.path()).unwrap();
    let resolved = ResolvedConfig {
        sidecar: SidecarConfig::default(),
        agents: AgentConfigs {
            cursor: CursorAgentConfig {
                command: None,
                patch_restore_hooks: false,
            },
            ..AgentConfigs::default()
        },
    };

    let prepared = PreparedRun::new(
        CodingAgent::Cursor,
        vec!["cursor-agent".into()],
        "http://s",
        &resolved,
        false,
    )
    .unwrap();

    assert!(prepared.cursor_restore.is_none());
    assert!(!Path::new(".cursor/hooks.json").exists());
    std::env::set_current_dir(previous).unwrap();
}
```
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Restore cwd with Drop, not a trailing statement.
These tests mutate process-global cwd and only reset it on the happy path. Any panic before the final `set_current_dir(previous)` leaves later tests running in the wrong directory.
Suggested pattern:
```diff
+struct CwdGuard(PathBuf);
+
+impl Drop for CwdGuard {
+    fn drop(&mut self) {
+        let _ = std::env::set_current_dir(&self.0);
+    }
+}
+
 #[test]
 fn cursor_patching_can_be_disabled() {
     let _guard = current_dir_lock().lock().unwrap();
     let temp = tempfile::tempdir().unwrap();
-    let previous = std::env::current_dir().unwrap();
+    let _cwd_guard = CwdGuard(std::env::current_dir().unwrap());
     std::env::set_current_dir(temp.path()).unwrap();
     // ...
-    std::env::set_current_dir(previous).unwrap();
 }
```
As per coding guidelines, "Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant."
Also applies to: 313-351, 353-388, 390-413, 474-497
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/tests/coverage/launcher_tests.rs` around lines 223 - 252, The
test mutates process cwd and currently restores it with a trailing
std::env::set_current_dir(previous) which fails to run on panic; introduce a
small RAII guard (e.g., CwdGuard) that saves the previous cwd on creation and
restores it in Drop, then replace the manual previous/save/restore pattern in
tests like cursor_patching_can_be_disabled (and the other listed tests) by
creating a CwdGuard after setting temp dir (use current_dir_lock() as before) so
the cwd is always restored even if the test panics or returns early; ensure the
guard is used in the scope where PreparedRun::new and assertions run.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
crates/sidecar/src/gateway.rs (2)
545-549: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Stop recording `cookie` and generic API-key aliases.
`should_record_header` still allows secrets like `cookie`, `api-key`, and similar aliases into `LlmRequest.headers`, which then flow into downstream exports.
Proposed fix:
```diff
 fn should_record_header(name: &HeaderName) -> bool {
     should_forward_request_header(name)
         && name != http::header::AUTHORIZATION
+        && name != http::header::COOKIE
         && name.as_str() != "x-api-key"
+        && name.as_str() != "api-key"
+        && name.as_str() != "apikey"
         && name.as_str() != "anthropic-api-key"
 }
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/sidecar/src/gateway.rs` around lines 545 - 549, The should_record_header function currently permits secret headers like "cookie" and generic API-key aliases; update should_record_header (the function taking &HeaderName) to explicitly exclude cookie-related headers ("cookie" and "set-cookie") and common API-key variants ("api-key", "apikey", and any other vendor aliases such as "x-api-key" and "anthropic-api-key") and perform comparisons case-insensitively (e.g., use name.as_str().eq_ignore_ascii_case) while keeping the existing should_forward_request_header and the authorization check.
503-565: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Honor `Connection` header tokens when filtering headers.
The proxy still strips only the fixed hop-by-hop names. If a request or response carries `Connection: x-foo`, `x-foo` is still eligible for forwarding and recording because `is_hop_by_hop` never inspects the current header map. That breaks proxy semantics and can leak per-connection metadata.
Parse the message's `Connection` header into a per-request/per-response exclusion set and apply it in both the request and response filters.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/sidecar/src/gateway.rs` around lines 503 - 565, The code currently strips only fixed hop-by-hop header names; you must honor per-message Connection tokens by parsing the request/response HeaderMap's "connection" header into a set of token names and treating those tokens as hop-by-hop too. Update response_headers to first collect connection tokens from the provided HeaderMap and exclude any header whose name matches those tokens (in addition to calling is_hop_by_hop), and change the request-side predicates should_forward_request_header and should_record_header to accept either the HeaderMap or a precomputed set of connection token names so they can exclude those tokens as well; keep the existing fixed-name checks (HOST, CONTENT_LENGTH, ACCEPT_ENCODING, AUTHORIZATION, x-api-key, anthropic-api-key) and reuse or refactor is_hop_by_hop into a helper that only tests fixed names while combining it with the per-message token set when filtering.
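The per-message exclusion set the comment asks for can be derived roughly as follows. This is a standalone sketch with string-based names rather than the crate's `HeaderName`-based signatures; `connection_tokens` and this `is_hop_by_hop` variant are illustrative helpers, not existing API:

```rust
use std::collections::HashSet;

// Parse the `Connection` header value into a set of lowercase token names.
// Per RFC 9110 §7.6.1, each listed token names a header that is hop-by-hop
// for this message only. Field names are case-insensitive.
fn connection_tokens(connection_value: Option<&str>) -> HashSet<String> {
    connection_value
        .unwrap_or("")
        .split(',')
        .map(|token| token.trim().to_ascii_lowercase())
        .filter(|token| !token.is_empty())
        .collect()
}

// A header is hop-by-hop if it is one of the fixed names or is listed in
// this message's Connection header.
fn is_hop_by_hop(name: &str, connection: &HashSet<String>) -> bool {
    const FIXED: [&str; 7] = [
        "connection",
        "keep-alive",
        "proxy-authenticate",
        "proxy-authorization",
        "te",
        "transfer-encoding",
        "upgrade",
    ];
    let lower = name.to_ascii_lowercase();
    FIXED.contains(&lower.as_str()) || connection.contains(&lower)
}

fn main() {
    let tokens = connection_tokens(Some("keep-alive, X-Foo"));
    assert!(is_hop_by_hop("x-foo", &tokens));
    assert!(!is_hop_by_hop("Content-Type", &tokens));
}
```

Computing the set once per request/response and passing it into the forwarding and recording predicates keeps the fixed-name checks intact while honoring per-message tokens.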
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/sidecar/src/gateway.rs`:
- Around line 245-259: The code currently calls upstream_response.bytes() and
fully buffers the upstream body (see upstream_response.bytes(), the bytes
variable and response_json usage), which can OOM; replace this with a shared
bounded reader that enforces a maximum buffer size (e.g., a configured
MAX_BODY_BYTES) and choose an explicit overflow policy: either fail-fast by
returning a SidecarError::UpstreamOverflow and emitting end_llm(active, {
"error": "body_size_limit" }, { "http_status": status.as_u16(), "streaming":
false, "gateway_error": true, "stage": "body" }), or switch to passthrough
streaming by stopping buffering once the cap is hit and proxying the remaining
bytes directly to the client; implement the cap logic where bytes() is called
and also apply the same bounded-reader behavior to the other occurrence (around
lines referenced 375-378) so both non-streaming paths share the same
bounded-read helper and overflow handling.
---
Duplicate comments:
In `@crates/sidecar/src/gateway.rs`:
- Around line 545-549: The should_record_header function currently permits
secret headers like "cookie" and generic API-key aliases; update
should_record_header (the function taking &HeaderName) to explicitly exclude
cookie-related headers ("cookie" and "set-cookie") and common API-key variants
("api-key", "apikey", and any other vendor aliases such as "x-api-key" and
"anthropic-api-key") and perform comparisons case-insensitively (e.g., use
name.as_str().eq_ignore_ascii_case) while keeping the existing
should_forward_request_header and the authorization check.
- Around line 503-565: The code currently strips only fixed hop-by-hop header
names; you must honor per-message Connection tokens by parsing the
request/response HeaderMap's "connection" header into a set of token names and
treating those tokens as hop-by-hop too. Update response_headers to first
collect connection tokens from the provided HeaderMap and exclude any header
whose name matches those tokens (in addition to calling is_hop_by_hop), and
change the request-side predicates should_forward_request_header and
should_record_header to accept either the HeaderMap or a precomputed set of
connection token names so they can exclude those tokens as well; keep the
existing fixed-name checks (HOST, CONTENT_LENGTH, ACCEPT_ENCODING,
AUTHORIZATION, x-api-key, anthropic-api-key) and reuse or refactor is_hop_by_hop
into a helper that only tests fixed names while combining it with the
per-message token set when filtering.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: a870b963-f006-4ee5-8568-231bf1fa4f3e
📒 Files selected for processing (1)
crates/sidecar/src/gateway.rs
📜 Review details
🧰 Additional context used
📓 Path-based instructions (7)
**/*.rs
📄 CodeRabbit inference engine (.agents/skills/test-ffi-surface/SKILL.md)
**/*.rs: Run `cargo fmt --all` for FFI work as it is Rust work
Run `just test-rust` for FFI validation
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce warnings-as-errors linting
When Rust files changed as part of Python work, run `cargo fmt --all`, `just test-rust`, and `cargo clippy --workspace --all-targets -- -D warnings`
**/*.rs: Run `cargo fmt --all` for Rust code formatting
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce Rust linting with no warnings
Run `just test-rust` as the shared-runtime build/test wrapper for Rust changes
Use Rust snake_case naming convention for Rust code
**/*.rs: Any Rust change must run `just test-rust`
Any Rust change must run `cargo fmt --all`
Any Rust change must run `cargo clippy --workspace --all-targets -- -D warnings`
**/*.rs: Run `cargo fmt --all` for Rust code formatting when Node changes touch Rust files
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce strict linting when Rust files changed as part of Node work
**/*.rs: Always run `just test-rust` when any Rust code changes
Always run `cargo fmt --all` when any Rust code changes
Always run `cargo clippy --workspace --all-targets -- -D warnings` when any Rust code changes
Files:
crates/sidecar/src/gateway.rs
**/*.{rs,go,js,ts,tsx}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Include SPDX license headers in all Rust, Go, JavaScript, and TypeScript source files using C-style comment syntax
Files:
crates/sidecar/src/gateway.rs
**/*.{rs,py,go,js,ts,tsx}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Use `SONAR_IGNORE_START`/`SONAR_IGNORE_END` markers only for documented false positives that cannot be resolved in code; keep ignored blocks small, add explanatory comments, and require reviewer sign-off
Files:
crates/sidecar/src/gateway.rs
**/*.{js,ts,tsx,jsx,py,rs,go,java,c,cpp,h,cc,cxx,cs,rb,php,swift,kt}
📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)
Changed files must be formatted with the language-native formatter
Files:
crates/sidecar/src/gateway.rs
**/*.{rs,py,js,ts,go,sh,toml,yaml,yml,json,md,rst}
📄 CodeRabbit inference engine (AGENTS.md)
Keep SPDX headers on source, docs, scripts, and configuration files. The project is Apache-2.0.
Files:
crates/sidecar/src/gateway.rs
**/*.{py,js,ts,tsx,go,rs,md}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
Format changed files with the language-native formatter before the final lint/test pass
Files:
crates/sidecar/src/gateway.rs
**/*.{rs,py,js,ts,tsx,go}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
During iteration, prefer `uv run pre-commit run --files <changed files...>` for targeted validation
Files:
crates/sidecar/src/gateway.rs
```rust
let bytes = match upstream_response.bytes().await {
    Ok(bytes) => bytes,
    Err(error) => {
        sessions
            .end_llm(
                active,
                json!({ "error": error.to_string() }),
                json!({ "http_status": status.as_u16(), "streaming": false, "gateway_error": true, "stage": "body" }),
            )
            .await?;
        return Err(SidecarError::Upstream(error));
    }
};
let response_json = serde_json::from_slice::<Value>(&bytes)
    .unwrap_or_else(|_| json!({ "body_bytes": bytes.len() }));
```
Cap buffered upstream responses.
Both non-streaming paths call reqwest::Response::bytes() and materialize the entire upstream body in memory. A large or malicious upstream response can OOM the sidecar before you ever emit the end event or proxy the payload back.
Please put these paths behind a shared bounded reader and decide on an overflow behavior explicitly: fail fast, or stop recording once the cap is reached and switch to passthrough streaming for the client response.
Also applies to: 375-378
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/src/gateway.rs` around lines 245 - 259, The code currently
calls upstream_response.bytes() and fully buffers the upstream body (see
upstream_response.bytes(), the bytes variable and response_json usage), which
can OOM; replace this with a shared bounded reader that enforces a maximum
buffer size (e.g., a configured MAX_BODY_BYTES) and choose an explicit overflow
policy: either fail-fast by returning a SidecarError::UpstreamOverflow and
emitting end_llm(active, { "error": "body_size_limit" }, { "http_status":
status.as_u16(), "streaming": false, "gateway_error": true, "stage": "body" }),
or switch to passthrough streaming by stopping buffering once the cap is hit and
proxying the remaining bytes directly to the client; implement the cap logic
where bytes() is called and also apply the same bounded-reader behavior to the
other occurrence (around lines referenced 375-378) so both non-streaming paths
share the same bounded-read helper and overflow handling.
Updated the Codex configuration to replace the deprecated `features.codex_hooks` with `features.hooks`, requiring `codex-cli >= 0.129.0`. Adjusted related tests and documentation to reflect this change. Signed-off-by: Ajay Thorve <athorve@nvidia.com>
Without this, upstream providers honored the client's Accept-Encoding and returned gzip/br/zstd-encoded bodies. The bytes were forwarded to the client unchanged (which decoded normally), but the sidecar's observability capture stored the still-compressed bytes in `output.value` on LLM spans and in ATIF trajectory bodies, leaving both unreadable for any downstream consumer (Phoenix, ATIF, ATOF). Stripping the header forces upstreams to return identity-encoded bodies. The bandwidth cost is paid only on the sidecar-upstream hop; the client never asked for the encoding it would have received, so its decoders never trigger. Closes NMF-84. Signed-off-by: Ajay Thorve <athorve@nvidia.com>
Force-pushed 74189e8 to eb16758
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/sidecar/src/launcher.rs`:
- Around line 559-563: The toml_string function only escapes backslashes and
quotes, which can produce invalid TOML for values containing control characters
(newlines, tabs, etc.); update toml_string to produce correct TOML basic-string
escaping by either (1) replacing it with
toml::Value::String(value.to_string()).to_string() if the toml crate is
available, or (2) extend the escaping logic in toml_string to handle all
TOML-required escapes (e.g., \n, \t, \r, \uXXXX for control characters) so that
values like base_url always serialize into valid TOML.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: d340a09b-2be8-4446-841d-36be1689170c
📒 Files selected for processing (5)
crates/sidecar/src/gateway.rs, crates/sidecar/src/launcher.rs, crates/sidecar/tests/coverage/launcher_tests.rs, docs/integrate-frameworks/coding-agent-codex.md, integrations/coding-agents/codex/README.md
📜 Review details
🧰 Additional context used
📓 Path-based instructions (30)
**/*.{md,rst,html,txt}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-brand-terminology.md)
**/*.{md,rst,html,txt}: Always spell `NVIDIA` in all caps. Do not use `Nvidia`, `nvidia`, `nVidia`, `nVIDIA`, or `NV`.
Use `an NVIDIA` before a noun because the name starts with an 'en' sound.
Do not add a registered trademark symbol after `NVIDIA` when referring to the company.
Use trademark symbols with product names only when the document type or legal guidance requires them.
Verify official capitalization, spacing, and hyphenation for product names.
Precede NVIDIA product names with `NVIDIA` on first mention when it is natural and accurate.
Do not rewrite product names for grammar or title-case rules.
Preserve third-party product names according to the owner's spelling.
Include the company name and full model qualifier on first use when it helps identify the model.
Preserve the official capitalization and punctuation of model names.
Use shorter family names only after the full name is established.
Spell out a term on first use and put the acronym in parentheses unless the acronym is widely understood by the intended audience.
Use the acronym on later mentions after it has been defined.
For long documents, reintroduce the full term if readers might lose context.
Form plurals of acronyms with `s`, not an apostrophe, such as `GPUs`.
In headings, common acronyms can remain abbreviated. Spell out the term in the first or second sentence of the body.
Common terms such as `CPU`, `GPU`, `PC`, `API`, and `UI` usually do not need to be spelled out for developer audiences.
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md
**/*.{md,rst,html}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-brand-terminology.md)
Link the first mention of a product name when the destination helps the reader.
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md
**/*.{md,rst,txt}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-guide.md)
**/*.{md,rst,txt}: Spell `NVIDIA` in all caps. Do not use `Nvidia`, `nvidia`, or `NV`.
Format commands, code elements, expressions, package names, file names, and paths as inline code.
Use descriptive link text. Avoid raw URLs and weak anchors such as 'here' or 'read more.'
Use title case consistently for technical documentation headings.
Introduce code blocks, lists, tables, and images with complete sentences.
Write procedures as imperative steps. Keep steps parallel and split long procedures into smaller tasks.
Prefer active voice, present tense, short sentences, contractions, and plain English.
Use `can` for possibility and reserve `may` for permission.
Use `after` for temporal relationships instead of `once`.
Prefer `refer to` over `see` when the wording points readers to another resource.
Avoid culture-specific idioms, unnecessary Latinisms, jokes, and marketing exaggeration in technical documentation.
Spell out months in body text, avoid ordinal dates, and use clear time zones.
Spell out whole numbers from zero through nine unless they are technical values, parameters, versions, or UI values.
Use numerals for 10 or greater and include commas in thousands.
Do not add trademark symbols to learning-oriented documentation unless the source, platform, or legal guidance explicitly requires them.
Do not add trademark symbols to NeMo Flow learning documentation by default.
Do not rewrite API names, package names, command flags, or code literals for style reasons.
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md
**/*.{md,markdown,rst}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-technical-docs.md)
**/*.{md,markdown,rst}: Use title case consistently in technical documentation headings
Avoid quotation marks, ampersands, and exclamation marks in headings
Keep product, event, research, and whitepaper names in their official title case
Use title case for table headers
Do not force social-media sentence case into technical docs
Use monospace formatting for code elements, commands, parameters, package names, and expressions
Use monospace formatting for directories, file names, and paths
Use angle brackets inside monospace for variables inside paths, such as `/home/<username>/.login`
Use quotation marks for error messages and strings in documentation
Use bold formatting for UI buttons, menus, fields, and labels in documentation
Use angle brackets between UI labels for menu paths, such as File > Save As
Use italics for new terms on first use in documentation
Use italics for publication titles in documentation
Use plain text formatting for keyboard shortcuts in documentation
Prefer `[NVIDIA/NeMo](link)` format for GitHub repository references over generic phrases like 'the GitHub repo'
Introduce every code block with a complete sentence
Do not make a code block complete the grammar of the previous sentence
Do not continue a sentence after a code block
Use syntax highlighting when the format supports it for code blocks
Avoid the word 'snippet' unless the surrounding docs already use it as a term of art
Keep inline method, function, and class references consistent with nearby docs, omitting empty parentheses for prose readability when no call is shown
Use descriptive anchor text that matches the destination title when possible for links
Avoid raw URLs in running text in documentation
Avoid generic link anchors such as 'here,' 'this page,' and 'read more' in documentation
Include the acronym in link text if a linked term includes an acronym
Do not link long sentences or multiple sentences in documentation
Avoid links that pull readers away from a procedure unles...
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md
**/*.{html,md}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Include SPDX license headers in HTML and Markdown files using HTML comment syntax
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md
**/README.md
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Update relevant crate or package README when that surface changed
Relevant package or crate README.md files must be updated when examples or binding guidance changes
Files:
integrations/coding-agents/codex/README.md
**/*.md
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Run Markdown link checking via `lychee` for `README.md`, `CONTRIBUTING.md`, and `docs/` through pre-commit hooks
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md
**/*.{md,markdown,py,sh,bash,js,ts,java,cpp,go,rust}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Keep package names, repo references, and build commands current in documentation
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md
**/*.{md,markdown,py,sh,bash}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Keep stable user-facing wrappers at scripts/ root in docs and examples; only point at namespaced helper paths when documenting internal maintenance work
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md
**/*.{md,markdown,py,sh,bash,js,ts,example}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Example commands must match current package names and paths
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md
**/{integrations,integration,*-integration}/**
📄 CodeRabbit inference engine (.agents/skills/contribute-integration/SKILL.md)
**/{integrations,integration,*-integration}/**: Keep NeMo Flow optional in framework integrations
Preserve the framework's original behavior when NeMo Flow is absent
Wrap tool and LLM paths at the correct framework boundary
Integration pattern must follow `docs/integrate-frameworks/adding-scopes.md`
Files:
integrations/coding-agents/codex/README.md
**/*.{md,txt,rst}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/SKILL.md)
**/*.{md,txt,rst}: Ensure commands, package names, file paths, and APIs in documentation are correct and not stale; flag incorrect or outdated information as blocking issues
Ensure examples and procedures in documentation will execute successfully with current APIs and commands
Use consistent user-facing terminology throughout documentation that matches current repo terminology
Capitalize NVIDIA correctly in all documentation and public-facing text
Format code, commands, paths, and filenames as inline code (monospace) in documentation
Use descriptive anchor text for links instead of bare URLs or weak labels like 'here' in documentation
Prefer active voice, present tense, short sentences, and plain English in documentation
Structure documentation procedures as imperative steps that are easy to scan and not too long for a single sequence
Prefer 'after' instead of 'once' for temporal references in documentation
Use 'can' instead of 'may' when describing possibility (rather than permission) in documentation
Avoid ambiguous numeric dates and ordinal dates in documentation body text
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md
{README.md,docs/index.md,**/README.md}
📄 CodeRabbit inference engine (.agents/skills/review-doc-style/SKILL.md)
Update entry-point documentation (README.md, docs/index.md, package READMEs, binding-level source READMEs) whenever public behavior changes
Files:
integrations/coding-agents/codex/README.md
**/*.{rs,py,js,ts,go,sh,toml,yaml,yml,json,md,rst}
📄 CodeRabbit inference engine (AGENTS.md)
Keep SPDX headers on source, docs, scripts, and configuration files. The project is Apache-2.0.
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md, crates/sidecar/src/gateway.rs, crates/sidecar/tests/coverage/launcher_tests.rs, crates/sidecar/src/launcher.rs
**/*.{py,js,ts,tsx,go,rs,md}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
Format changed files with the language-native formatter before the final lint/test pass
Files:
integrations/coding-agents/codex/README.md, docs/integrate-frameworks/coding-agent-codex.md, crates/sidecar/src/gateway.rs, crates/sidecar/tests/coverage/launcher_tests.rs, crates/sidecar/src/launcher.rs
docs/**/*.md
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Run `./scripts/build-docs.sh` for documentation site changes
docs/**/*.md: Relevant getting-started or reference docs must be updated when examples change
Release-policy docs must point to GitHub Releases as the only release-history source of truth
docs/**/*.md: Use title case for headings in technical documentation
Introduce code blocks, tables, and lists with complete lead-in sentences in documentation
Files:
docs/integrate-frameworks/coding-agent-codex.md
{RELEASING.md,CHANGELOG.md,docs/**/*.md}
📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)
Keep release-process and release-notes guidance in repo-maintainer docs such as RELEASING.md, not as user-facing docs pages or CHANGELOG.md
Files:
docs/integrate-frameworks/coding-agent-codex.md
docs/integrate-frameworks/**/*.md
📄 CodeRabbit inference engine (.agents/skills/contribute-integration/SKILL.md)
Documentation must be updated if activation or usage of the integration changed
Files:
docs/integrate-frameworks/coding-agent-codex.md
{scripts/*.sh,docs/**/*.md}
📄 CodeRabbit inference engine (.agents/skills/contribute-integration/SKILL.md)
Use root `./scripts/*.sh` commands in docs and contributor guidance as documented, with implementations under `scripts/third-party/`
Files:
docs/integrate-frameworks/coding-agent-codex.md
{docs/**,examples/**,crates/adaptive/**,python/nemo_flow/**,go/nemo_flow/**,**/{example,component}.{ts,tsx,js,rs,py,go}}
📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)
Any new adaptive component kind must have documentation, examples, and binding coverage across all supported languages
Files:
docs/integrate-frameworks/coding-agent-codex.md
{README*,CHANGELOG*,docs/**/*.{md,rst,txt},examples/**/*,*.md}
📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)
Update documentation, examples, and getting-started guides with new package/module/crate names after rename operations
Files:
docs/integrate-frameworks/coding-agent-codex.md
{README.md,docs/**/*.md,examples/**/*.{js,ts,py,go,rs}}
📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)
Keep documentation and examples synchronized with current install, import, and build commands
Files:
docs/integrate-frameworks/coding-agent-codex.md
{README.md,CONTRIBUTING.md,docs/**/*.md}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
For docs-only changes, run targeted checks only if commands, package names, or examples changed. Use `just docs` for docs-site builds and `just docs-linkcheck` when links changed
Files:
docs/integrate-frameworks/coding-agent-codex.md
{docs/**,README.md,CONTRIBUTING.md,RELEASING.md,SECURITY.md}
⚙️ CodeRabbit configuration file
{docs/**,README.md,CONTRIBUTING.md,RELEASING.md,SECURITY.md}: Review documentation for technical accuracy against the current API, command correctness, and consistency across language bindings.
Flag stale examples, missing SPDX headers where required, and instructions that no longer match CI or pre-commit behavior.
Files:
docs/integrate-frameworks/coding-agent-codex.md
**/*.rs
📄 CodeRabbit inference engine (.agents/skills/test-ffi-surface/SKILL.md)
**/*.rs: Run `cargo fmt --all` for FFI work as it is Rust work
Run `just test-rust` for FFI validation
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce warnings-as-errors linting. When Rust files changed as part of Python work, run `cargo fmt --all`, `just test-rust`, and `cargo clippy --workspace --all-targets -- -D warnings`
**/*.rs: Run `cargo fmt --all` for Rust code formatting
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce Rust linting with no warnings
Run `just test-rust` as the shared-runtime build/test wrapper for Rust changes. Use Rust snake_case naming convention for Rust code
**/*.rs: Any Rust change must run `just test-rust`
Any Rust change must run `cargo fmt --all`
Any Rust change must run `cargo clippy --workspace --all-targets -- -D warnings`
**/*.rs: Run `cargo fmt --all` for Rust code formatting when Node changes touch Rust files
Run `cargo clippy --workspace --all-targets -- -D warnings` to enforce strict linting when Rust files changed as part of Node work
**/*.rs: Always run `just test-rust` when any Rust code changes
Always run `cargo fmt --all` when any Rust code changes
Always run `cargo clippy --workspace --all-targets -- -D warnings` when any Rust code changes
Files:
crates/sidecar/src/gateway.rs, crates/sidecar/tests/coverage/launcher_tests.rs, crates/sidecar/src/launcher.rs
**/*.{rs,go,js,ts,tsx}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Include SPDX license headers in all Rust, Go, JavaScript, and TypeScript source files using C-style comment syntax
Files:
crates/sidecar/src/gateway.rs, crates/sidecar/tests/coverage/launcher_tests.rs, crates/sidecar/src/launcher.rs
**/*.{rs,py,go,js,ts,tsx}
📄 CodeRabbit inference engine (CONTRIBUTING.md)
Use `SONAR_IGNORE_START`/`SONAR_IGNORE_END` markers only for documented false positives that cannot be resolved in code; keep ignored blocks small, add explanatory comments, and require reviewer sign-off
Files:
crates/sidecar/src/gateway.rs, crates/sidecar/tests/coverage/launcher_tests.rs, crates/sidecar/src/launcher.rs
**/*.{js,ts,tsx,jsx,py,rs,go,java,c,cpp,h,cc,cxx,cs,rb,php,swift,kt}
📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)
Changed files must be formatted with the language-native formatter
Files:
crates/sidecar/src/gateway.rs, crates/sidecar/tests/coverage/launcher_tests.rs, crates/sidecar/src/launcher.rs
**/*.{rs,py,js,ts,tsx,go}
📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)
During iteration, prefer `uv run pre-commit run --files <changed files...>` for targeted validation
Files:
crates/sidecar/src/gateway.rs, crates/sidecar/tests/coverage/launcher_tests.rs, crates/sidecar/src/launcher.rs
{crates/**/tests/**,python/tests/**,go/nemo_flow/**/*_test.go}
⚙️ CodeRabbit configuration file
{crates/**/tests/**,python/tests/**,go/nemo_flow/**/*_test.go}: Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant.
Prefer assertions on lifecycle events, scope stacks, middleware ordering, and binding parity over shallow smoke tests.
Files:
crates/sidecar/tests/coverage/launcher_tests.rs
🔇 Additional comments (12)
crates/sidecar/src/gateway.rs (4)
245-246: Upstream response bodies are still fully buffered without a cap. Line 245 and Line 378 still call `upstream_response.bytes().await`, which can OOM on large/malicious upstream payloads. This was previously raised and remains unresolved. Also applies to: 378-378
503-510: Connection-token hop-by-hop headers are not stripped. Filtering only fixed hop-by-hop names misses headers listed dynamically in `Connection`, so those can still be forwarded/returned. This was previously raised and still applies. Also applies to: 530-532, 554-565
545-550: Sensitive header recording filter is still incomplete. `should_record_header` still allows values like `cookie` and generic API-key aliases to be recorded. This was already flagged and is still unresolved.
16-17: Good fix: request-body reads are now bounded. Using `MAX_BODY_BYTES` with `to_bytes` closes the prior unbounded-request-body risk cleanly. Also applies to: 68-70
crates/sidecar/src/launcher.rs (3)
475-480: Product name miscapitalized in user-facing Codex provider config. Line 477 uses `"Nemo Flow OpenAI"` but the product name is `"NeMo Flow OpenAI"` (capital M).

```diff
- "model_providers.nemo-flow-openai={{name=\"Nemo Flow OpenAI\",base_url={},wire_api=\"responses\",requires_openai_auth=true,supports_websockets=false}}",
+ "model_providers.nemo-flow-openai={{name=\"NeMo Flow OpenAI\",base_url={},wire_api=\"responses\",requires_openai_auth=true,supports_websockets=false}}",
```
108-111: Server startup error discarded on health probe timeout. When `wait_for_health` fails, `let _ = running_server.stop().await` discards any real server error, turning startup/config failures into a misleading readiness timeout message.

Suggested fix:

```diff
 let running_server = RunningSidecar::start(listener, sidecar_config);
 if let Err(error) = wait_for_health(sidecar_url).await {
-    let _ = running_server.stop().await;
-    return Err(error);
+    return match running_server.stop().await {
+        Ok(()) => Err(error),
+        Err(server_error) => Err(SidecarError::Launch(format!(
+            "{server_error}; health check also failed: {error}"
+        ))),
+    };
 }
```
168-180: `split_whitespace()` breaks quoted paths in configured commands. Paths with spaces (e.g., `"C:\Program Files\Codex\codex.exe" --full-auto`) will be corrupted. The comment acknowledges this as a convenience tradeoff, but consider using a shell-style splitter like `shlex::split` for cross-platform robustness.

integrations/coding-agents/codex/README.md (1)
103-104: Product name miscapitalized. Line 104 uses `"Nemo Flow OpenAI"` but the product name is `"NeMo Flow OpenAI"`.

```diff
-name = "Nemo Flow OpenAI"
+name = "NeMo Flow OpenAI"
```

docs/integrate-frameworks/coding-agent-codex.md (2)
89-90: Product name miscapitalized. Line 90 uses `"Nemo Flow OpenAI"` but the product name is `"NeMo Flow OpenAI"`.

```diff
-name = "Nemo Flow OpenAI"
+name = "NeMo Flow OpenAI"
```
111-114: Documentation references deprecated configuration keys. Lines 112-113 document `features.codex_hooks=true` and `codex_hooks = true`, but the implementation at `launcher.rs:321` uses the canonical `features.hooks=true`. This inconsistency will confuse users.

Suggested fix:

```diff
 The transparent wrapper passes hook entries as Codex CLI config overrides and
-sets `features.codex_hooks=true` for that launched process. Persistent install
-writes `.codex/config.toml` with `codex_hooks = true` and merges generated hook
+sets `features.hooks=true` for that launched process. Persistent install
+writes `.codex/config.toml` with `hooks = true` and merges generated hook
 entries into `.codex/hooks.json`.
```

crates/sidecar/tests/coverage/launcher_tests.rs (2)
223-252: Process cwd restoration is not panic-safe. Tests that mutate the process working directory (lines 227-228, 251) restore it with a trailing `set_current_dir(previous)`. If the test panics before reaching line 251, subsequent tests run in the wrong directory. Use a RAII guard pattern.

Suggested pattern:

```rust
struct CwdGuard(PathBuf);

impl Drop for CwdGuard {
    fn drop(&mut self) {
        let _ = std::env::set_current_dir(&self.0);
    }
}

#[test]
fn cursor_patching_can_be_disabled() {
    let _guard = current_dir_lock().lock().unwrap();
    let temp = tempfile::tempdir().unwrap();
    let _cwd_guard = CwdGuard(std::env::current_dir().unwrap());
    std::env::set_current_dir(temp.path()).unwrap();
    // ... test body ...
    // No manual restore needed; CwdGuard::drop handles it
}
```

This pattern should be applied to all affected tests: `cursor_patching_can_be_disabled`, `cursor_patch_restore_restores_original_file`, `cursor_patch_restore_uses_nearest_project_cursor_dir`, `cursor_patch_restore_removes_temporary_file`, and `cursor_dry_run_does_not_write_hooks`.
1-593: LGTM on test coverage. The test suite comprehensively covers agent inference, configured command resolution, agent-specific preparation (Codex, Claude, Hermes, Cursor), cursor hook patch/restore scenarios, dry-run behavior, and async runtime behavior including sidecar URL injection and exit code propagation. The OS-specific fake agent commands are well-structured for cross-platform testing.
```rust
// Escapes a Rust string as a TOML basic string for inline Codex configuration values.
fn toml_string(value: &str) -> String {
    let escaped = value.replace('\\', "\\\\").replace('"', "\\\"");
    format!("\"{escaped}\"")
}
```
🧹 Nitpick | 🔵 Trivial | 💤 Low value
TOML string escaping is minimal.
toml_string escapes only \ and ". TOML basic strings also require escaping control characters (newlines, tabs, etc.). While unlikely in URLs, a base_url containing such characters would produce invalid TOML.
Consider using the toml crate for serialization
If the toml crate is already a dependency, use its serialization to guarantee correctness:
fn toml_string(value: &str) -> String {
toml::Value::String(value.to_string()).to_string()
}Otherwise, the current implementation is acceptable for typical URL values.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/sidecar/src/launcher.rs` around lines 559 - 563, The toml_string
function only escapes backslashes and quotes, which can produce invalid TOML for
values containing control characters (newlines, tabs, etc.); update toml_string
to produce correct TOML basic-string escaping by either (1) replacing it with
toml::Value::String(value.to_string()).to_string() if the toml crate is
available, or (2) extend the escaping logic in toml_string to handle all
TOML-required escapes (e.g., \n, \t, \r, \uXXXX for control characters) so that
values like base_url always serialize into valid TOML.
…odex)
Three independent issues silently disabled transparent-run hooks for
Claude Code; this commit fixes all three:
1. Hook subprocesses couldn't find the sidecar binary on PATH.
`hook_forward_command` returned the bare name `nemo-flow-sidecar`,
which Claude/Codex/Cursor passed to their hook subprocesses. Those
subprocesses inherit the agent's environment, where the sidecar's
install location (e.g. target/debug) typically isn't on PATH, so
every hook exited with status 127 (command not found). Add an
`executable` parameter so transparent-run callers can pass
`current_exe()`; persistent install keeps the bare name.
2. Claude Code 2.1.x's hook loader strict-rejects unknown event names —
any single unknown event causes the ENTIRE hooks file to be rejected
silently, disabling all hooks. We were injecting
`AfterAgentResponse` and `AfterAgentThought`, neither in Claude's
whitelist. Drop both. Also add `PermissionRequest` and `PostCompact`,
which are in Claude's whitelist and in Codex 0.129's event surface.
3. Claude's hook output validator rejects null for optional string
fields. The Claude adapter's Stop response carried `stopReason: null`
and was rejected with a runtime validation error. Omit the field
entirely; the response collapses to the generic `{ continue: true }`.
Also regenerate the static integration bundles for claude-code and
codex with the corrected event list. The same whitelist mismatch broke
persistent install for Claude users.
Verified end-to-end against Claude Code 2.1.137:
- SessionStart, UserPromptSubmit, PreToolUse:WebSearch,
PostToolUse:WebSearch, Stop, SubagentStop all fire successfully
- Phoenix shows [AGENT] claude-code root span with tool spans
- ATIF written on session-end (1.7 MB / 1.3 MB across two sessions)
Codex 0.129 lifecycle (SessionStart through PreToolUse/PostToolUse) now
also fires correctly — Phoenix shows [AGENT] codex root with Bash tool
spans. Codex transparent-run ATIF flush still pending: codex 0.129 has
no SessionEnd-equivalent hook, so end_agent never runs without a
wrapper-driven flush. Tracking that piece separately.
Signed-off-by: Ajay Thorve <athorve@nvidia.com>
…gateway" Previously, when a real agent's gateway request beat its SessionStart hook, `SessionManager::start_llm` created the session with `AgentKind::Gateway`. Observer identities (ATIF agent name, Phoenix scope name) are baked at scope-open time inside `ensure_agent_started`, so the session stayed labeled "gateway" for its entire lifetime even after SessionStart arrived. Result: Phoenix root spans for real Claude/Codex sessions read as `[AGENT] gateway` instead of `[AGENT] claude-code` / `[AGENT] codex`.

Two targeted fixes:

1. In `SessionManager::start_llm`, infer the agent kind from the gateway provider when creating a new session: `anthropic.messages` and `anthropic.count_tokens` → `ClaudeCode`; `openai.responses` → `Codex`. Unknown providers (e.g., `openai.chat_completions`, `openai.models`) keep the legacy `Gateway` label so synthetic gateway-only sessions for unattributed proxy traffic remain semantically intact.

2. In `SessionManager::apply_events`, when an `AgentStarted` event arrives for an existing session whose `agent_kind` is still `Gateway` (i.e., the gateway path created the session before the hook), upgrade the session's `agent_kind` to the event's real kind. This fixes session-level metadata for the rare case where provider inference fell back to `Gateway` (e.g., chat-completions traffic) but a SessionStart hook later identifies the agent.

Tests cover all three branches: anthropic gateway → claude-code label, openai.responses gateway → codex label, ambiguous provider → gateway label preserved.

Signed-off-by: Ajay Thorve <athorve@nvidia.com>
…nd hook flag Three small fixes from CodeRabbit's pass on PR NVIDIA#63:

1. `Nemo Flow OpenAI` → `NeMo Flow OpenAI` in the codex transparent-run provider config (`launcher.rs:477`) and matching docs in `docs/integrate-frameworks/coding-agent-codex.md` and `integrations/coding-agents/codex/README.md`. Use the official NeMo Flow capitalization in user-facing TOML config and docs.

2. Update codex hook configuration documentation to use the canonical `features.hooks` / `hooks = true` keys instead of the legacy `features.codex_hooks` / `codex_hooks = true`. The legacy spelling is still accepted as an alias on codex 0.129+ but the canonical form is what new users should configure. Retain a one-line note that the old spelling is the legacy alias so users on older codex still understand the deprecation path.

Resolves CodeRabbit comments 3211156569, 3211760910, 3211760919, 3211156599. No code behavior changes; 128 sidecar tests still pass.

Signed-off-by: Ajay Thorve <athorve@nvidia.com>
Codex 0.129's hook surface has no `SessionEnd`-equivalent event — its closest session-termination signal, `Stop`, fires per-conversation-turn rather than at session end. As a result, codex transparent runs through `nemo-flow-sidecar run -- codex` produced empty `.nemo-flow/atif/` directories: the existing `flush_observers` path (`end_agent` → `write_atif`) never fired.

Add a new `NormalizedEvent::TurnEnded` variant. When `classify` sees a `Stop` hook AND the primary classification is not already terminal, it emits both the existing `LlmHint` (preserving correlation behavior for subsequent LLM calls) AND a `TurnEnded` event. `Session::apply` routes `TurnEnded` to a new `snapshot_atif` method that writes the cumulative ATIF without closing the agent scope or shutting down the OpenInference subscriber. The next turn keeps adding events; the final Stop's snapshot is the complete trajectory.

The `is_terminal()` guard is critical: Cursor's adapter classifies `stop` as `AgentEnded`, so without the guard we'd emit `[AgentEnded, TurnEnded]` — the first call tears down the session and writes ATIF, the second creates an empty synthetic session and overwrites the freshly-written ATIF with an empty trajectory.

Safety verified by reading `AtifExporter`: `export()` takes `&self`, documented as "does not clear the buffered events." Multiple snapshot writes are cumulative supersets, so last-write-wins yields the complete trajectory.

E2E verified against codex 0.129:
- Real codex session writes a 1.6 MB ATIF on Ctrl+D exit (was 0 bytes before)
- Multi-turn captured (5 steps: 2 user, 2 agent, 1 system)
- `agent.name = "codex"` (from earlier provider-inference fix)
- Phoenix root remains `[AGENT] codex` with proper nested LLM/tool spans

Hermes is unaffected — it has no `Stop` hook event.

Partially closes NMF-91.
The remaining cross-process subagent unification gap in NMF-91 stays upstream-blocked: codex spawns subagents as separate codex processes with no `SubagentStart` hook, so each subagent's gateway-only synthetic session is still its own Phoenix trace and its own ATIF (or no ATIF when no Stop fires for it).

Tests:
- adapter: codex Stop emits TurnEnded
- adapter: claude Stop emits TurnEnded
- adapter: cursor stop does NOT also emit TurnEnded (regression: would wipe ATIF)
- session: TurnEnded snapshots ATIF without closing scope; cumulative across turns
- session: TurnEnded is a no-op when the session has no agent scope

Signed-off-by: Ajay Thorve <athorve@nvidia.com>
Defines the streaming counterpart to LlmResponseCodec, used with
llm_stream_call_execute. Each provider's StreamingCodec impl owns the
incremental state needed to assemble per-chunk events into a single
non-streaming-shape JSON payload, which the matching LlmResponseCodec
then decodes into AnnotatedLlmResponse.
Per Will's design suggestion in NMF-90:
trait StreamingCodec {
    fn collector(&self) -> LlmCollectorFn;
    fn finalizer(&self) -> LlmFinalizerFn;
}
Includes an SseEventDecoder utility that incrementally splits a
text/event-stream byte stream into parsed SSE events. Anthropic
Messages, OpenAI Responses, and OpenAI Chat Completions all emit one
JSON object per data: line, so the decoder is shared infrastructure
for the per-provider StreamingCodec impls landing in follow-up commits.
Refs NMF-90.
Signed-off-by: Ajay Thorve <athorve@nvidia.com>
Implements StreamingCodec for the Anthropic Messages SSE event sequence.

The collector observes message_start, content_block_start/delta/stop, message_delta, and message_stop events, accumulating per-block state (text, tool input partial_json, citations) keyed by content_block index. The finalizer produces a JSON object wire-compatible with RawAnthropicResponse — the same shape Anthropic returns for a non-streaming Messages request — so the assembled output can be fed back through AnthropicMessagesCodec::decode_response to produce the canonical AnnotatedLlmResponse. Streaming and non-streaming Anthropic requests now converge on the same observability output without per-route shape duplication.

Internal state lives behind Arc<Mutex<...>> so the &self-produced collector and finalizer closures share access. Each instance is single-use because LlmFinalizerFn consumes the finalize step.

Tests cover:
- text-only response with usage and stop_reason mapping
- tool_use input assembled from partial_json fragments
- web_search_tool_result blocks (full content delivered at start)
- text blocks with citations_delta accumulation
- truncated streams falling back to raw partial_json string

Refs NMF-90.

Signed-off-by: Ajay Thorve <athorve@nvidia.com>
Implements StreamingCodec for the OpenAI Responses SSE event sequence.

The Responses API streams either complete `response` snapshots (response.created, response.in_progress, response.completed, response.failed, response.incomplete) or final-state output items (response.output_item.done), so the codec:

1. Tracks the latest response snapshot — terminal events typically carry the complete state including `output`.
2. Tracks output items by output_index from output_item.done events, used as a fallback when the terminal snapshot lacks `output`.
3. Ignores per-token deltas (output_text.delta, function_call_arguments.delta) because their content is redelivered in the matching output_item.done.

The finalizer produces a JSON object wire-compatible with RawResponsesResponse — feed it back through OpenAIResponsesCodec::decode_response to produce the canonical AnnotatedLlmResponse.

Tests cover:
- terminal response.completed carries full state (common case)
- terminal completed lacks output → fall back to output_item.done items
- response.incomplete with reason maps to FinishReason::Length
- per-token deltas don't double-accumulate

Refs NMF-90.

Signed-off-by: Ajay Thorve <athorve@nvidia.com>
Implements StreamingCodec for the OpenAI Chat Completions SSE chunk sequence.

Each delta chunk may carry role (typically only the first), incremental content fragments, partial tool_calls whose function.arguments stream as a string, or a finish_reason terminator. Top-level fields (id, object, created, model) are repeated on every chunk; we capture them once. Final-chunk usage is preserved when emitted (only sent when stream_options.include_usage is set).

The finalizer produces a JSON object wire-compatible with a non-streaming Chat Completions response — feed it back through OpenAIChatCodec::decode_response to produce the canonical AnnotatedLlmResponse. The `object` field is normalized from `chat.completion.chunk` to `chat.completion` so the assembled JSON matches what a non-streaming response would carry.

Also: SseEventDecoder now drops the `data: [DONE]` end-of-stream sentinel that OpenAI Chat Completions emits. It's a wire-level marker, not a JSON payload, and would otherwise fail JSON parsing.

Tests cover:
- text-only response with usage and finish_reason mapping
- tool_call arguments assembled from string fragments
- content: null preserved when only tool calls streamed
- multiple choices (n > 1 requests) tracked separately
- intermediate null usage chunks don't overwrite real usage

Refs NMF-90.

Signed-off-by: Ajay Thorve <athorve@nvidia.com>
Match the project's existing test layout: every other codec file in crates/core/src/codec/ pulls its tests from crates/core/tests/unit/codec/<name>_tests.rs via #[path]. The streaming codec tests were inlined in src/codec/streaming.rs in the initial commit; move them out to follow the same convention so source files stay slim and all codec test files cluster under tests/unit/codec/.

No behavior change. 283 core lib tests still passing.

Refs NMF-90.

Signed-off-by: Ajay Thorve <athorve@nvidia.com>
Refactors the gateway proxy to use llm_call_execute and
llm_stream_call_execute so the runtime owns LLM start/end events,
codec annotation, and stream lifecycle. Per-route dispatch wires
Anthropic Messages, OpenAI Responses, and OpenAI Chat Completions to
their existing typed codecs and the new streaming codecs landed
earlier in this PR.
A new SessionManager::prepare_gateway_call helper resolves the right
session, opens the agent scope, and computes correlation metadata
without holding the session lock during the upstream HTTP work. The
buffered path returns the captured upstream bytes verbatim; the
streaming path re-encodes parsed events back into SSE frames for the
client (Option B). Upstream connection failures are still surfaced as
502 Bad Gateway via a side channel that captures the original
reqwest error before the runtime collapses it to FlowError::Internal.
Trade-off A vs B for streaming response forwarding:
A. Byte tee: split the raw upstream byte stream between the client
and a side capture that feeds the runtime collector. Bytes pass
through unchanged, but observability hooks see the literal wire
bytes (gzip/identity) rather than parsed events, and the
collector code has to duplicate per-provider SSE framing logic
just for capture.
B. Re-encode (chosen): parse SSE on the upstream side, feed
parsed JSON events to the runtime collector, then re-encode
events as SSE for the client. Adds one parse + one serialize
per chunk, but the runtime sees identical event shapes for
streaming and non-streaming requests, and chunk-level
observability matches the codec's annotated response.
The legacy SessionManager::start_llm / end_llm path is retained behind
cfg(test) so existing correlation unit tests continue to exercise
resolve_llm_owner. Streaming tool-hint extraction is intentionally
deferred because the runtime synthesizes the aggregate response inside
LlmStreamWrapper and does not surface it back to the gateway.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Ajay Thorve <athorve@nvidia.com>
Salonijain27
left a comment
Approved from a dependency point of view
Overview
Adds the Rust `nemo-flow-sidecar` binary and coding-agent integration surface for observing Claude Code, Codex, Cursor, and Hermes agent activity through hook ingestion, transparent sidecar execution, passthrough LLM gateway routes, ATIF export, and OpenInference export. The branch now covers the sidecar runtime, installer behavior, agent-specific hook bundles/docs, Hermes-specific shell hook support, best-effort LLM/tool correlation, token metrics extraction, and CI/release plumbing for publishing the sidecar crate and binary artifacts.

Details
- `nemo-flow-sidecar` workspace crate with CLI entry points for `serve`, `run`, `install`, and `hook-forward`.
- `NEMO_FLOW_SIDECAR_URL` handling for ephemeral sidecar runs.
- `integrations/coding-agents/`; Hermes is configured through the sidecar installer rather than a static bundle.

Validation recorded for this PR:
- `cargo test -p nemo-flow-sidecar`
- `PATH="/opt/homebrew/bin:$PATH" /opt/homebrew/bin/just docs`
- `cargo fmt --all`
- `cargo test -p nemo-flow-sidecar launcher::tests::run_starts_sidecar_injects_env_and_returns_agent_exit_code`
- `just test-rust`, `just docs`, and `cargo clippy -p nemo-flow-sidecar --all-targets -- -D warnings`.
- `lychee` skipped because it still fails on unrelated generated Rust API doc links in `docs/reference/api/rust/_generated/...`; cargo fmt, cargo clippy, and cargo check passed with PyO3 pointed at Python 3.12.
- `f5f9a385`, which includes the follow-up fix commit `23315c09` plus the current `main` merge; GitHub CI is rerunning for this head.

Where should the reviewer start?
Start with `crates/sidecar/src/session.rs` for lifecycle normalization and export behavior, then `crates/sidecar/src/gateway.rs` and `crates/sidecar/src/server.rs` for the HTTP/gateway surface. For installation and user-facing behavior, review `crates/sidecar/src/installer.rs`, `docs/integrate-frameworks/coding-agent-sidecar.md`, and `integrations/coding-agents/README.md`.

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)