FunctionGemma runtime config + CUDA build hardening + ServiceHost#46
Conversation
…, drop flash-attn - runtime/src/config.rs: +120 lines of configuration surface (api_key, queue depth, request timeout, cuda mem-pool knobs); wire through error.rs, handlers.rs, types.rs. - core/Cargo.toml: pin cudarc 0.19.4 -> 0.19.0 (build-system CUDA match). - runtime/Cargo.toml: drop the flash-attn feature from rust-functiongemma-core (keep nvml) to stabilize the CUDA build. - Config/pcai-functiongemma.json + runtime README aligned with the new config. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Invoke-RustBuild.ps1: +37 lines of build-path/feature handling. - Initialize-CudaEnvironment.ps1: CUDA env tweaks. - Tests/Unit: extend CargoTools + Invoke-RustBuild Pester coverage. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Program.cs: +19 lines (host wiring). - PcaiServiceHost.csproj: dependency/version bump. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This pull request introduces several updates across the repository, including inlining configuration structs in the FunctionGemma runtime, upgrading System.Text.Json in the service host, refining process management for winsocat proxies, and updating PowerShell build scripts and tests. The code review feedback highlights critical resource management issues in Program.cs where retrieved Process instances should be disposed to prevent handle leaks. Additionally, feedback on Invoke-RustBuild.ps1 suggests resetting $global:LASTEXITCODE to avoid false build failures from stale exit codes and using .ProviderPath instead of .Path to prevent issues with provider-prefixed paths in native commands.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
There was a problem hiding this comment.
Pull request overview
This PR extends the FunctionGemma router runtime’s configuration surface (including auth + queue/timeout + CUDA memory knobs), hardens the Rust/CUDA build helper scripts and their Pester coverage, and tightens PcaiServiceHost HVSOCK process management while updating its JSON dependency.
Changes:
- Expanded
rust-functiongemma-runtimeJSON/runtime config schema and wiring (env overrides, api key, queue depth, request timeout, CUDA mem-pool settings). - Hardened
Tools/Invoke-RustBuild.ps1argument/exit-code behavior and aligned CUDA environment defaults + unit tests. - Improved
PcaiServiceHostHVSOCK proxy stop/status logic and updatedSystem.Text.Jsonpackage version.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| Tools/Invoke-RustBuild.ps1 | Adds -LlmOutput, validates cargo args earlier, resolves working dir explicitly, and improves CargoTools exit-code handling. |
| Tools/Initialize-CudaEnvironment.ps1 | Adjusts preferred CUDA version ordering/default warning messaging (v13.1). |
| Tests/Unit/Invoke-RustBuild.Tests.ps1 | Updates unit assertions to cover new switches/behavior and regex fixes. |
| Tests/Unit/CargoTools.Tests.ps1 | Aligns CargoTools function naming expectations and strengthens module export assertions. |
| Native/PcaiServiceHost/Program.cs | Makes JSON deserialization case-insensitive and tightens HVSOCK proxy PID/process-name checks. |
| Native/PcaiServiceHost/PcaiServiceHost.csproj | Bumps System.Text.Json dependency version. |
| Deploy/rust-functiongemma-runtime/src/types.rs | Adds #[expect(dead_code)] annotation for schema fields not yet used by router logic. |
| Deploy/rust-functiongemma-runtime/src/handlers.rs | Simplifies tool_choice object validation with a guarded match arm. |
| Deploy/rust-functiongemma-runtime/src/error.rs | Adds a reserved internal error constructor with an explicit lint expectation. |
| Deploy/rust-functiongemma-runtime/src/config.rs | Reworks runtime config structs/defaults, config loading, env overrides, and API key handling. |
| Deploy/rust-functiongemma-runtime/README.md | Updates model path guidance to prefer repo-relative Models\\... and notes stale workstation path. |
| Deploy/rust-functiongemma-runtime/Cargo.toml | Removes flash-attn feature from runtime’s core dependency features (keeps nvml). |
| Deploy/rust-functiongemma-core/Cargo.toml | Pins cudarc to 0.19.0 for CUDA build stability. |
| Config/pcai-functiongemma.json | Updates runtime.router_model_path to a directory-style path for model engine usage. |
Comments suppressed due to low confidence (1)
Deploy/rust-functiongemma-runtime/src/config.rs:141
- The API key resolution order currently prefers
runtime.api_keyfrom the JSON file over thePCAI_API_KEYenvironment variable. For secrets, it’s safer to prefer env-vars first (reduces risk of accidentally committing credentials and makes it easier to override in CI/ServiceHost deployments).
/// Returns the configured API key, or `None` when authentication is disabled.
///
/// Resolution order (first non-empty value wins):
/// 1. `runtime.api_key` in the JSON config file.
/// 2. `PCAI_API_KEY` environment variable.
///
/// Returns `None` if both sources are empty, meaning all requests are allowed
/// without an `Authorization` header.
pub(crate) fn api_key() -> Option<String> {
let from_config = runtime_config().api_key.trim().to_string();
if !from_config.is_empty() {
return Some(from_config);
}
let from_env = std::env::var("PCAI_API_KEY").unwrap_or_default();
if !from_env.trim().is_empty() {
return Some(from_env.trim().to_string());
}
None
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 970775b1b9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…er tests Cover the runtime config surface and CUDA env init introduced in this branch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Flushes pending pre-existing code changes (not authored this session; committed at the user's request) into staged commits.
Commits
feat(functiongemma)— expand runtime config surface (+120 lines inconfig.rs: api_key, queue depth, request timeout, cuda mem-pool knobs), wired through error/handlers/types; pincudarc 0.19.4 → 0.19.0; dropflash-attnfeature (keep nvml) to stabilize the CUDA build; alignpcai-functiongemma.json+ README.chore(build)— hardenInvoke-RustBuild.ps1+Initialize-CudaEnvironment.ps1; expandTests/Unit(CargoTools, Invoke-RustBuild) Pester coverage.chore(servicehost)—PcaiServiceHostProgram.cs host wiring +.csprojdep bump.Notes
// pragma: allowlist secretannotations on theapi_keyconfig field/default — git-guard's secret scanner false-positived on the field name (value is empty/loaded at runtime, no literal secret). This is the hook-sanctioned exemption, not a--no-verifybypass.🤖 Generated with Claude Code