Skip to content
This repository was archived by the owner on Jun 2, 2026. It is now read-only.

feat: Add tray-level firmware update target selection in Tray FW Update API#541

Merged
kunzhao-nv merged 3 commits into
mainfrom
feat/firmware-component-selection
May 20, 2026
Merged

feat: Add tray-level firmware update target selection in Tray FW Update API#541
kunzhao-nv merged 3 commits into
mainfrom
feat/firmware-component-selection

Conversation

@kunzhao-nv

@kunzhao-nv kunzhao-nv commented May 16, 2026

Copy link
Copy Markdown
Contributor

Description

Adds an optional targets field to the tray firmware update endpoints (PATCH /v2/org/{org}/nico/tray/{id}/firmware, batch variant PATCH /v2/org/{org}/nico/tray/firmware) so callers can restrict an upgrade to a subset of firmware sub-parts within each targeted tray (e.g. bmc + nvos on a switch tray, pmc on a powershelf tray) instead of always applying the full bundle.

Names are lowercase and must match Core's per-tray-type component enums in carbide-core/crates/rpc/proto/forge.proto. The currently accepted set per tray type (tracks Core as new enum values are added; verified by completeness-guard tests in flow/pkg/common/firmwarecomponents):

  • switch trays (NvSwitchComponent): bmc, cpld, bios, nvos
  • powershelf trays (PowerShelfComponent): pmc, psu
  • compute trays (ComputeTrayComponent): bmc, bioscurrently NOT honored end-to-end: the NICo compute-firmware path goes through SetFirmwareUpdateTimeWindow + SetMachineAutoUpdate, which has no per-sub-target selection. The request is accepted and logged but the whole bundle is applied. Will be honored once compute moves to UpdateComponentFirmware.

Omitted/empty preserves the historical "update everything in the bundle" behavior. Unknown names are rejected at the Flow layer with a deterministic expected one of: ... message. Rack endpoints intentionally stay unchanged for this round — rack firmware requests are already fanned out to individual tray operations downstream.

Validation: targets requires version to be set, and entries must be non-empty; both checks return 400.

Example:

PATCH /v2/org/{org}/nico/tray/firmware
{
  "siteId": "550e8400-e29b-41d4-a716-446655440000",
  "version": "24.11.0",
  "targets": ["bmc", "nvos"]
}

Naming

The field is named targets at REST (not components) to avoid colliding with carbide's tray-level Component entity. The corresponding Flow gRPC field is named sub_targets (not components) to avoid colliding with OperationTargetSpec.components, which selects tray INSTANCES on the same request. Inside Flow Go the field is FirmwareControlTaskInfo.SubTargets. The REST targets → Flow sub_targets mapping is a single pass-through assignment in common.ExecuteFirmwareUpdateWorkflow.

Type of Change

  • Feature - New feature or functionality (feat:)
  • Fix - Bug fixes (fix:)
  • Chore - Modification or removal of existing functionality (chore:)
  • Refactor - Refactoring of existing functionality (refactor:)
  • Docs - Changes in documentation or OpenAPI schema (docs:)
  • CI - Changes in GitHub workflows. Requires additional scrutiny (ci:)
  • Version - Issuing a new release version (version:)

Services Affected

  • API - API models or endpoints updated
  • Workflow - Workflow service updated
  • DB - DB DAOs or migrations updated
  • Site Manager - Site Manager updated
  • Cert Manager - Cert Manager updated
  • Site Agent - Site Agent updated
  • Flow - Flow service updated
  • Powershelf Manager - Powershelf Manager updated
  • NVSwitch Manager - NVSwitch Manager updated

Related Issues (Optional)

NVIDIA/infra-controller#2084

Breaking Changes

  • This PR contains breaking changes

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • No testing required (docs, internal refactor, etc.)

Additional Notes

@kunzhao-nv kunzhao-nv requested a review from a team as a code owner May 16, 2026 00:10
@copy-pr-bot

copy-pr-bot Bot commented May 16, 2026

Copy link
Copy Markdown

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented May 16, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds optional per-tray firmware sub-target selection ("targets") from REST requests through workflow/proto into flow task info, a new firmwarecomponents parsing library, and device managers; OpenAPI and SDK models and tests updated.

Changes

Selective Firmware Component Updates

Layer / File(s) Summary
Request models and validation
api/pkg/api/model/firmware.go, api/pkg/api/model/firmware_test.go
APIUpdateFirmwareRequest and APIBatchTrayFirmwareUpdateRequest add optional targets fields; validation requires non-empty version when targets is non-empty and rejects empty-string targets.
REST handler integration
api/pkg/api/handler/rack.go, api/pkg/api/handler/tray.go
Handlers now forward Targets (or nil) into ExecuteFirmwareUpdateWorkflow calls.
Execute workflow signature and request mapping
api/pkg/api/handler/util/common/common.go
ExecuteFirmwareUpdateWorkflow adds a targets []string parameter and sets UpgradeFirmwareRequest.SubTargets from it.
Proto and workflow schema
flow/proto/v1/flow.proto, workflow-schema/flow/proto/v1/flow.proto
UpgradeFirmwareRequest adds repeated string sub_targets; workflow-schema proto go_package updated to new import path.
Flow service and converter
flow/internal/service/server_impl.go, flow/internal/converter/protobuf/converter.go
Flow server and protobuf converter populate FirmwareControlTaskInfo.SubTargets from incoming UpgradeFirmwareRequest.sub_targets.
Task info field
flow/internal/task/operations/operations.go
FirmwareControlTaskInfo gains SubTargets []string (sub_targets,omitempty) with documentation describing semantics.
Firmware component parsing library
flow/pkg/common/firmwarecomponents/firmwarecomponents.go, flow/pkg/common/firmwarecomponents/firmwarecomponents_test.go
New package parses lowercase REST targets into device proto enums, normalizes input, produces deterministic unknown-name errors with suggestions, and exposes supported-name helpers plus completeness tests.
NVLSwitch component support (NICo & NSM)
flow/internal/task/componentmanager/nvlswitch/nico/nico.go, flow/internal/task/componentmanager/nvlswitch/nvswitchmanager/nvswitchmanager.go
NICo and NSM NVLSwitch code parse SubTargets, log sub_targets, abort on parse error, and pass parsed components into update requests instead of nil.
PowerShelf component support (NICo & PSM)
flow/internal/task/componentmanager/powershelf/nico/nico.go, flow/internal/task/componentmanager/powershelf/psm/psm.go
NICo/PSM parse SubTargets into component lists; empty parsed list falls back to PMC-only; parsed components are included in update requests; parse errors return immediately.
Compute firmware handling (NICo)
flow/internal/task/componentmanager/compute/nico/nico.go, flow/internal/task/componentmanager/compute/nico/nico_test.go
Compute manager logs sub_targets and warns selection isn't honored yet (full bundle applied); tests ensure nil/empty/non-empty SubTargets do not error.
OpenAPI and SDK models
openapi/spec.yaml, sdk/standard/model_firmware_update_request.go, sdk/standard/model_batch_tray_firmware_update_request.go
OpenAPI adds targets arrays and examples; SDK models add Targets field, accessors, presence helpers, and conditional JSON serialization.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 56.41% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: adding optional target selection to tray firmware update endpoints, with clear focus on the REST API feature.
Description check ✅ Passed The description is comprehensive and well-structured, explaining the feature intent, per-tray component mappings, validation rules, naming conventions, and current limitations with compute targets.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/firmware-component-selection

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (3)
flow/internal/task/componentmanager/nvlswitch/nico/components_test.go (1)

29-52: ⚡ Quick win

Add parity test cases for empty slice and normalization behavior.

parseNVSwitchComponentsPb currently normalizes input (trim + lowercase), but this suite only validates nil, exact lowercase, and unknown values. Please add explicit cases for []string{} and mixed-case/whitespace inputs to lock in that contract and prevent regressions.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@flow/internal/task/componentmanager/nvlswitch/nico/components_test.go` around
lines 29 - 52, Add two tests to TestParseNVSwitchComponentsPb: one t.Run titled
like "empty slice returns nil (proto: all)" that calls
parseNVSwitchComponentsPb([]string{}) and asserts no error and that result is
nil, and another t.Run titled like "inputs are trimmed and lowercased" that
calls parseNVSwitchComponentsPb with mixed-case/whitespace values (eg.
[]string{" BmC ", "CPLD", " Bios"}) and asserts no error and that the returned
slice equals the corresponding pb.NvSwitchComponent enums
(pb.NvSwitchComponent_NV_SWITCH_COMPONENT_BMC, _CPLD, _BIOS, etc.), thereby
locking in the normalization (trim+lowercase) contract of
parseNVSwitchComponentsPb.
flow/internal/task/componentmanager/powershelf/psm/components_test.go (1)

36-49: ⚡ Quick win

Add normalization coverage for component parsing.

The parser supports trim/lowercase normalization, but tests only cover already-normalized values. Add a case like []string{" PMC ", "PsU"} to lock this behavior.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@flow/internal/task/componentmanager/powershelf/psm/components_test.go` around
lines 36 - 49, Add a new test in components_test.go to verify
parsePowershelfComponents normalizes input; call parsePowershelfComponents with
mixed-case/whitespace values like []string{" PMC ", "PsU"} and assert no error
and that it returns []psmapi.PowershelfComponent{psmapi.PowershelfComponentPMC,
psmapi.PowershelfComponentPSU}; place it as a t.Run similar to the existing
cases (e.g., "normalization: trim and lowercase") so the parser's trim/lowercase
behavior is locked in.
api/pkg/api/model/firmware_test.go (1)

97-130: ⚡ Quick win

Add test case exercising Filter.Validate() code path.

All test cases implicitly set Filter to nil. While this validates the components/version constraint, it does not exercise the Filter.Validate() call introduced at line 114 of firmware.go with a non-nil filter. Consider adding a test case with an explicit filter to improve coverage of the complete validation flow.

📋 Proposed test case
 		{
 			name:    "invalid - missing siteId",
 			request: APIBatchTrayFirmwareUpdateRequest{},
 			wantErr: true,
 		},
+		{
+			name: "valid - with filter and version",
+			request: APIBatchTrayFirmwareUpdateRequest{
+				SiteID:  "site-1",
+				Filter:  &TrayFilter{Names: []string{"tray-1"}},
+				Version: strPtr("24.11.0"),
+			},
+			wantErr: false,
+		},
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/pkg/api/model/firmware_test.go` around lines 97 - 130, Add a test case to
TestAPIBatchTrayFirmwareUpdateRequest_Validate that sets
APIBatchTrayFirmwareUpdateRequest.Filter to a non-nil value so the
Filter.Validate() path is exercised; create at least one case where Filter is
valid (e.g., a properly populated Filter struct) and optionally one where Filter
is invalid to assert wantErr=true, ensuring the test invokes the
Filter.Validate() logic introduced in firmware.go and references
APIBatchTrayFirmwareUpdateRequest and Filter.Validate() to locate the code to
modify.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@flow/internal/task/componentmanager/powershelf/nico/nico.go`:
- Around line 226-231: The code currently forces subComponents to PMC when
len(subComponents) == 0, breaking the intended contract that an empty
info.Components means "update all bundled components"; remove the assignment
that sets subComponents =
[]pb.PowerShelfComponent{pb.PowerShelfComponent_POWER_SHELF_COMPONENT_PMC} and
update the surrounding comment accordingly so that an empty subComponents is
preserved (or explicitly handled as "all components") downstream; reference the
subComponents variable and the constant
pb.PowerShelfComponent_POWER_SHELF_COMPONENT_PMC to locate and remove the
defaulting behavior.

In `@flow/internal/task/componentmanager/powershelf/psm/psm.go`:
- Around line 315-317: The code is forcing a PMC-only fallback by setting
subComponents to psmapi.PowershelfComponentPMC when components are empty;
instead, preserve the "all bundled components" semantics by not injecting a
PMC-only value. Remove (or disable) the block that assigns subComponents =
[]psmapi.PowershelfComponent{psmapi.PowershelfComponentPMC} and ensure
callers/handlers that consume subComponents treat an empty/nil slice as "all
components" (or use whatever existing sentinel for bundled-all) so the request
remains broad rather than narrowed to PMC-only.

In `@flow/internal/task/operations/operations.go`:
- Around line 218-225: Update the inline comment for the Components field to use
the supported compute component names instead of "uefi": change the compute
example from ["bmc","uefi"] to the documented names (e.g., ["bmc","bios"] or
["bmc","combined_bmc_uefi"]) so it stays consistent with FirmwareControl mapping
logic; locate the comment above the Components []string declaration in
operations.go and adjust the example text only.

In `@openapi/spec.yaml`:
- Around line 17154-17158: The schema currently allows arbitrary strings for
components; update the OpenAPI schemas for
BatchTrayFirmwareUpdateRequest.components and FirmwareUpdateRequest.components
so items are constrained to the allowed lowercase component names (use an enum
listing the known component values and/or a pattern to forbid empty strings)
instead of type: string; ensure both occurrences (previously at the shown
blocks) are updated to match the description and server validation.

---

Nitpick comments:
In `@api/pkg/api/model/firmware_test.go`:
- Around line 97-130: Add a test case to
TestAPIBatchTrayFirmwareUpdateRequest_Validate that sets
APIBatchTrayFirmwareUpdateRequest.Filter to a non-nil value so the
Filter.Validate() path is exercised; create at least one case where Filter is
valid (e.g., a properly populated Filter struct) and optionally one where Filter
is invalid to assert wantErr=true, ensuring the test invokes the
Filter.Validate() logic introduced in firmware.go and references
APIBatchTrayFirmwareUpdateRequest and Filter.Validate() to locate the code to
modify.

In `@flow/internal/task/componentmanager/nvlswitch/nico/components_test.go`:
- Around line 29-52: Add two tests to TestParseNVSwitchComponentsPb: one t.Run
titled like "empty slice returns nil (proto: all)" that calls
parseNVSwitchComponentsPb([]string{}) and asserts no error and that result is
nil, and another t.Run titled like "inputs are trimmed and lowercased" that
calls parseNVSwitchComponentsPb with mixed-case/whitespace values (eg.
[]string{" BmC ", "CPLD", " Bios"}) and asserts no error and that the returned
slice equals the corresponding pb.NvSwitchComponent enums
(pb.NvSwitchComponent_NV_SWITCH_COMPONENT_BMC, _CPLD, _BIOS, etc.), thereby
locking in the normalization (trim+lowercase) contract of
parseNVSwitchComponentsPb.

In `@flow/internal/task/componentmanager/powershelf/psm/components_test.go`:
- Around line 36-49: Add a new test in components_test.go to verify
parsePowershelfComponents normalizes input; call parsePowershelfComponents with
mixed-case/whitespace values like []string{" PMC ", "PsU"} and assert no error
and that it returns []psmapi.PowershelfComponent{psmapi.PowershelfComponentPMC,
psmapi.PowershelfComponentPSU}; place it as a t.Run similar to the existing
cases (e.g., "normalization: trim and lowercase") so the parser's trim/lowercase
behavior is locked in.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 63ff7052-40eb-46db-951b-3c359e35d108

📥 Commits

Reviewing files that changed from the base of the PR and between 538260d and f0a8ed1.

⛔ Files ignored due to path filters (2)
  • flow/pkg/proto/v1/flow.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • workflow-schema/flow/protobuf/v1/flow.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
📒 Files selected for processing (23)
  • api/pkg/api/handler/rack.go
  • api/pkg/api/handler/tray.go
  • api/pkg/api/handler/util/common/common.go
  • api/pkg/api/model/firmware.go
  • api/pkg/api/model/firmware_test.go
  • flow/internal/converter/protobuf/converter.go
  • flow/internal/service/server_impl.go
  • flow/internal/task/componentmanager/compute/nico/nico.go
  • flow/internal/task/componentmanager/compute/nico/nico_test.go
  • flow/internal/task/componentmanager/nvlswitch/nico/components_test.go
  • flow/internal/task/componentmanager/nvlswitch/nico/nico.go
  • flow/internal/task/componentmanager/nvlswitch/nvswitchmanager/components_test.go
  • flow/internal/task/componentmanager/nvlswitch/nvswitchmanager/nvswitchmanager.go
  • flow/internal/task/componentmanager/powershelf/nico/components_test.go
  • flow/internal/task/componentmanager/powershelf/nico/nico.go
  • flow/internal/task/componentmanager/powershelf/psm/components_test.go
  • flow/internal/task/componentmanager/powershelf/psm/psm.go
  • flow/internal/task/operations/operations.go
  • flow/proto/v1/flow.proto
  • openapi/spec.yaml
  • sdk/standard/model_batch_tray_firmware_update_request.go
  • sdk/standard/model_firmware_update_request.go
  • workflow-schema/flow/proto/v1/flow.proto

Comment thread flow/internal/task/componentmanager/powershelf/nico/nico.go
Comment thread flow/internal/task/componentmanager/powershelf/psm/psm.go
Comment thread flow/internal/task/operations/operations.go Outdated
Comment thread openapi/spec.yaml Outdated
Comment on lines +17154 to +17158
components:
type: array
items:
type: string
description: |-

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Constrain components item values in the schema (not only in description).

components.items currently allows arbitrary strings (including empty strings), but the description and server validation require known lowercase component names and reject invalid entries. This is a contract mismatch and weakens generated SDK validation.

Suggested OpenAPI constraint update
         components:
           type: array
           items:
             type: string
+            minLength: 1
+            enum:
+              - bmc
+              - cpld
+              - bios
+              - nvos
+              - pmc
+              - psu
+              - cec
+              - nic
+              - cpld_mb
+              - cpld_pdb
+              - hgx_bmc
+              - combined_bmc_uefi
+              - gpu
+              - cx7

Apply this to both BatchTrayFirmwareUpdateRequest.components and FirmwareUpdateRequest.components.

As per coding guidelines, “Review the OpenAPI specification, check for consistency and correctness, check for misspellings and grammatical errors.”

Also applies to: 17244-17248

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openapi/spec.yaml` around lines 17154 - 17158, The schema currently allows
arbitrary strings for components; update the OpenAPI schemas for
BatchTrayFirmwareUpdateRequest.components and FirmwareUpdateRequest.components
so items are constrained to the allowed lowercase component names (use an enum
listing the known component values and/or a pattern to forbid empty strings)
instead of type: string; ensure both occurrences (previously at the shown
blocks) are updated to match the description and server validation.

@github-actions

github-actions Bot commented May 16, 2026

Copy link
Copy Markdown

Test Results

9 472 tests  +27   9 472 ✅ +27   7m 29s ⏱️ +29s
  156 suites + 1       0 💤 ± 0 
   14 files   ± 0       0 ❌ ± 0 

Results for commit ac50814. ± Comparison against base commit 199e53f.

♻️ This comment has been updated with latest results.

@github-actions

Copy link
Copy Markdown

🔐 TruffleHog Secret Scan

No secrets or credentials found!

Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉

🔗 View scan details

🕐 Last updated: 2026-05-16 00:30:51 UTC | Commit: 3537bc0

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
flow/internal/task/componentmanager/powershelf/psm/psm.go (1)

291-327: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Empty components is still narrowed to PMC-only (contract mismatch).

Line 293 and Line 315 currently enforce PMC-only behavior when components is omitted/empty. That narrows scope and conflicts with the tray API contract in this PR where omitted/empty should preserve "all bundled components" behavior.

Suggested fix
-// info.Components, when non-empty, restricts the update to the selected
-// component types (e.g. ["pmc", "psu"]). Empty defaults to ["pmc"], which
-// preserves the historical PSM behavior.
+// info.Components, when non-empty, restricts the update to the selected
+// component types (e.g. ["pmc", "psu"]). Empty means "all bundled components".
@@
-	if len(subComponents) == 0 {
-		subComponents = []psmapi.PowershelfComponent{psmapi.PowershelfComponentPMC}
-	}
-
-	componentReqs := make([]psmapi.UpdateComponentFirmwareRequest, 0, len(subComponents))
-	for _, c := range subComponents {
-		componentReqs = append(componentReqs, psmapi.UpdateComponentFirmwareRequest{
-			Component: c,
-			UpgradeTo: psmapi.FirmwareVersion{Version: info.TargetVersion},
-		})
-	}
+	var componentReqs []psmapi.UpdateComponentFirmwareRequest
+	if len(subComponents) > 0 {
+		componentReqs = make([]psmapi.UpdateComponentFirmwareRequest, 0, len(subComponents))
+		for _, c := range subComponents {
+			componentReqs = append(componentReqs, psmapi.UpdateComponentFirmwareRequest{
+				Component: c,
+				UpgradeTo: psmapi.FirmwareVersion{Version: info.TargetVersion},
+			})
+		}
+	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@flow/internal/task/componentmanager/powershelf/psm/psm.go` around lines 291 -
327, The code in FirmwareControl currently forces an empty/omitted
info.Components to PMC-only by setting subComponents =
[]psmapi.PowershelfComponent{psmapi.PowershelfComponentPMC}; remove that
fallback so an empty slice preserves the "all bundled components" contract from
the tray API. In practice: in Manager.FirmwareControl (around where
ParsePSMPowerShelf is called and subComponents is assigned) delete the block
that overrides subComponents to PMC; instead either leave subComponents empty
(so downstream psmClient semantics imply "all") or, if an explicit list is
required, populate subComponents with all psmapi.PowershelfComponent values (not
just PMC) before building componentReqs.
api/pkg/api/model/firmware.go (1)

101-104: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Documentation contradicts validation logic.

Same issue as APIUpdateFirmwareRequest.Components: the comment states "Requires Version" but validation permits empty components without a version constraint.

📝 Proposed fix for documentation
 	// Components, when non-empty, restricts the update to a subset of
 	// component types within each matched tray. Same semantics as the
-	// single-tray variant. Requires Version.
+	// single-tray variant. When non-empty, requires Version.
 	Components []string `json:"components,omitempty"`
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/pkg/api/model/firmware.go` around lines 101 - 104, The struct field
comment for Components in firmware.go incorrectly says "Requires Version" while
validation does not enforce that; update the comment on the Components []string
field (and similarly the comment for APIUpdateFirmwareRequest.Components if
present) to reflect actual behavior (remove the "Requires Version" phrase or
reword to state that components may be provided independently of Version) so
documentation matches validation logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/pkg/api/model/firmware.go`:
- Around line 112-114: r.Filter may be nil (it's a *TrayFilter with omitempty),
so calling r.Filter.Validate() can panic; update the code in firmware.go to
guard the call by checking if r.Filter != nil before invoking
r.Filter.Validate(), and only return the validation error when the pointer is
non-nil (otherwise skip validation or return nil); reference the r.Filter field
and the TrayFilter.Validate method when making the change.
- Around line 32-40: The comment says Components "Requires Version" but
validateFirmwareComponents currently allows Components to be non-empty without
Version; update the validation to match the documented contract: in
validateFirmwareComponents add a check that if the Firmware.Components slice is
non-empty (len(Components) > 0) then Firmware.Version must be set/non-empty,
returning a validation error otherwise; reference the Components field on the
Firmware struct and the validateFirmwareComponents function when making this
change.

---

Duplicate comments:
In `@api/pkg/api/model/firmware.go`:
- Around line 101-104: The struct field comment for Components in firmware.go
incorrectly says "Requires Version" while validation does not enforce that;
update the comment on the Components []string field (and similarly the comment
for APIUpdateFirmwareRequest.Components if present) to reflect actual behavior
(remove the "Requires Version" phrase or reword to state that components may be
provided independently of Version) so documentation matches validation logic.

In `@flow/internal/task/componentmanager/powershelf/psm/psm.go`:
- Around line 291-327: The code in FirmwareControl currently forces an
empty/omitted info.Components to PMC-only by setting subComponents =
[]psmapi.PowershelfComponent{psmapi.PowershelfComponentPMC}; remove that
fallback so an empty slice preserves the "all bundled components" contract from
the tray API. In practice: in Manager.FirmwareControl (around where
ParsePSMPowerShelf is called and subComponents is assigned) delete the block
that overrides subComponents to PMC; instead either leave subComponents empty
(so downstream psmClient semantics imply "all") or, if an explicit list is
required, populate subComponents with all psmapi.PowershelfComponent values (not
just PMC) before building componentReqs.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 8c891528-f652-4b54-afa2-8fb7392e0538

📥 Commits

Reviewing files that changed from the base of the PR and between f0a8ed1 and 3537bc0.

📒 Files selected for processing (10)
  • api/pkg/api/model/firmware.go
  • flow/internal/task/componentmanager/nvlswitch/nico/nico.go
  • flow/internal/task/componentmanager/nvlswitch/nvswitchmanager/nvswitchmanager.go
  • flow/internal/task/componentmanager/powershelf/nico/nico.go
  • flow/internal/task/componentmanager/powershelf/psm/psm.go
  • flow/pkg/common/firmwarecomponents/firmwarecomponents.go
  • flow/pkg/common/firmwarecomponents/firmwarecomponents_test.go
  • openapi/spec.yaml
  • sdk/standard/model_batch_tray_firmware_update_request.go
  • sdk/standard/model_firmware_update_request.go
✅ Files skipped from review due to trivial changes (2)
  • flow/pkg/common/firmwarecomponents/firmwarecomponents.go
  • sdk/standard/model_batch_tray_firmware_update_request.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • flow/internal/task/componentmanager/nvlswitch/nvswitchmanager/nvswitchmanager.go
  • sdk/standard/model_firmware_update_request.go
  • openapi/spec.yaml

Comment thread api/pkg/api/model/firmware.go Outdated
Comment thread api/pkg/api/model/firmware.go Outdated
@zhaozhongn zhaozhongn changed the title feat: Add tray-level firmware sub-component selection in Flow Tray API feat: Add tray-level firmware update target selection in Flow Tray API May 16, 2026
@zhaozhongn zhaozhongn changed the title feat: Add tray-level firmware update target selection in Flow Tray API feat: Add tray-level firmware update target selection in Tray FW Update API May 16, 2026
@kunzhao-nv kunzhao-nv force-pushed the feat/firmware-component-selection branch 2 times, most recently from e296556 to 1f318d4 Compare May 16, 2026 06:19

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (2)
flow/internal/task/componentmanager/powershelf/psm/psm.go (1)

292-294: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove PMC-only fallback for empty SubTargets to preserve bundle-wide default.

Line 315 currently forces empty/omitted SubTargets into ["pmc"], which narrows scope and breaks the intended default semantics of updating the full bundle when selection is omitted/empty.

Proposed fix
-// firmware sub-parts (e.g. ["pmc", "psu"]). Empty defaults to ["pmc"],
-// which preserves the historical PSM behavior.
+// firmware sub-parts (e.g. ["pmc", "psu"]). Empty means "all components
+// in the bundle".
@@
 	subComponents, err := firmwarecomponents.ParsePSMPowerShelf(info.SubTargets)
 	if err != nil {
 		return err
 	}
-	if len(subComponents) == 0 {
-		subComponents = []psmapi.PowershelfComponent{psmapi.PowershelfComponentPMC}
-	}
 
 	pmcMacs := target.ComponentIDs
 
-	componentReqs := make([]psmapi.UpdateComponentFirmwareRequest, 0, len(subComponents))
+	var componentReqs []psmapi.UpdateComponentFirmwareRequest
 	for _, c := range subComponents {
 		componentReqs = append(componentReqs, psmapi.UpdateComponentFirmwareRequest{
 			Component: c,
 			UpgradeTo: psmapi.FirmwareVersion{Version: info.TargetVersion},
 		})
 	}

Also applies to: 315-317, 321-336

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@flow/internal/task/componentmanager/powershelf/psm/psm.go` around lines 292 -
294, The code currently forces an empty or omitted info.SubTargets into ["pmc"],
narrowing updates to PMC only; remove that fallback so an empty/nil
info.SubTargets is preserved (not mutated) and therefore allows the bundle-wide
default to apply. Locate the logic that assigns/normalizes info.SubTargets
(references: info.SubTargets, the PMC fallback branch around the current
defaulting code) and delete or bypass the block that sets ["pmc"] when the slice
is empty; instead leave info.SubTargets as nil/empty and ensure any callers or
downstream code treat nil/empty as meaning "apply to whole bundle" rather than a
PMC-only list. Also update the comment above the handling to reflect that empty
SubTargets defaults to bundle-wide behavior.
api/pkg/api/model/firmware.go (1)

114-122: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Nil pointer dereference on optional Filter.

The code calls r.Filter.Validate() without checking whether r.Filter is nil. Since Filter is declared as *TrayFilter with omitempty JSON tag, it can legitimately be nil in valid API requests. This will cause a runtime panic.

🛡️ Proposed fix to add nil guard
 	if r.SiteID == "" {
 		return fmt.Errorf("siteId is required")
 	}
-	if err := r.Filter.Validate(); err != nil {
-		return err
+	if r.Filter != nil {
+		if err := r.Filter.Validate(); err != nil {
+			return err
+		}
 	}
 	return validateFirmwareTargets(r.Targets, r.Version)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/pkg/api/model/firmware.go` around lines 114 - 122, The Validate method on
APIBatchTrayFirmwareUpdateRequest can panic by calling r.Filter.Validate() when
Filter is nil; update the APIBatchTrayFirmwareUpdateRequest.Validate function to
first check if r.Filter != nil before invoking r.Filter.Validate(), only call
Filter.Validate when present, and propagate its error if any; leave the existing
validateFirmwareTargets(r.Targets, r.Version) call unchanged so target/version
validation still runs.
🧹 Nitpick comments (1)
flow/internal/task/componentmanager/compute/nico/nico.go (1)

330-335: ⚡ Quick win

Add an explicit TODO marker for deferred compute sub-target routing.

This deferred-path note is clear, but it should be tagged with a TODO so it remains discoverable and trackable in follow-up work.

💡 Suggested patch
 	if len(info.SubTargets) > 0 {
+		// TODO: Route compute firmware updates through UpdateComponentFirmware to honor sub-target selection.
 		// SetFirmwareUpdateTimeWindow + SetMachineAutoUpdate (the path used
 		// here for compute trays) do not expose per-sub-target selection;
 		// auto-update will install whatever is in the desired bundle. Log

As per coding guidelines: Add TODO comments for features or nuances not important to implement right away.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@flow/internal/task/componentmanager/compute/nico/nico.go` around lines 330 -
335, The comment notes deferred compute sub-target routing but lacks a TODO
marker; add an explicit TODO comment above the existing note in nico.go so it is
discoverable for future work. Locate the block referencing
SetFirmwareUpdateTimeWindow, SetMachineAutoUpdate and UpdateComponentFirmware
and prepend a TODO/PENDING marker (e.g., "TODO: route per-sub-target via
UpdateComponentFirmware during version → FW Object identifier migration") so the
intended follow-up is clearly tracked and searchable.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/pkg/api/model/firmware_test.go`:
- Around line 104-124: The tests panic because
APIBatchTrayFirmwareUpdateRequest.Validate() calls r.Filter.Validate() without
checking for nil; add a nil guard in APIBatchTrayFirmwareUpdateRequest.Validate
(or ensure r.Filter is non-nil before calling) so it only calls
r.Filter.Validate() when r.Filter != nil, or initialize r.Filter to an
empty/zero-value Filter inside Validate() before delegation; update the logic
around the APIBatchTrayFirmwareUpdateRequest.Validate method to safely handle
nil Filter and re-run tests.

In `@openapi/spec.yaml`:
- Around line 17167-17172: The spec currently documents compute targets as "bmc,
bios" in the ComputeTrayComponent comment but this list is incomplete; update
the OpenAPI description for ComputeTrayComponent (and the duplicate block later
in the file) to list the full, actual supported component set (e.g., include
NIC, storage, firmware-related targets or whatever the implementation supports),
correct any misspellings/grammar, and ensure both occurrences (the compute
trays/ComputeTrayComponent block and the later duplicate block) are kept
consistent so SDK/API consumers see the correct targets.

In `@workflow-schema/flow/proto/v1/flow.proto`:
- Line 21: The proto go_package was changed in flow.proto (the go_package
option) which will break imports; regenerate the Go protobuf code from
workflow-schema/flow/proto/v1/flow.proto with the new go_package value, then
update all downstream import paths that reference the old path
(github.com/NVIDIA/infra-controller-rest/workflow-schema/flow/protobuf/v1) to
the new path (github.com/NVIDIA/infra-controller-rest/flow/pkg/proto/v1) across
the 36 consumer files (site-workflow activity/workflow/grpc packages, site-agent
components, and api model/handler/utilities), run go mod tidy and full
test/compile, and submit the changes atomically so consumers compile against the
updated generated package.

---

Duplicate comments:
In `@api/pkg/api/model/firmware.go`:
- Around line 114-122: The Validate method on APIBatchTrayFirmwareUpdateRequest
can panic by calling r.Filter.Validate() when Filter is nil; update the
APIBatchTrayFirmwareUpdateRequest.Validate function to first check if r.Filter
!= nil before invoking r.Filter.Validate(), only call Filter.Validate when
present, and propagate its error if any; leave the existing
validateFirmwareTargets(r.Targets, r.Version) call unchanged so target/version
validation still runs.

In `@flow/internal/task/componentmanager/powershelf/psm/psm.go`:
- Around line 292-294: The code currently forces an empty or omitted
info.SubTargets into ["pmc"], narrowing updates to PMC only; remove that
fallback so an empty/nil info.SubTargets is preserved (not mutated) and
therefore allows the bundle-wide default to apply. Locate the logic that
assigns/normalizes info.SubTargets (references: info.SubTargets, the PMC
fallback branch around the current defaulting code) and delete or bypass the
block that sets ["pmc"] when the slice is empty; instead leave info.SubTargets
as nil/empty and ensure any callers or downstream code treat nil/empty as
meaning "apply to whole bundle" rather than a PMC-only list. Also update the
comment above the handling to reflect that empty SubTargets defaults to
bundle-wide behavior.

---

Nitpick comments:
In `@flow/internal/task/componentmanager/compute/nico/nico.go`:
- Around line 330-335: The comment notes deferred compute sub-target routing but
lacks a TODO marker; add an explicit TODO comment above the existing note in
nico.go so it is discoverable for future work. Locate the block referencing
SetFirmwareUpdateTimeWindow, SetMachineAutoUpdate and UpdateComponentFirmware
and prepend a TODO/PENDING marker (e.g., "TODO: route per-sub-target via
UpdateComponentFirmware during version → FW Object identifier migration") so the
intended follow-up is clearly tracked and searchable.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e8e9fee9-22c4-4716-be49-56ceb78839de

📥 Commits

Reviewing files that changed from the base of the PR and between 3537bc0 and e296556.

⛔ Files ignored due to path filters (3)
  • flow/pkg/proto/v1/flow.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • workflow-schema/flow/protobuf/v1/flow.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • workflow-schema/flow/protobuf/v1/flow_grpc.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
📒 Files selected for processing (21)
  • api/pkg/api/handler/rack.go
  • api/pkg/api/handler/tray.go
  • api/pkg/api/handler/util/common/common.go
  • api/pkg/api/model/firmware.go
  • api/pkg/api/model/firmware_test.go
  • flow/internal/converter/protobuf/converter.go
  • flow/internal/service/server_impl.go
  • flow/internal/task/componentmanager/compute/nico/nico.go
  • flow/internal/task/componentmanager/compute/nico/nico_test.go
  • flow/internal/task/componentmanager/nvlswitch/nico/nico.go
  • flow/internal/task/componentmanager/nvlswitch/nvswitchmanager/nvswitchmanager.go
  • flow/internal/task/componentmanager/powershelf/nico/nico.go
  • flow/internal/task/componentmanager/powershelf/psm/psm.go
  • flow/internal/task/operations/operations.go
  • flow/pkg/common/firmwarecomponents/firmwarecomponents.go
  • flow/pkg/common/firmwarecomponents/firmwarecomponents_test.go
  • flow/proto/v1/flow.proto
  • openapi/spec.yaml
  • sdk/standard/model_batch_tray_firmware_update_request.go
  • sdk/standard/model_firmware_update_request.go
  • workflow-schema/flow/proto/v1/flow.proto
🚧 Files skipped from review as they are similar to previous changes (7)
  • api/pkg/api/handler/util/common/common.go
  • api/pkg/api/handler/rack.go
  • api/pkg/api/handler/tray.go
  • flow/internal/task/componentmanager/powershelf/nico/nico.go
  • flow/internal/task/componentmanager/nvlswitch/nico/nico.go
  • flow/pkg/common/firmwarecomponents/firmwarecomponents.go
  • flow/pkg/common/firmwarecomponents/firmwarecomponents_test.go

Comment on lines +104 to +124
name: "valid - siteId only",
request: APIBatchTrayFirmwareUpdateRequest{SiteID: "site-1"},
wantErr: false,
},
{
name: "valid - targets with version",
request: APIBatchTrayFirmwareUpdateRequest{
SiteID: "site-1",
Version: strPtr("24.11.0"),
Targets: []string{"bmc"},
},
wantErr: false,
},
{
name: "invalid - targets without version",
request: APIBatchTrayFirmwareUpdateRequest{
SiteID: "site-1",
Targets: []string{"bmc"},
},
wantErr: true,
},

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Test cases will panic due to nil Filter dereference.

Three test cases ("valid - siteId only" at lines 104-107, "valid - targets with version" at lines 109-116, and "invalid - targets without version" at lines 118-124) create APIBatchTrayFirmwareUpdateRequest without setting the Filter field, leaving it nil. When Validate() is invoked, the code at firmware.go:118 unconditionally calls r.Filter.Validate(), causing a nil pointer panic.

These tests cannot pass until the nil guard is added to APIBatchTrayFirmwareUpdateRequest.Validate().

🛡️ Proposed fix: update tests to set Filter or wait for validation fix

Option 1: Set a valid (or empty) Filter in each test case to avoid nil:

 		{
 			name:    "valid - siteId only",
-			request: APIBatchTrayFirmwareUpdateRequest{SiteID: "site-1"},
+			request: APIBatchTrayFirmwareUpdateRequest{SiteID: "site-1", Filter: &TrayFilter{}},
 			wantErr: false,
 		},
 		{
 			name: "valid - targets with version",
 			request: APIBatchTrayFirmwareUpdateRequest{
 				SiteID:  "site-1",
+				Filter:  &TrayFilter{},
 				Version: strPtr("24.11.0"),
 				Targets: []string{"bmc"},
 			},
 			wantErr: false,
 		},
 		{
 			name: "invalid - targets without version",
 			request: APIBatchTrayFirmwareUpdateRequest{
 				SiteID:  "site-1",
+				Filter:  &TrayFilter{},
 				Targets: []string{"bmc"},
 			},
 			wantErr: true,
 		},

Option 2 (recommended): Fix the nil guard in firmware.go first (as flagged in the separate comment), then these tests will pass as-is.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
name: "valid - siteId only",
request: APIBatchTrayFirmwareUpdateRequest{SiteID: "site-1"},
wantErr: false,
},
{
name: "valid - targets with version",
request: APIBatchTrayFirmwareUpdateRequest{
SiteID: "site-1",
Version: strPtr("24.11.0"),
Targets: []string{"bmc"},
},
wantErr: false,
},
{
name: "invalid - targets without version",
request: APIBatchTrayFirmwareUpdateRequest{
SiteID: "site-1",
Targets: []string{"bmc"},
},
wantErr: true,
},
{
name: "valid - siteId only",
request: APIBatchTrayFirmwareUpdateRequest{SiteID: "site-1", Filter: &TrayFilter{}},
wantErr: false,
},
{
name: "valid - targets with version",
request: APIBatchTrayFirmwareUpdateRequest{
SiteID: "site-1",
Filter: &TrayFilter{},
Version: strPtr("24.11.0"),
Targets: []string{"bmc"},
},
wantErr: false,
},
{
name: "invalid - targets without version",
request: APIBatchTrayFirmwareUpdateRequest{
SiteID: "site-1",
Filter: &TrayFilter{},
Targets: []string{"bmc"},
},
wantErr: true,
},
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api/pkg/api/model/firmware_test.go` around lines 104 - 124, The tests panic
because APIBatchTrayFirmwareUpdateRequest.Validate() calls r.Filter.Validate()
without checking for nil; add a nil guard in
APIBatchTrayFirmwareUpdateRequest.Validate (or ensure r.Filter is non-nil before
calling) so it only calls r.Filter.Validate() when r.Filter != nil, or
initialize r.Filter to an empty/zero-value Filter inside Validate() before
delegation; update the logic around the
APIBatchTrayFirmwareUpdateRequest.Validate method to safely handle nil Filter
and re-run tests.

Comment thread openapi/spec.yaml
syntax = "proto3";

package v1;
option go_package = "github.com/NVIDIA/infra-controller-rest/flow/pkg/proto/v1";

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether the old generated package path is still referenced.
rg -n --type=go 'github.com/NVIDIA/infra-controller-rest/workflow-schema/flow'

# Inspect how many proto files now emit the new go_package path.
rg -n --type=proto 'option go_package = "github.com/NVIDIA/infra-controller-rest/flow/pkg/proto/v1"'

Repository: NVIDIA/infra-controller-rest

Length of output: 4819


Coordinate go_package migration across 36 dependent files.

Line 21 changes the generated Go import path from github.com/NVIDIA/infra-controller-rest/workflow-schema/flow/protobuf/v1 to github.com/NVIDIA/infra-controller-rest/flow/pkg/proto/v1. This breaks source compatibility across 36 files currently importing the old path:

  • site-workflow: 15 files (activity, workflow, grpc packages)
  • site-agent: 1 file (components)
  • api: 20 files (model, handler, utilities)

Regenerating protobuf code with the new go_package option will render all existing imports invalid, causing compilation failure. All downstream consumers must update their import statements atomically with this change.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@workflow-schema/flow/proto/v1/flow.proto` at line 21, The proto go_package
was changed in flow.proto (the go_package option) which will break imports;
regenerate the Go protobuf code from workflow-schema/flow/proto/v1/flow.proto
with the new go_package value, then update all downstream import paths that
reference the old path
(github.com/NVIDIA/infra-controller-rest/workflow-schema/flow/protobuf/v1) to
the new path (github.com/NVIDIA/infra-controller-rest/flow/pkg/proto/v1) across
the 36 consumer files (site-workflow activity/workflow/grpc packages, site-agent
components, and api model/handler/utilities), run go mod tidy and full
test/compile, and submit the changes atomically so consumers compile against the
updated generated package.

@kunzhao-nv kunzhao-nv requested a review from thossain-nv May 16, 2026 06:26
@github-actions

github-actions Bot commented May 16, 2026

Copy link
Copy Markdown

🔍 Container Scan Summary

Service Total Critical High Medium Low Other
nico-flow 66 4 34 18 2 8
nico-nsm 82 2 28 43 9 0
nico-psm 67 4 35 18 2 8
nico-rest-api 100 6 53 30 3 8
nico-rest-cert-manager 65 4 34 18 1 8
nico-rest-db 66 4 34 18 2 8
nico-rest-site-agent 65 4 34 18 1 8
nico-rest-site-manager 65 4 34 18 1 8
nico-rest-workflow 67 4 35 18 2 8
TOTAL 643 36 321 199 23 64

Per-CVE detail lives in the per-service grype-* artifacts (JSON + SARIF). Severity counts only — no CVE IDs published here.

Lets callers restrict a firmware update to a subset of sub-parts within
each targeted tray (e.g. just BMC + NVOS for switch trays, just PSU for
powershelf trays) instead of always installing the whole bundle.

REST surface
------------
- OpenAPI: new optional `targets` array on `FirmwareUpdateRequest` and
  `BatchTrayFirmwareUpdateRequest`. Lowercase names; empty/omitted keeps
  the historical "update everything in the bundle" behavior; unknown
  names are rejected. Requires `version` to be set when non-empty.
  Field is named `targets` (not `components`) to avoid colliding with
  carbide's tray-level "Component" resource vocabulary.
- API model: `APIUpdateFirmwareRequest.Targets`,
  `APIBatchTrayFirmwareUpdateRequest.Targets`, `validateFirmwareTargets`.
- Handlers: `tray.go` (single + batch) plumbs `apiRequest.Targets`
  through `common.ExecuteFirmwareUpdateWorkflow`. `rack.go` continues
  to pass nil (no rack-level sub-target selection - rack requests are
  fanned out to individual tray operations downstream).
- SDK regenerated.

Flow gRPC + Go internals
------------------------
- `flow/proto/v1/flow.proto`: new `repeated string sub_targets = 8;` on
  `UpgradeFirmwareRequest`. Named `sub_targets` (not `components`)
  because `OperationTargetSpec.components` already exists on the same
  request and means "tray INSTANCES to hit"; keeping both as
  `components` would have been ambiguous.
- Regenerated `flow/pkg/proto/v1/flow.pb.go` and
  `workflow-schema/flow/protobuf/v1/flow.pb.go`.
- `operations.FirmwareControlTaskInfo` gains `SubTargets []string`
  (JSON tag `sub_targets`). Populated from `req.GetSubTargets()` in
  `service/server_impl.go` and `converter/protobuf/converter.go`.
- All 5 component-manager FirmwareControl implementations consume
  `info.SubTargets`:
    - nvlswitch/nsm  -> NSM QueueUpdates
    - nvlswitch/nico -> UpdateSwitchFirmwareTarget.Components
    - powershelf/psm -> UpdatePowershelfFirmwareRequest.Components
    - powershelf/nico -> UpdatePowerShelfFirmwareTarget.Components
                        (defaults to PMC when empty, preserving prior
                         behavior)
    - compute/nico  -> currently logs a warning and applies the full
                        bundle; the SetFirmwareUpdateTimeWindow +
                        SetMachineAutoUpdate path has no per-sub-target
                        selection. Will be honored once compute moves
                        to UpdateComponentFirmware.

Centralized name -> enum mapping
--------------------------------
- New package `flow/pkg/common/firmwarecomponents` owns the
  string -> tray-type-specific component-manager-enum mapping for all
  three tray types (NSM NVSwitch, PSM PowerShelf, and the three NICo
  *Component enums). Mappings are explicit hand-written maps rather
  than auto-derived from proto, so adding a new enum value in Core
  requires a deliberate decision about its REST-facing name.
- Completeness-guard unit tests (TestNICo*MapIsComplete) fail if Core
  adds a non-UNKNOWN enum value that the mapping forgot to cover.
- Package name keeps "Component" because it is the semantic destination
  of the parse - the enums in Core are named NvSwitchComponent /
  PowerShelfComponent / ComputeTrayComponent.

Compatibility
-------------
- Proto field number 8 is new (added in this commit), so there are no
  in-flight Temporal workflows or existing gRPC consumers depending on
  any prior name; the JSON-tag/field-name choices are risk-free.
- gRPC wire format only depends on field numbers; the name `sub_targets`
  changes nothing on the wire.
- Empty/omitted `targets` is the legacy behavior, so existing callers
  are unaffected.

Example
-------
PATCH /v2/tray/{id}/firmware
  { "version": "24.11.0", "targets": ["bmc", "nvos"] }

POST /v2/tray/firmware/batch
  { "siteId": "...", "filter": {"type": "switch"},
    "version": "24.11.0", "targets": ["bmc", "nvos"] }
@kunzhao-nv kunzhao-nv force-pushed the feat/firmware-component-selection branch from 1f318d4 to bbada9f Compare May 17, 2026 06:38
Resolve nvswitch/nico FirmwareControl logging conflict (keep sub_targets
field with main's NVSwitch rename). Address review feedback: nil-safe
batch tray Filter validation, OpenAPI targets enum constraint, restore
workflow-schema flow.proto go_package, and regenerate workflow-schema
protobuf + SDK.

Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
@kunzhao-nv kunzhao-nv enabled auto-merge (squash) May 20, 2026 18:28
@kunzhao-nv kunzhao-nv merged commit 81a4068 into main May 20, 2026
52 checks passed
@kunzhao-nv kunzhao-nv deleted the feat/firmware-component-selection branch May 20, 2026 18:35
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants