Skip to content
This repository was archived by the owner on Jun 2, 2026. It is now read-only.

feat: Dual-write expected inventory update and delete APIs to Flow#604

Open
kunzhao-nv wants to merge 1 commit into
mainfrom
feat/expected-update-delete-flow-dual-write
Open

feat: Dual-write expected inventory update and delete APIs to Flow#604
kunzhao-nv wants to merge 1 commit into
mainfrom
feat/expected-update-delete-flow-dual-write

Conversation

@kunzhao-nv

@kunzhao-nv kunzhao-nv commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Description

Add UpdateOnFlow and DeleteOnFlow activities for ExpectedRack, ExpectedMachine (single + batch), ExpectedPowerShelf, and ExpectedSwitch, wire them into the existing workflows after the OnSite call as best-effort follow-ups, and register them with the Temporal worker. Reuse Flow's existing PatchRack / DeleteRack / PatchComponent / DeleteComponent RPCs. Extend the activity and workflow tests to cover the new Flow paths.

Type of Change

  • Feature - New feature or functionality (feat:)
  • Fix - Bug fixes (fix:)
  • Chore - Modification or removal of existing functionality (chore:)
  • Refactor - Refactoring of existing functionality (refactor:)
  • Docs - Changes in documentation or OpenAPI schema (docs:)
  • CI - Changes in GitHub workflows. Requires additional scrutiny (ci:)
  • Version - Issuing a new release version (version:)

Services Affected

  • API - API models or endpoints updated
  • Workflow - Workflow service updated
  • DB - DB DAOs or migrations updated
  • Site Manager - Site Manager updated
  • Cert Manager - Cert Manager updated
  • Site Agent - Site Agent updated
  • Flow - Flow service updated
  • Powershelf Manager - Powershelf Manager updated
  • NVSwitch Manager - NVSwitch Manager updated

Related Issues (Optional)

Breaking Changes

  • This PR contains breaking changes

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • No testing required (docs, internal refactor, etc.)

Additional Notes

Add Update*OnFlow and Delete*OnFlow activities for ExpectedRack,
ExpectedMachine (single + batch), ExpectedPowerShelf, and ExpectedSwitch,
wire them into the existing workflows after the OnSite call as best-effort
follow-ups, and register them with the Temporal worker. Reuse Flow's
existing PatchRack / DeleteRack / PatchComponent / DeleteComponent RPCs.
Extend the activity and workflow tests to cover the new Flow paths.

Signed-off-by: Kun Zhao <kunzhao@nvidia.com>
@kunzhao-nv kunzhao-nv requested a review from a team as a code owner June 2, 2026 03:27
@copy-pr-bot

copy-pr-bot Bot commented Jun 2, 2026

Copy link
Copy Markdown

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Walkthrough

This pull request implements a consistent best-effort dual-write pattern for infrastructure resource synchronization, adding Flow system updates across Expected Machine, PowerShelf, Rack, and Switch workflows. Core updates remain authoritative; Flow operations execute post-success with failures demoted to warnings, ensuring workflow resilience.

Changes

Flow Dual-Write for Expected Resources

Layer / File(s) Summary
Activity Registration for Flow Operations
site-agent/pkg/components/managers/expectedmachine/subscriber.go, site-agent/pkg/components/managers/expectedpowershelf/subscriber.go, site-agent/pkg/components/managers/expectedrack/subscriber.go, site-agent/pkg/components/managers/expectedswitch/subscriber.go
Temporal worker now registers OnFlow activities (UpdateExpectedMachineOnFlow, UpdateExpectedMachinesOnFlow, DeleteExpectedMachineOnFlow, etc.) alongside existing OnSite registrations with corresponding success logging.
Expected Machine Flow Activities
site-workflow/pkg/activity/expectedmachine.go, site-workflow/pkg/activity/expectedmachine_test.go
Implements three Flow activities: single-machine update, batch-machine update (with per-item failure tracking), and delete. Introduces shared componentToPatchRequest helper for selective patch field extraction. Tests validate request validation, nil-resource handling, and graceful skipping when Flow client is unconfigured.
Expected PowerShelf Flow Activities
site-workflow/pkg/activity/expectedpowershelf.go, site-workflow/pkg/activity/expectedpowershelf_test.go
Adds UpdateExpectedPowerShelfOnFlow and DeleteExpectedPowerShelfOnFlow with request validation and Flow client presence checks. Unit tests confirm error handling for invalid requests and no-op behavior when Flow is unavailable.
Expected Rack Flow Activities
site-workflow/pkg/activity/expectedrack.go, site-workflow/pkg/activity/expectedrack_test.go
Implements UpdateExpectedRackOnFlow with PatchRack and DeleteExpectedRackOnFlow with soft-delete semantics via DeleteRack. Tests verify non-retryable errors for nil/invalid requests and graceful skipping when Flow client is missing.
Expected Switch Flow Activities
site-workflow/pkg/activity/expectedswitch.go, site-workflow/pkg/activity/expectedswitch_test.go
Adds UpdateExpectedSwitchOnFlow and DeleteExpectedSwitchOnFlow using componentToPatchRequest for patch payload generation. Tests assert error handling for incomplete requests and graceful unavailability handling.
Workflow Best-Effort Dual-Write Integration
site-workflow/pkg/workflow/expectedmachine.go, site-workflow/pkg/workflow/expectedpowershelf.go, site-workflow/pkg/workflow/expectedrack.go, site-workflow/pkg/workflow/expectedswitch.go
All four workflows (update and delete) now invoke Flow activities after Core success. Flow failures logged as warnings; Core failures remain fatal. Updated doc comments describe dual-write best-effort semantics and remove prior TODOs.
Workflow Dual-Write Test Coverage
site-workflow/pkg/workflow/expectedmachine_test.go, site-workflow/pkg/workflow/expectedpowershelf_test.go, site-workflow/pkg/workflow/expectedrack_test.go, site-workflow/pkg/workflow/expectedswitch_test.go
Extended test suites register OnFlow activity mocks in success/failure paths. Added CoreSuccess_FlowFailure test cases per resource type to verify workflows complete successfully when Flow operations fail while Core succeeds.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 39.53% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the primary change: adding dual-write capabilities for expected inventory operations to Flow as follow-up activities.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The pull request description is comprehensive and clearly related to the changeset. It accurately summarizes the addition of UpdateOnFlow and DeleteOnFlow activities across multiple expected-inventory entities, their integration into workflows as best-effort operations, and the extension of test coverage.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/expected-update-delete-flow-dual-write

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown

Test Results

9 884 tests  +51   9 884 ✅ +51   7m 38s ⏱️ +37s
  161 suites ± 0       0 💤 ± 0 
   14 files   ± 0       0 ❌ ± 0 

Results for commit e2b433a. ± Comparison against base commit d3a07c6.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (3)
site-workflow/pkg/workflow/expectedmachine.go (1)

127-131: ⚡ Quick win

Consider dedicated ActivityOptions for the best-effort Flow follow-up.

The Flow activity is invoked with the same ctx as the authoritative Core write, so it inherits MaximumAttempts: 2 and the 2-minute StartToCloseTimeout. Since the underlying activity returns a retryable error (swe.WrapErr) on a genuine Flow RPC failure, a Flow outage will cause this best-effort step to retry and stall completion of an already-successful workflow. For a non-fatal side-effect, a child context with a tighter timeout and MaximumAttempts: 1 would bound the added latency on the hot path.

♻️ Illustrative adjustment
+	flowOpts := options
+	flowOpts.RetryPolicy = &temporal.RetryPolicy{MaximumAttempts: 1}
+	flowOpts.StartToCloseTimeout = 30 * time.Second
+	flowCtx := workflow.WithActivityOptions(ctx, flowOpts)
+
-	err = workflow.ExecuteActivity(ctx, expectedMachineManager.UpdateExpectedMachineOnFlow, request).Get(ctx, nil)
+	err = workflow.ExecuteActivity(flowCtx, expectedMachineManager.UpdateExpectedMachineOnFlow, request).Get(flowCtx, nil)
 	if err != nil {
 		logger.Warn().Err(err).Str("Activity", "UpdateExpectedMachineOnFlow").Msg("Failed to update component on Flow, Core write succeeded")
 	}

The same consideration applies to the batch update (Lines 215-218) and delete (Lines 256-259) paths, as well as the equivalent blocks in the PowerShelf, Rack, and Switch workflows.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@site-workflow/pkg/workflow/expectedmachine.go` around lines 127 - 131, The
Flow follow-up activity (expectedMachineManager.UpdateExpectedMachineOnFlow) is
being called with the parent workflow ctx so it inherits long timeouts and retry
attempts; create a derived activity context with tighter ActivityOptions (e.g.,
smaller StartToCloseTimeout and a RetryPolicy with MaximumAttempts set to 1) and
call workflow.ExecuteActivity using that child context to bound latency for
these best-effort calls; apply the same pattern to the batch update and delete
calls in expectedmachine.go and mirror it in the PowerShelf, Rack, and Switch
workflow blocks.
site-workflow/pkg/activity/expectedmachine.go (1)

685-713: ⚡ Quick win

Inconsistent handling of empty values in patch request builder.

The doc comment states that "Empty / nil values are dropped rather than sent", and fields like FirmwareVersion, Position, Description, and RackId follow this pattern (lines 700-711). However, Bmcs is unconditionally set at line 698 even when c.GetBmcs() returns nil or an empty slice.

For consistency with the documented behavior and to improve the helper's reusability, Bmcs should only be set when non-empty, matching the pattern used for other optional fields.

♻️ Apply consistent empty-value handling
 	req := &flowv1.PatchComponentRequest{
-		Id:   c.GetInfo().GetId(),
-		Bmcs: c.GetBmcs(),
+		Id: c.GetInfo().GetId(),
 	}
+	if bmcs := c.GetBmcs(); len(bmcs) > 0 {
+		req.Bmcs = bmcs
+	}
 	if fv := c.GetFirmwareVersion(); fv != "" {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@site-workflow/pkg/activity/expectedmachine.go` around lines 685 - 713, The
componentToPatchRequest helper sets Bmcs unconditionally which violates the
documented behavior of dropping empty values; modify componentToPatchRequest so
it only assigns req.Bmcs when c.GetBmcs() returns a non-nil, non-empty slice
(i.e., check the length or nil before setting), leaving req.Bmcs unset otherwise
to match how FirmwareVersion, Position, Description and RackId are handled.
site-workflow/pkg/activity/expectedmachine_test.go (1)

725-745: ⚡ Quick win

Test coverage gap: missing nil request validation test.

TestManageExpectedMachine_UpdateExpectedMachinesOnFlow tests the graceful skip behavior when the Flow client is nil or unconnected, but it doesn't include a test case for a nil request. Given that UpdateExpectedMachinesOnFlow is missing nil request validation (flagged separately), adding a test for this scenario would:

  1. Document the expected behavior
  2. Catch the missing validation once it's added
  3. Align with the test pattern used in UpdateExpectedMachineOnFlow (line 694-698) and DeleteExpectedMachineOnFlow (line 748-752)
🧪 Add nil request test case
t.Run("nil request returns non-retryable error", func(t *testing.T) {
	mm := ManageExpectedMachine{flowGrpcAtomicClient: nil}
	err := mm.UpdateExpectedMachinesOnFlow(context.Background(), nil)
	assert.Error(t, err)
})
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@site-workflow/pkg/activity/expectedmachine_test.go` around lines 725 - 745,
Add a test case to TestManageExpectedMachine_UpdateExpectedMachinesOnFlow that
calls UpdateExpectedMachinesOnFlow with a nil request and asserts it returns an
error (non-retryable) to cover the missing nil-request validation; locate the
test block for TestManageExpectedMachine_UpdateExpectedMachinesOnFlow and add a
t.Run similar to the other cases that constructs a ManageExpectedMachine (e.g.,
with flowGrpcAtomicClient nil) calls UpdateExpectedMachinesOnFlow(ctx, nil) and
asserts an error is returned, matching the pattern used in
UpdateExpectedMachineOnFlow and DeleteExpectedMachineOnFlow tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@site-workflow/pkg/activity/expectedmachine.go`:
- Around line 542-590: Add a nil-request guard at the top of
UpdateExpectedMachinesOnFlow: check if request == nil (and also ensure
request.GetExpectedMachines() != nil) before calling
request.GetExpectedMachines().GetExpectedMachines(); if the check fails, log a
warning (using the existing logger) and return early (nil) to match the
defensive pattern used in UpdateExpectedMachineOnFlow and other similar methods
so the function never panics on a nil input.

In `@site-workflow/pkg/activity/expectedrack_test.go`:
- Around line 368-392: Add a test to
TestManageExpectedRack_UpdateExpectedRackOnFlow that verifies
UpdateExpectedRackOnFlow returns a non-retryable error when the request contains
an empty rack_id; specifically, add a subtest named "empty rack_id returns
non-retryable error" that constructs a ManageExpectedRack (e.g., mer :=
ManageExpectedRack{flowGrpcAtomicClient: nil}) and calls
mer.UpdateExpectedRackOnFlow with a cwssaws.ExpectedRack whose RackId.Id == ""
and asserts an error is returned. Place the new subtest alongside the existing
nil request/nil client cases so it runs as part of
TestManageExpectedRack_UpdateExpectedRackOnFlow and mirrors the style of the
DeleteExpectedRackOnFlow empty-id test.

In `@site-workflow/pkg/activity/expectedrack.go`:
- Around line 430-460: UpdateExpectedRackOnFlow currently only checks for nil
request but must also validate request.RackId is non-empty before calling
expectedRackToFlowRack; add the same validation used in
UpdateExpectedRackOnSite/DeleteExpectedRackOnFlow to return a
temporal.NewNonRetryableApplicationError with swe.ErrTypeInvalidRequest (and an
appropriate errors.New("missing rack_id") or similar) when RackId is empty, so
you avoid sending a PatchRack request with an empty ID to the Flow service.

---

Nitpick comments:
In `@site-workflow/pkg/activity/expectedmachine_test.go`:
- Around line 725-745: Add a test case to
TestManageExpectedMachine_UpdateExpectedMachinesOnFlow that calls
UpdateExpectedMachinesOnFlow with a nil request and asserts it returns an error
(non-retryable) to cover the missing nil-request validation; locate the test
block for TestManageExpectedMachine_UpdateExpectedMachinesOnFlow and add a t.Run
similar to the other cases that constructs a ManageExpectedMachine (e.g., with
flowGrpcAtomicClient nil) calls UpdateExpectedMachinesOnFlow(ctx, nil) and
asserts an error is returned, matching the pattern used in
UpdateExpectedMachineOnFlow and DeleteExpectedMachineOnFlow tests.

In `@site-workflow/pkg/activity/expectedmachine.go`:
- Around line 685-713: The componentToPatchRequest helper sets Bmcs
unconditionally which violates the documented behavior of dropping empty values;
modify componentToPatchRequest so it only assigns req.Bmcs when c.GetBmcs()
returns a non-nil, non-empty slice (i.e., check the length or nil before
setting), leaving req.Bmcs unset otherwise to match how FirmwareVersion,
Position, Description and RackId are handled.

In `@site-workflow/pkg/workflow/expectedmachine.go`:
- Around line 127-131: The Flow follow-up activity
(expectedMachineManager.UpdateExpectedMachineOnFlow) is being called with the
parent workflow ctx so it inherits long timeouts and retry attempts; create a
derived activity context with tighter ActivityOptions (e.g., smaller
StartToCloseTimeout and a RetryPolicy with MaximumAttempts set to 1) and call
workflow.ExecuteActivity using that child context to bound latency for these
best-effort calls; apply the same pattern to the batch update and delete calls
in expectedmachine.go and mirror it in the PowerShelf, Rack, and Switch workflow
blocks.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e573d234-b91a-484c-a7fc-a827925828e4

📥 Commits

Reviewing files that changed from the base of the PR and between d3a07c6 and e2b433a.

📒 Files selected for processing (20)
  • site-agent/pkg/components/managers/expectedmachine/subscriber.go
  • site-agent/pkg/components/managers/expectedpowershelf/subscriber.go
  • site-agent/pkg/components/managers/expectedrack/subscriber.go
  • site-agent/pkg/components/managers/expectedswitch/subscriber.go
  • site-workflow/pkg/activity/expectedmachine.go
  • site-workflow/pkg/activity/expectedmachine_test.go
  • site-workflow/pkg/activity/expectedpowershelf.go
  • site-workflow/pkg/activity/expectedpowershelf_test.go
  • site-workflow/pkg/activity/expectedrack.go
  • site-workflow/pkg/activity/expectedrack_test.go
  • site-workflow/pkg/activity/expectedswitch.go
  • site-workflow/pkg/activity/expectedswitch_test.go
  • site-workflow/pkg/workflow/expectedmachine.go
  • site-workflow/pkg/workflow/expectedmachine_test.go
  • site-workflow/pkg/workflow/expectedpowershelf.go
  • site-workflow/pkg/workflow/expectedpowershelf_test.go
  • site-workflow/pkg/workflow/expectedrack.go
  • site-workflow/pkg/workflow/expectedrack_test.go
  • site-workflow/pkg/workflow/expectedswitch.go
  • site-workflow/pkg/workflow/expectedswitch_test.go

Comment on lines +542 to +590
// UpdateExpectedMachinesOnFlow patches multiple Expected Machine components in Flow via PatchComponent.
// Mirrors CreateExpectedMachinesOnFlow's per-item loop: per-item failures are counted
// and logged but never fail the activity, since Flow dual-write is best-effort.
func (mem *ManageExpectedMachine) UpdateExpectedMachinesOnFlow(ctx context.Context, request *cwssaws.BatchExpectedMachineOperationRequest) error {
logger := log.With().Str("Activity", "UpdateExpectedMachinesOnFlow").Logger()

logger.Info().Msg("Starting activity")

if mem.flowGrpcAtomicClient == nil {
logger.Warn().Msg("Flow client not configured, skipping Flow component update")
return nil
}

flowClient := mem.flowGrpcAtomicClient.GetClient()
if flowClient == nil {
logger.Warn().Msg("Flow client not connected, skipping Flow component update")
return nil
}

grpcServiceClient := flowClient.GrpcServiceClient()
machines := request.GetExpectedMachines().GetExpectedMachines()
successes := 0
failures := 0

// TODO(chet): Work with Flow team to add batch support so we don't have to loop here.
for _, machine := range machines {
if machine.GetId().GetValue() == "" {
logger.Warn().Msg("Skipping Expected Machine without id in batch Flow update")
failures++
continue
}
patchReq := componentToPatchRequest(expectedMachineToFlowComponent(machine))
_, err := grpcServiceClient.PatchComponent(ctx, patchReq)
if err != nil {
logger.Warn().Err(err).Str("ID", machine.GetId().GetValue()).Msg("Failed to update Expected Machine component on Flow")
failures++
} else {
successes++
}
}

logger.Info().
Int("Total", len(machines)).
Int("Succeeded", successes).
Int("Failed", failures).
Msg("Completed activity")

return nil
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Missing nil request validation could cause runtime panic.

UpdateExpectedMachinesOnFlow accesses request.GetExpectedMachines().GetExpectedMachines() at line 562 without validating that request itself is non-nil. If the activity is invoked with a nil request, this will panic.

Compare with:

  • UpdateExpectedMachineOnFlow validates request at lines 513-518
  • UpdateExpectedMachinesOnSite validates request at line 724
  • CreateExpectedMachinesOnSite validates request at line 382

Add defensive validation at the start of the method for consistency and resilience.

🛡️ Add nil request validation
 func (mem *ManageExpectedMachine) UpdateExpectedMachinesOnFlow(ctx context.Context, request *cwssaws.BatchExpectedMachineOperationRequest) error {
 	logger := log.With().Str("Activity", "UpdateExpectedMachinesOnFlow").Logger()
 
 	logger.Info().Msg("Starting activity")
+
+	if request == nil {
+		return temporal.NewNonRetryableApplicationError("received empty batch update Expected Machine request for Flow", swe.ErrTypeInvalidRequest, errors.New("nil request"))
+	}
 
 	if mem.flowGrpcAtomicClient == nil {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@site-workflow/pkg/activity/expectedmachine.go` around lines 542 - 590, Add a
nil-request guard at the top of UpdateExpectedMachinesOnFlow: check if request
== nil (and also ensure request.GetExpectedMachines() != nil) before calling
request.GetExpectedMachines().GetExpectedMachines(); if the check fails, log a
warning (using the existing logger) and return early (nil) to match the
defensive pattern used in UpdateExpectedMachineOnFlow and other similar methods
so the function never panics on a nil input.

Comment on lines +368 to +392
func TestManageExpectedRack_UpdateExpectedRackOnFlow(t *testing.T) {
t.Run("nil request returns non-retryable error", func(t *testing.T) {
mer := ManageExpectedRack{flowGrpcAtomicClient: nil}
err := mer.UpdateExpectedRackOnFlow(context.Background(), nil)
assert.Error(t, err)
})

t.Run("nil Flow client skips gracefully", func(t *testing.T) {
mer := ManageExpectedRack{flowGrpcAtomicClient: nil}
err := mer.UpdateExpectedRackOnFlow(context.Background(), &cwssaws.ExpectedRack{
RackId: &cwssaws.RackId{Id: uuid.NewString()},
RackType: uuid.NewString(),
})
assert.NoError(t, err)
})

t.Run("nil Flow client connection skips gracefully", func(t *testing.T) {
mer := ManageExpectedRack{flowGrpcAtomicClient: cClient.NewFlowGrpcAtomicClient(&cClient.FlowGrpcClientConfig{})}
err := mer.UpdateExpectedRackOnFlow(context.Background(), &cwssaws.ExpectedRack{
RackId: &cwssaws.RackId{Id: uuid.NewString()},
RackType: uuid.NewString(),
})
assert.NoError(t, err)
})
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Extend test coverage to validate empty rack_id handling.

TestManageExpectedRack_UpdateExpectedRackOnFlow currently tests nil request, nil Flow client, and unconnected Flow client scenarios, but does not verify behavior when rack_id is empty or missing. In contrast, TestManageExpectedRack_DeleteExpectedRackOnFlow (lines 401–405) includes an "empty rack_id returns non-retryable error" test case.

Once the missing rack_id validation is added to UpdateExpectedRackOnFlow (as flagged in the implementation review), add a corresponding test case to confirm that an empty rack_id returns a non-retryable error.

🧪 Proposed test case
	t.Run("empty rack_id returns non-retryable error", func(t *testing.T) {
		mer := ManageExpectedRack{flowGrpcAtomicClient: nil}
		err := mer.UpdateExpectedRackOnFlow(context.Background(), &cwssaws.ExpectedRack{
			RackId:   &cwssaws.RackId{Id: ""},
			RackType: uuid.NewString(),
		})
		assert.Error(t, err)
	})
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@site-workflow/pkg/activity/expectedrack_test.go` around lines 368 - 392, Add
a test to TestManageExpectedRack_UpdateExpectedRackOnFlow that verifies
UpdateExpectedRackOnFlow returns a non-retryable error when the request contains
an empty rack_id; specifically, add a subtest named "empty rack_id returns
non-retryable error" that constructs a ManageExpectedRack (e.g., mer :=
ManageExpectedRack{flowGrpcAtomicClient: nil}) and calls
mer.UpdateExpectedRackOnFlow with a cwssaws.ExpectedRack whose RackId.Id == ""
and asserts an error is returned. Place the new subtest alongside the existing
nil request/nil client cases so it runs as part of
TestManageExpectedRack_UpdateExpectedRackOnFlow and mirrors the style of the
DeleteExpectedRackOnFlow empty-id test.

Comment on lines +430 to +460
func (mer *ManageExpectedRack) UpdateExpectedRackOnFlow(ctx context.Context, request *cwssaws.ExpectedRack) error {
logger := log.With().Str("Activity", "UpdateExpectedRackOnFlow").Logger()

logger.Info().Msg("Starting activity")

if request == nil {
return temporal.NewNonRetryableApplicationError("received empty update Expected Rack request for Flow", swe.ErrTypeInvalidRequest, errors.New("nil request"))
}

if mer.flowGrpcAtomicClient == nil {
logger.Warn().Msg("Flow client not configured, skipping Flow expected rack update")
return nil
}

grpcClient := mer.flowGrpcAtomicClient.GetClient()
if grpcClient == nil {
logger.Warn().Msg("Flow client not connected, skipping Flow expected rack update")
return nil
}
grpcServiceClient := grpcClient.GrpcServiceClient()

rack := expectedRackToFlowRack(request)
_, err := grpcServiceClient.PatchRack(ctx, &flowv1.PatchRackRequest{Rack: rack})
if err != nil {
logger.Warn().Err(err).Msg("Failed to update Expected Rack on Flow")
return swe.WrapErr(err)
}

logger.Info().Msg("Completed activity")
return nil
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add validation for required rack_id field before Flow conversion.

UpdateExpectedRackOnFlow validates only that request is non-nil but does not check whether rack_id is present or non-empty. In contrast:

  • The sibling method UpdateExpectedRackOnSite (line 256–257) explicitly validates that rack_id is not empty.
  • The peer method DeleteExpectedRackOnFlow (lines 474–476) validates that rack_id is not empty.

Without this check, an empty rack_id will propagate through expectedRackToFlowRack and yield a Flow PatchRack request with an empty ID, which will fail at the service layer with a less clear error message.

As per coding guidelines, validation should enforce every rule the wire format relies on before the conversion step.

🛡️ Proposed fix: validate rack_id presence
 func (mer *ManageExpectedRack) UpdateExpectedRackOnFlow(ctx context.Context, request *cwssaws.ExpectedRack) error {
 	logger := log.With().Str("Activity", "UpdateExpectedRackOnFlow").Logger()
 
 	logger.Info().Msg("Starting activity")
 
 	if request == nil {
 		return temporal.NewNonRetryableApplicationError("received empty update Expected Rack request for Flow", swe.ErrTypeInvalidRequest, errors.New("nil request"))
 	}
+	if request.GetRackId().GetId() == "" {
+		return temporal.NewNonRetryableApplicationError("received update Expected Rack request for Flow without required rack_id field", swe.ErrTypeInvalidRequest, errors.New("missing rack_id"))
+	}
 
 	if mer.flowGrpcAtomicClient == nil {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func (mer *ManageExpectedRack) UpdateExpectedRackOnFlow(ctx context.Context, request *cwssaws.ExpectedRack) error {
logger := log.With().Str("Activity", "UpdateExpectedRackOnFlow").Logger()
logger.Info().Msg("Starting activity")
if request == nil {
return temporal.NewNonRetryableApplicationError("received empty update Expected Rack request for Flow", swe.ErrTypeInvalidRequest, errors.New("nil request"))
}
if mer.flowGrpcAtomicClient == nil {
logger.Warn().Msg("Flow client not configured, skipping Flow expected rack update")
return nil
}
grpcClient := mer.flowGrpcAtomicClient.GetClient()
if grpcClient == nil {
logger.Warn().Msg("Flow client not connected, skipping Flow expected rack update")
return nil
}
grpcServiceClient := grpcClient.GrpcServiceClient()
rack := expectedRackToFlowRack(request)
_, err := grpcServiceClient.PatchRack(ctx, &flowv1.PatchRackRequest{Rack: rack})
if err != nil {
logger.Warn().Err(err).Msg("Failed to update Expected Rack on Flow")
return swe.WrapErr(err)
}
logger.Info().Msg("Completed activity")
return nil
}
func (mer *ManageExpectedRack) UpdateExpectedRackOnFlow(ctx context.Context, request *cwssaws.ExpectedRack) error {
logger := log.With().Str("Activity", "UpdateExpectedRackOnFlow").Logger()
logger.Info().Msg("Starting activity")
if request == nil {
return temporal.NewNonRetryableApplicationError("received empty update Expected Rack request for Flow", swe.ErrTypeInvalidRequest, errors.New("nil request"))
}
if request.GetRackId().GetId() == "" {
return temporal.NewNonRetryableApplicationError("received update Expected Rack request for Flow without required rack_id field", swe.ErrTypeInvalidRequest, errors.New("missing rack_id"))
}
if mer.flowGrpcAtomicClient == nil {
logger.Warn().Msg("Flow client not configured, skipping Flow expected rack update")
return nil
}
grpcClient := mer.flowGrpcAtomicClient.GetClient()
if grpcClient == nil {
logger.Warn().Msg("Flow client not connected, skipping Flow expected rack update")
return nil
}
grpcServiceClient := grpcClient.GrpcServiceClient()
rack := expectedRackToFlowRack(request)
_, err := grpcServiceClient.PatchRack(ctx, &flowv1.PatchRackRequest{Rack: rack})
if err != nil {
logger.Warn().Err(err).Msg("Failed to update Expected Rack on Flow")
return swe.WrapErr(err)
}
logger.Info().Msg("Completed activity")
return nil
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@site-workflow/pkg/activity/expectedrack.go` around lines 430 - 460,
UpdateExpectedRackOnFlow currently only checks for nil request but must also
validate request.RackId is non-empty before calling expectedRackToFlowRack; add
the same validation used in UpdateExpectedRackOnSite/DeleteExpectedRackOnFlow to
return a temporal.NewNonRetryableApplicationError with swe.ErrTypeInvalidRequest
(and an appropriate errors.New("missing rack_id") or similar) when RackId is
empty, so you avoid sending a PatchRack request with an empty ID to the Flow
service.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant