fix(rest-api): Check IP Usage for requested Prefix before creating or updating Instance#2347
fix(rest-api): Check IP Usage for requested Prefix before creating or updating Instance#2347hwadekar-nv wants to merge 5 commits into
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
WalkthroughAdds batched prefix-usage DAOs for Subnet and VpcPrefix, updates Instance create/update handlers to batch usage lookups and reject requests when requested (or net-new) interfaces would exhaust AvailableIPs, updates Get handlers to use batched usage, and adds tests/fixtures for exhausted Subnet and VPC-prefix scenarios. ChangesIP-exhaustion validation and DAO batching
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
55d3f34 to
bc8f89c
Compare
🔐 TruffleHog Secret Scan✅ No secrets or credentials found! Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉 🕐 Last updated: 2026-06-09 23:05:08 UTC | Commit: 55d3f34 |
🔍 Container Scan Summary
Per-CVE detail lives in the per-service |
bc8f89c to
d095d7e
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
rest-api/api/pkg/api/handler/instance_test.go (2)
3464-3521: ⚡ Quick winAdd mirrored IP-exhaustion tests for
UpdateInstanceHandler.This change set adds create-path exhaustion assertions, but the PR scope also updates update-path exhaustion validation. Add matching update subtests (Subnet exhausted, VPC Prefix exhausted) to keep regression coverage symmetrical.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@rest-api/api/pkg/api/handler/instance_test.go` around lines 3464 - 3521, The test suite added create-path IP-exhaustion cases but missed equivalent update-path tests; add two subtests in instance_test.go that mirror the create cases but exercise UpdateInstanceHandler (use APIInstanceUpdateRequest payloads or the existing update test harness, reference UpdateInstanceHandler / test names like "test Instance update API endpoint failed when subnet IP addresses are exhausted" and "test Instance update API endpoint failed when VPC Prefix IP addresses are exhausted"), using the same fixtures (subnetExhausted, vpcPrefixExhausted, tn1, vpc9, tnu1, tnOrg) and asserting http.StatusBadRequest with respMessage matching the formatted messages used in the create tests so regression coverage is symmetric.
910-913: ⚡ Quick winUse range-over-integer loops for counting fixture rows.
Lines 910 and 1183 use C-style counting loops where
rangecan express intent directly and matches repository convention.Suggested fix
- for i := 0; i < 14; i++ { + for i := range 14 { exhaustInst := testInstanceBuildInstance(t, dbSession, fmt.Sprintf("exhaust-subnet-inst-%d", i), tn1.ID, ip.ID, st1.ID, &istExhaustFixture.ID, vpc1.ID, nil, &os1.ID, nil, cdbm.InstanceStatusReady) testInstanceBuildInstanceInterface(t, dbSession, exhaustInst.ID, &subnetExhausted.ID, nil, nil, cdbm.InterfaceStatusPending) } ... - for i := 0; i < 8; i++ { + for i := range 8 { exhaustInst := testInstanceBuildInstance(t, dbSession, fmt.Sprintf("exhaust-vpcprefix-inst-%d", i), tn1.ID, ip.ID, st1.ID, &istExhaustFixture.ID, vpc9.ID, nil, &os1.ID, nil, cdbm.InstanceStatusReady) testInstanceBuildInstanceInterface(t, dbSession, exhaustInst.ID, nil, &vpcPrefixExhausted.ID, nil, cdbm.InterfaceStatusPending) }As per coding guidelines, counting loops in
rest-api/**/*.goshould preferfor i := range nover C-style index loops.Also applies to: 1183-1186
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@rest-api/api/pkg/api/handler/instance_test.go` around lines 910 - 913, Replace the C-style counting loops with range-over-integer loops: change the for i := 0; i < 14; i++ loops (the ones that call testInstanceBuildInstance and testInstanceBuildInstanceInterface using variable i and subnetExhausted) to use a range over a fixed-length container (e.g., for i := range make([]struct{}, 14)) so it matches repository convention; apply the same change to the second occurrence around the block that begins at the other referenced lines (the loop that also builds fixtures and calls testInstanceBuildInstanceInterface).Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@rest-api/api/pkg/api/handler/instance_test.go`:
- Around line 777-781: The call to testInstanceSiteBuildAllocationContraints
that creates the exhaustion fixture constraint is currently ignoring its return
value; capture the returned value (e.g., constraint or result) from
testInstanceSiteBuildAllocationContraints(al1,
cdbm.AllocationResourceTypeInstanceType, istExhaustFixture.ID,
cdbm.AllocationConstraintTypeReserved, 30, ipu) and add an explicit assertion
(e.g., assert.NotNil / assert.NoError) immediately after to fail fast if fixture
creation fails, referencing istExhaustFixture and the allocation parameters so
the test cannot proceed on a bad setup.
In `@rest-api/api/pkg/api/handler/instance.go`:
- Around line 435-447: The code calls sbDAO.GetPrefixUsage on subnets (loop over
subnets / variable subnets[i], storing subnetUsage) before performing
tenant/site/VPC ownership and scope checks, which leaks allocation state across
tenants and does unnecessary DAO work; update the logic so GetPrefixUsage is
invoked only after the per-interface ownership/readiness/scope validation for
that Subnet/VPC Prefix has passed (i.e., after the existing ownership checks in
the per-interface validation path), move the subnetUsage/exhaustion checks from
the initial subnets loop into that per-interface validation, and add a simple
in-memory cache keyed by Subnet ID (or VPC prefix ID) to avoid duplicate
sbDAO.GetPrefixUsage calls for the same ID during one request (reference
sbDAO.GetPrefixUsage, subnetUsage, subnets[i], and the per-interface validation
code paths).
---
Nitpick comments:
In `@rest-api/api/pkg/api/handler/instance_test.go`:
- Around line 3464-3521: The test suite added create-path IP-exhaustion cases
but missed equivalent update-path tests; add two subtests in instance_test.go
that mirror the create cases but exercise UpdateInstanceHandler (use
APIInstanceUpdateRequest payloads or the existing update test harness, reference
UpdateInstanceHandler / test names like "test Instance update API endpoint
failed when subnet IP addresses are exhausted" and "test Instance update API
endpoint failed when VPC Prefix IP addresses are exhausted"), using the same
fixtures (subnetExhausted, vpcPrefixExhausted, tn1, vpc9, tnu1, tnOrg) and
asserting http.StatusBadRequest with respMessage matching the formatted messages
used in the create tests so regression coverage is symmetric.
- Around line 910-913: Replace the C-style counting loops with
range-over-integer loops: change the for i := 0; i < 14; i++ loops (the ones
that call testInstanceBuildInstance and testInstanceBuildInstanceInterface using
variable i and subnetExhausted) to use a range over a fixed-length container
(e.g., for i := range make([]struct{}, 14)) so it matches repository convention;
apply the same change to the second occurrence around the block that begins at
the other referenced lines (the loop that also builds fixtures and calls
testInstanceBuildInstanceInterface).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: c21bacd8-6fb3-4938-8fae-d0e332fa9d4d
📒 Files selected for processing (2)
rest-api/api/pkg/api/handler/instance.gorest-api/api/pkg/api/handler/instance_test.go
| for i := range subnets { | ||
| subnetIDMap[subnets[i].ID] = &subnets[i] | ||
|
|
||
| // Check if subnet prefix is exhausted | ||
| subnetUsage, err := sbDAO.GetPrefixUsage(ctx, nil, &subnets[i]) | ||
| if err != nil { | ||
| logger.Error().Err(err).Msg("error getting prefix usage for Subnet") | ||
| return cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, "Failed to get prefix usage for Subnet", nil) | ||
| } | ||
| if subnetUsage != nil && subnetUsage.AvailableIPs > 0 && subnetUsage.AcquiredIPs >= subnetUsage.AvailableIPs { | ||
| logger.Warn().Msg(fmt.Sprintf("Ip Addresses for Subnet ID: %v specified in interfaces data in request are exhausted", subnets[i].ID)) | ||
| return cutil.NewAPIErrorResponse(c, http.StatusBadRequest, fmt.Sprintf("Ip Addresses for Subnet ID: %v specified in interfaces data in request are exhausted", subnets[i].ID), nil) | ||
| } |
There was a problem hiding this comment.
Run exhaustion checks after ownership/scope validation to avoid cross-tenant state disclosure.
At Line 439, Line 463, Line 2317, and Line 2340, GetPrefixUsage is called before tenant/site/VPC ownership checks. That allows different responses for foreign IDs (e.g., “exhausted” vs. “not owned”), leaking allocation state across tenants and doing unnecessary DAO work on resources that should be rejected first.
Move exhaustion checks into the per-interface validation path, after ownership/readiness/scope checks pass for that specific Subnet/VPC Prefix (optionally cache usage per ID to avoid duplicate DAO calls).
Also applies to: 459-471, 2316-2326, 2339-2349
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@rest-api/api/pkg/api/handler/instance.go` around lines 435 - 447, The code
calls sbDAO.GetPrefixUsage on subnets (loop over subnets / variable subnets[i],
storing subnetUsage) before performing tenant/site/VPC ownership and scope
checks, which leaks allocation state across tenants and does unnecessary DAO
work; update the logic so GetPrefixUsage is invoked only after the per-interface
ownership/readiness/scope validation for that Subnet/VPC Prefix has passed
(i.e., after the existing ownership checks in the per-interface validation
path), move the subnetUsage/exhaustion checks from the initial subnets loop into
that per-interface validation, and add a simple in-memory cache keyed by Subnet
ID (or VPC prefix ID) to avoid duplicate sbDAO.GetPrefixUsage calls for the same
ID during one request (reference sbDAO.GetPrefixUsage, subnetUsage, subnets[i],
and the per-interface validation code paths).
There was a problem hiding this comment.
@hwadekar-nv, confirmed. The exhaustion checks in both CreateInstanceHandler and UpdateInstanceHandler are now correctly positioned after the ownership, readiness, and VPC-scope guards for each Subnet and VPC Prefix — eliminating the cross-tenant state disclosure risk.
daca9d7 to
5658735
Compare
|
@coderabbitai review |
|
🌿 Preview your docs: https://nvidia-preview-pull-request-2347.docs.buildwithfern.com/infra-controller |
|
@coderabbitai review |
✅ Action performedReview finished.
|
✅ Action performedReview finished.
|
| vpUsage := vpcPrefixUsageMap[vpcPrefixID] | ||
| if vpUsage != nil && vpUsage.AvailableIPs > 0 && vpUsage.AcquiredIPs+uint64(incomingInterfaceIPs)*2 > vpUsage.AvailableIPs { | ||
| logger.Warn().Msg(fmt.Sprintf("Ip Addresses for VPC Prefix ID: %v specified in interfaces data in request are exhausted", vpcPrefixID)) | ||
| return cutil.NewAPIErrorResponse(c, http.StatusBadRequest, fmt.Sprintf("Ip Addresses for VPC Prefix ID: %v specified in interfaces data in request are exhausted", vpcPrefixID), nil) |
There was a problem hiding this comment.
Consider more specific info in the error message:
- the number of IP addresses remaining in the VPC prefix
- the number of interfaces specified in the request (N)
- the number of IP addresses required to create those specified interfaces (2N)
That would help the user better define their interfaces in a way that makes use of the remaining IP space in the prefix if they wish to do so.
Something similar might be helpful for subnets, but those of course are more straightforward in that one interface uses one IP address.
There was a problem hiding this comment.
Thanks @nvlitagaki , since we are raising error for IP Address exhaust, the number of IP addresses required to create those specified interfaces (2N) could be helpful or just return the current usage along with it? cc @thossain-nv
5658735 to
85369bf
Compare
| } | ||
|
|
||
| idb := db.GetIDB(tx, ssd.dbSession) | ||
| ifcCounts, ifcIPs, err := queryEthernetInterfaceIPsForSubnets(ctx, idb, subnetIDs) |
There was a problem hiding this comment.
queryEthernetInterfaceIPsForSubnets is only being called once, let's inline this.
| // queryEthernetInterfaceIPsForSubnet returns iface row count and, for interfaces with IPs, | ||
| // each interface's assigned addresses. COUNT(*) equals len(rows) for the same join/filter on one SELECT. | ||
| func queryEthernetInterfaceIPsForSubnet(ctx context.Context, idb bun.IDB, subnetID uuid.UUID) (ifaceRows int64, ipStrings [][]string, err error) { | ||
| func subnetIPv4CIDR(sn *Subnet) (string, bool) { |
There was a problem hiding this comment.
Make this a GetIPv4CIDR receiver function of Subnet. It should return *string, nil if we can't formulate the CIDR. Let's avoid using free hanging functions which is a telltale sign of AI generated code.
| return nil, err | ||
| } | ||
|
|
||
| func subnetPrefixUsageFromInterfaces(ctx context.Context, cidr string, ifcCount int64, ips [][]string) (*cipam.Usage, error) { |
There was a problem hiding this comment.
I don't think we need ifcCount, it should be the same is len(ips). Also why is ips array of array? Let's flatten them.
| // and, for each row with assigned IPs, a slice of that interface's IPv4 addresses. | ||
| // One SELECT suffices: COUNT(*) equals the number of result rows given the same join/filter. | ||
| func queryEthernetInterfaceIPsForVPCPrefix(ctx context.Context, idb bun.IDB, vpcPrefixID uuid.UUID) (ifaceRows int64, ipStrings [][]string, err error) { | ||
| func vpcPrefixCIDR(vp *VpcPrefix) (string, bool) { |
There was a problem hiding this comment.
Same comments as above.
… updating Instance
689b522 to
1948819
Compare
Description
Type of Change
Testing
Additional Notes