test: (MAG-1597)router provider selection count only availability logic is fragile total valid length and test coverage#2275
Conversation
…selection logic -Router-Provider-Selection-Count-Only-Availability-Logic-Is-Fragile-totalValidLength-and-Test-Coverage?source=github-for-jira
…ount-Only-Availability-Logic-Is-Fragile-totalValidLength-and-Test-Coverage # Please enter a commit message to explain why this merge is necessary, # especially if it merges an updated upstream into a topic branch. # # Lines starting with '#' will be ignored, and an empty message aborts # the commit.
There was a problem hiding this comment.
Pull request overview
Adds regression coverage documenting and validating the “totalValidLength := len(valid)-len(ignored)” fragility in getValidProviderAddresses(...), ensuring the overlap-recount guard prevents false-empty outcomes while still detecting true-empty cases.
Changes:
- Added two unit tests covering the guard behavior for false-empty vs true-empty scenarios.
- Added a unit test demonstrating why naive
len(valid)-len(ignored)arithmetic can be incorrect when ignored providers don’t overlap. - Added verbose
t.Logf(...)output and doc comments to make the scenarios clearer undergo test -v.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Limit the valid pool to 2 real providers so naive subtraction can hit zero. | ||
| csm.validAddresses = []string{pairingList[0].PublicLavaAddress, pairingList[1].PublicLavaAddress} | ||
| csm.addonAddresses = nil | ||
|
|
There was a problem hiding this comment.
getValidProviderAddresses and getValidAddresses are documented as requiring csm to be RLocked (see consumer_session_manager.go around getValidProviderAddresses: “cs.Lock must be Rlocked here” / getValidAddresses: “assuming csm is Rlocked”). These tests call them without taking csm.lock.RLock(), and also mutate csm.validAddresses/csm.addonAddresses without a write lock, which can trigger data races (especially because UpdateAllProviders spawns an async probe goroutine). Wrap the state mutation in csm.lock.Lock() and wrap the reads/calls in csm.lock.RLock() to match the contract and keep the tests race-safe.
| naiveTotalValidLength := len(validAddresses) - len(ignoredProviders) | ||
| actualOverlap := 0 | ||
| for _, address := range validAddresses { | ||
| if _, ok := ignoredProviders[address]; ok { | ||
| actualOverlap++ | ||
| } | ||
| } | ||
| actualRemaining := len(validAddresses) - actualOverlap |
There was a problem hiding this comment.
The overlap-counting loop (actualOverlap / actualRemaining) is duplicated in multiple new tests. Consider extracting it into a small helper (e.g., countOverlap(valid, ignored)) to reduce repetition and keep the intent focused on the guard behavior being asserted.
| providerA := pairingList[0].PublicLavaAddress | ||
| providerB := pairingList[1].PublicLavaAddress | ||
| csm.validAddresses = []string{providerA, providerB} | ||
| csm.addonAddresses = nil | ||
|
|
||
| ignoredProviders := map[string]struct{}{ | ||
| providerA: {}, | ||
| providerB: {}, | ||
| } | ||
|
|
||
| validAddresses := csm.getValidAddresses("", nil, ctx) | ||
| naiveTotalValidLength := len(validAddresses) - len(ignoredProviders) |
There was a problem hiding this comment.
Same locking concern as above: this test mutates csm.validAddresses/csm.addonAddresses and then calls getValidAddresses / getValidProviderAddresses without holding the RWMutex, despite those helpers being documented as requiring the caller to RLock. Please take csm.lock.Lock() for the mutations and csm.lock.RLock() around the getValidAddresses / getValidProviderAddresses calls to avoid races with the async probe goroutine started by UpdateAllProviders.
Codecov Report✅ All modified and coverable lines are covered by tests.
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
Description
Closes: #MAG-1597
This PR documents and validates a fragile provider-availability counting pattern around
totalValidLengthingetValidProviderAddresses(...), and adds focused regression testsfor both the arithmetic mismatch and the current guard behavior.
What changed
protocol/lavasession/consumer_session_manager_test.go:TestTotalValidLength_NaiveArithmeticIsWrongTestGetValidProviderAddresses_TotalValidLengthGuardPreventsFalseEmptyTestGetValidProviderAddresses_TotalValidLengthGuardDetectsTrueEmptyAdded short doc comments and
t.Logf(...)lines to those 3 tests sogo test -vclearly shows GIVEN / calculated values / assertions.
Test run commands: explicit commands to run all 3 tests:
Most critical files to review
protocol/lavasession/consumer_session_manager_test.goValidation performed
Executed focused unit tests in
protocol/lavasession:Author Checklist
All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.
I have...
!in the type prefix if API or client breaking change (N/A - no breaking change)mainbranch (or feature branch first, then merge flow tomainper repo process)Reviewers Checklist
All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.
I have...