
test: (MAG-1597)router provider selection count only availability logic is fragile total valid length and test coverage #2275

Open
VicSheCodes wants to merge 2 commits into main from MAG-1597-Router-Provider-Selection-Count-Only-Availability-Logic-Is-Fragile-totalValidLength-and-Test-Coverage

Conversation

@VicSheCodes

Contributor

Description

Closes: #MAG-1597

This PR documents and validates a fragile provider-availability counting pattern around
totalValidLength in getValidProviderAddresses(...), and adds focused regression tests
for both the arithmetic mismatch and the current guard behavior.
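The arithmetic mismatch can be illustrated with a minimal standalone sketch. It is not the Lava implementation: the function names naiveRemaining and actualRemaining and the provider strings are hypothetical, chosen only to show why subtracting the two lengths can report zero providers when none of the ignored addresses are actually in the valid list.

```go
package main

import "fmt"

// naiveRemaining mirrors the fragile pattern: it silently assumes that every
// ignored provider is also present in the valid list. (Hypothetical name.)
func naiveRemaining(valid []string, ignored map[string]struct{}) int {
	return len(valid) - len(ignored)
}

// actualRemaining returns the valid providers that are not in the ignored set.
// (Hypothetical name.)
func actualRemaining(valid []string, ignored map[string]struct{}) []string {
	remaining := make([]string, 0, len(valid))
	for _, addr := range valid {
		if _, skip := ignored[addr]; !skip {
			remaining = append(remaining, addr)
		}
	}
	return remaining
}

func main() {
	valid := []string{"providerA", "providerB"}
	// Neither ignored address appears in the valid list, so nothing is excluded.
	ignored := map[string]struct{}{"providerC": {}, "providerD": {}}

	fmt.Println("naive:", naiveRemaining(valid, ignored))        // naive: 0
	fmt.Println("actual:", len(actualRemaining(valid, ignored))) // actual: 2
}
```

The naive count only matches the real count when the ignored set is a subset of the valid list.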

What changed

  1. Added focused unit tests in protocol/lavasession/consumer_session_manager_test.go:
    • TestTotalValidLength_NaiveArithmeticIsWrong
    • TestGetValidProviderAddresses_TotalValidLengthGuardPreventsFalseEmpty
    • TestGetValidProviderAddresses_TotalValidLengthGuardDetectsTrueEmpty
  2. Added short doc comments and t.Logf(...) lines to those three tests so that go test -v
    clearly shows the GIVEN state, the calculated values, and the assertions.

  3. Listed explicit commands to run all three tests:

# Run all three tests with verbose output.
cd /Users/victoria/lava
go test ./protocol/lavasession -run 'TestTotalValidLength_NaiveArithmeticIsWrong|TestGetValidProviderAddresses_TotalValidLengthGuardPreventsFalseEmpty|TestGetValidProviderAddresses_TotalValidLengthGuardDetectsTrueEmpty' -count=1 -v

# Same run, keeping only the t.Logf lines emitted by the test file.
cd /Users/victoria/lava
go test ./protocol/lavasession -run 'TestTotalValidLength_NaiveArithmeticIsWrong|TestGetValidProviderAddresses_TotalValidLengthGuardPreventsFalseEmpty|TestGetValidProviderAddresses_TotalValidLengthGuardDetectsTrueEmpty' -count=1 -v 2>&1 | grep 'consumer_session_manager_test.go:'

Most critical files to review

  • protocol/lavasession/consumer_session_manager_test.go

Validation performed

Executed focused unit tests in protocol/lavasession:

go test ./protocol/lavasession -run '^TestTotalValidLength_NaiveArithmeticIsWrong$' -count=1 -v
go test ./protocol/lavasession -run '^TestGetValidProviderAddresses_TotalValidLengthGuardPreventsFalseEmpty$' -count=1 -v
go test ./protocol/lavasession -run '^TestGetValidProviderAddresses_TotalValidLengthGuardDetectsTrueEmpty$' -count=1 -v

Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

  • read the contribution guide
  • included the correct type prefix in the PR title, you can find examples of the prefixes below:
  • confirmed ! in the type prefix if API or client breaking change (N/A - no breaking change)
  • targeted the main branch (or feature branch first, then merge flow to main per repo process)
  • provided a link to the relevant issue or specification
  • reviewed "Files changed" and left comments if necessary
  • included the necessary unit and integration tests (unit tests added for targeted behavior)
  • updated the relevant documentation or specification, including comments for documenting Go code
  • confirmed all CI checks have passed (pending CI run on PR)

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.

I have...

  • confirmed the correct type prefix in the PR title
  • confirmed all author checklist items have been addressed
  • reviewed state machine logic, API design and naming, documentation is accurate, tests and test coverage

…selection logic

-Router-Provider-Selection-Count-Only-Availability-Logic-Is-Fragile-totalValidLength-and-Test-Coverage?source=github-for-jira
…ount-Only-Availability-Logic-Is-Fragile-totalValidLength-and-Test-Coverage


Copilot AI left a comment


Pull request overview

Adds regression coverage documenting and validating the “totalValidLength := len(valid)-len(ignored)” fragility in getValidProviderAddresses(...), ensuring the overlap-recount guard prevents false-empty outcomes while still detecting true-empty cases.

Changes:

  • Added two unit tests covering the guard behavior for false-empty vs true-empty scenarios.
  • Added a unit test demonstrating why naive len(valid)-len(ignored) arithmetic can be incorrect when ignored providers don’t overlap.
  • Added verbose t.Logf(...) output and doc comments to make the scenarios clearer under go test -v.
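As a rough illustration of what an overlap-recount guard might look like, the sketch below falls back to counting real overlap whenever the naive subtraction reaches zero or below. The function anyProviderLeft is hypothetical and is not taken from getValidProviderAddresses(...); it also assumes the valid list contains no duplicate addresses.

```go
package main

import "fmt"

// anyProviderLeft is a hypothetical stand-in for the guard, not the Lava code.
// It reports whether at least one valid provider is not ignored.
func anyProviderLeft(valid []string, ignored map[string]struct{}) bool {
	// Fast path: with no duplicates in valid, a positive difference means
	// at least one valid address cannot be in the ignored set.
	if len(valid)-len(ignored) > 0 {
		return true
	}
	// Slow path: the naive count hit zero or below, so recount the real overlap
	// before concluding that the pool is empty.
	for _, addr := range valid {
		if _, skip := ignored[addr]; !skip {
			return true
		}
	}
	return false
}

func main() {
	valid := []string{"providerA", "providerB"}

	// False-empty scenario: ignored providers do not overlap the valid list.
	unrelated := map[string]struct{}{"providerC": {}, "providerD": {}}
	fmt.Println(anyProviderLeft(valid, unrelated)) // true

	// True-empty scenario: every valid provider is ignored.
	all := map[string]struct{}{"providerA": {}, "providerB": {}}
	fmt.Println(anyProviderLeft(valid, all)) // false
}
```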


Comment on lines +1073 to +1076
// Limit the valid pool to 2 real providers so naive subtraction can hit zero.
csm.validAddresses = []string{pairingList[0].PublicLavaAddress, pairingList[1].PublicLavaAddress}
csm.addonAddresses = nil

Copilot AI Apr 16, 2026


getValidProviderAddresses and getValidAddresses are documented as requiring csm to be RLocked (see consumer_session_manager.go around getValidProviderAddresses: “cs.Lock must be Rlocked here” / getValidAddresses: “assuming csm is Rlocked”). These tests call them without taking csm.lock.RLock(), and also mutate csm.validAddresses/csm.addonAddresses without a write lock, which can trigger data races (especially because UpdateAllProviders spawns an async probe goroutine). Wrap the state mutation in csm.lock.Lock() and wrap the reads/calls in csm.lock.RLock() to match the contract and keep the tests race-safe.
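The locking discipline described in this comment could look roughly like the sketch below. The sessionManager type and its methods are hypothetical stand-ins rather than the real consumer session manager; only the sync.RWMutex calls come from the Go standard library.

```go
package main

import (
	"fmt"
	"sync"
)

// sessionManager is a hypothetical stand-in for the consumer session manager.
type sessionManager struct {
	lock           sync.RWMutex
	validAddresses []string
}

// getValidAddresses assumes the caller already holds at least a read lock,
// mirroring the documented contract quoted in the review comment.
func (m *sessionManager) getValidAddresses() []string {
	return append([]string(nil), m.validAddresses...)
}

// setValid mutates shared state under the write lock.
func (m *sessionManager) setValid(addrs []string) {
	m.lock.Lock()
	defer m.lock.Unlock()
	m.validAddresses = addrs
}

// snapshotValid takes the read lock before calling the lock-assuming helper.
func (m *sessionManager) snapshotValid() []string {
	m.lock.RLock()
	defer m.lock.RUnlock()
	return m.getValidAddresses()
}

func main() {
	m := &sessionManager{}
	m.setValid([]string{"providerA", "providerB"})
	fmt.Println(m.snapshotValid()) // [providerA providerB]
}
```

Running the tests with go test -race is one way to check whether the unlocked accesses are actually racing with background goroutines.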

Comment on lines +1083 to +1090
naiveTotalValidLength := len(validAddresses) - len(ignoredProviders)
actualOverlap := 0
for _, address := range validAddresses {
	if _, ok := ignoredProviders[address]; ok {
		actualOverlap++
	}
}
actualRemaining := len(validAddresses) - actualOverlap

Copilot AI Apr 16, 2026


The overlap-counting loop (actualOverlap / actualRemaining) is duplicated in multiple new tests. Consider extracting it into a small helper (e.g., countOverlap(valid, ignored)) to reduce repetition and keep the intent focused on the guard behavior being asserted.
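A possible shape for the suggested helper is sketched below. The name countOverlap comes from the comment above, but the signature and body are assumptions based on the duplicated loop quoted in the diff.

```go
package main

import "fmt"

// countOverlap returns how many of the valid addresses also appear in the
// ignored set. The signature is an assumption, not the merged code.
func countOverlap(valid []string, ignored map[string]struct{}) int {
	overlap := 0
	for _, address := range valid {
		if _, ok := ignored[address]; ok {
			overlap++
		}
	}
	return overlap
}

func main() {
	valid := []string{"providerA", "providerB"}
	ignored := map[string]struct{}{"providerB": {}, "providerC": {}}

	overlap := countOverlap(valid, ignored)
	// Prints the overlap and the number of providers that remain usable.
	fmt.Println(overlap, len(valid)-overlap) // 1 1
}
```

Each test could then compute actualRemaining as len(validAddresses) - countOverlap(validAddresses, ignoredProviders) and keep its body focused on the guard assertions.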

Comment on lines +1121 to +1132
providerA := pairingList[0].PublicLavaAddress
providerB := pairingList[1].PublicLavaAddress
csm.validAddresses = []string{providerA, providerB}
csm.addonAddresses = nil

ignoredProviders := map[string]struct{}{
	providerA: {},
	providerB: {},
}

validAddresses := csm.getValidAddresses("", nil, ctx)
naiveTotalValidLength := len(validAddresses) - len(ignoredProviders)

Copilot AI Apr 16, 2026


Same locking concern as above: this test mutates csm.validAddresses/csm.addonAddresses and then calls getValidAddresses / getValidProviderAddresses without holding the RWMutex, despite those helpers being documented as requiring the caller to RLock. Please take csm.lock.Lock() for the mutations and csm.lock.RLock() around the getValidAddresses / getValidProviderAddresses calls to avoid races with the async probe goroutine started by UpdateAllProviders.

@codecov

codecov Bot commented Apr 16, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.

Flag       Coverage Δ
consensus  8.96% <ø> (ø)
protocol   35.31% <ø> (-0.01%) ⬇️

see 2 files with indirect coverage changes


@VicSheCodes VicSheCodes changed the title from "Mag 1597 router provider selection count only availability logic is fragile total valid length and test coverage" to "test:[MAG-1597]:router provider selection count only availability logic is fragile total valid length and test coverage" Apr 16, 2026
@VicSheCodes VicSheCodes changed the title from "test:[MAG-1597]:router provider selection count only availability logic is fragile total valid length and test coverage" to "test(MAG-1597)router provider selection count only availability logic is fragile total valid length and test coverage" Apr 16, 2026
@VicSheCodes VicSheCodes changed the title from "test(MAG-1597)router provider selection count only availability logic is fragile total valid length and test coverage" to "test: (MAG-1597)router provider selection count only availability logic is fragile total valid length and test coverage" Apr 16, 2026
@github-actions


Test Results

0 tests  ±0   0 ✅ ±0   0s ⏱️ ±0s
0 suites ±0   0 💤 ±0 
7 files   ±0   0 ❌ ±0 

Results for commit f183f4c. ± Comparison against base commit 5921aa2.

2 participants