feat(consumer): per-chain latency cutoff for random provider selection #2314
Open
EliasiOfir wants to merge 4 commits into
Adds an optional per-endpoint QoS latency cutoff (MaxProviderLatency, in seconds) to RPCEndpoint. During the general random/weighted provider selection, providers whose EWMA latency exceeds the cutoff are put aside via the existing ignored-providers set, so the optimizer never picks them.

Includes a safety fallback: the cutoff is applied only if at least one candidate stays under the threshold; otherwise the full pool is kept so relays keep flowing when the whole pairing is slow.

Cold-start providers (no QoS data) are treated as under-threshold. Static, header-selected, sticky and stateful selection paths are unaffected.

No optimizer/interface changes; the setting rides on csm.rpcEndpoint. 0 (default) disables the cutoff.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
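The filtering rule described in this commit can be sketched roughly as below. This is a minimal illustration, not the PR's code: the function name filterHighLatency, its signature, and the use of plain maps for latencies and the ignored set are all assumptions made for the example, and whether already-ignored providers count toward the "at least one candidate stays under the threshold" check is also assumed here (they do not).

```go
package main

import "fmt"

// filterHighLatency is a hypothetical stand-in for the PR's filter.
// latencies maps provider address -> EWMA latency in seconds; a missing
// entry means the provider is cold-start (no QoS data yet).
// It returns a copy of ignored, extended with the providers to put aside.
func filterHighLatency(candidates []string, latencies map[string]float64,
	ignored map[string]struct{}, cutoffSeconds float64) map[string]struct{} {

	// Always preserve the pre-existing ignored entries.
	out := make(map[string]struct{}, len(ignored))
	for addr := range ignored {
		out[addr] = struct{}{}
	}
	// 0 (or a negative value) disables the cutoff.
	if cutoffSeconds <= 0 {
		return out
	}

	var slow []string
	underThreshold := 0
	for _, addr := range candidates {
		if _, skip := ignored[addr]; skip {
			continue // assumption: already-ignored providers are not counted
		}
		lat, hasQoS := latencies[addr]
		// Strict ">" boundary; cold-start providers count as under-threshold.
		if hasQoS && lat > cutoffSeconds {
			slow = append(slow, addr)
			continue
		}
		underThreshold++
	}

	// Safety fallback: if nobody is under the threshold, keep the full pool.
	if underThreshold == 0 {
		return out
	}
	for _, addr := range slow {
		out[addr] = struct{}{}
	}
	return out
}

func main() {
	latencies := map[string]float64{"p1": 0.2, "p2": 1.0, "p3": 3.5}
	candidates := []string{"p1", "p2", "p3", "p4"} // p4 has no QoS data
	ignored := filterHighLatency(candidates, latencies, nil, 1.0)
	_, p3Ignored := ignored["p3"]
	fmt.Println(len(ignored), p3Ignored) // prints "1 true"
}
```

In this sketch p2 sits exactly on the threshold and is kept, p4 is kept because it has no latency sample, and only p3 ends up in the ignored set.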
Adds focused unit tests for filterHighLatencyProviders covering:
- excluding slow providers
- keeping fast ones
- the all-slow safety fallback
- disabled cutoff (0)
- cold-start (no QoS) handling
- preserving pre-existing ignored entries
- strict ">" threshold boundary
- multi-provider filtering
- a mixed cold-start/slow/fast case

Documents max-provider-latency in the rpcconsumer README and adds a commented example to config/rpcconsumer.yml.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Go tests:
- consumer_session_manager_latency_selection_test.go: drives
getValidProviderAddresses through a stub optimizer to prove the cutoff
applies only to the general random path (sticky/stateful/selected-provider
paths are unaffected).
- rpcconsumer_endpoints_test.go: verifies max-provider-latency parses from
the YAML endpoints config into RPCEndpoint (0/omitted => disabled).
Live E2E (full env: dev chain + mock-backed providers + consumer):
- scripts/pre_setups/init_eth_latency_cutoff.sh: brings up 3 ETH1 providers,
each behind its own mock RPC backend; one mock is slowed so its provider
exceeds the cutoff. Modes: cutoff | regression-disabled | regression-fallback.
- scripts/test/verify_latency_cutoff.sh: drives relays and asserts on the
lava_consumer_provider_selections metric per mode.
- scripts/test/e2e_latency_cutoff.sh: one-command runner (setup -> wait ->
verify -> teardown).
- config/eth_latency_cutoff_consumer{,_disabled}.yml: the two consumer configs
(cutoff 1.0 and disabled 0) selected by mode.
The mock RPC server is left untouched; providers use --use-static-spec with a
verification-stripped ETH1 spec so they accept the mock backend while still
proxying real (latency-bearing) relays.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
Adds an optional per-chain latency cutoff for the consumer's random provider selection. Providers whose measured QoS latency exceeds a configurable max-provider-latency (seconds) are put aside during general random selection, so slow providers stop receiving traffic while faster ones absorb it. Opt-in: max-provider-latency: 0 (or omitted) preserves current behaviour exactly.

Behaviour
- [...] > 0): providers measured above the threshold are excluded from the general random pool.

Changes
- protocol/lavasession/consumer_session_manager.go: latency filtering in the general selection path.
- protocol/lavasession/consumer_types.go: RPCEndpoint.MaxProviderLatency field.
- protocol/rpcconsumer: parse max-provider-latency from the endpoints YAML (defaults to 0).
- config/: example consumer configs (enabled / disabled variants) + rpcconsumer.yml field.
- protocol/rpcconsumer/README.md: docs.
- [...] scripts/.

Testing
- [...] scripts/test/e2e_latency_cutoff.sh all): all 3 modes pass
  - cutoff: slow provider put aside (deltas p1=+46 p2=+54 p3(slow)=+0)
  - regression-disabled: cutoff off → slow provider still selected (p3=+24)
  - regression-fallback: all slow → fallback keeps full pool (+100/+100/+100)
- go build ./... clean; no regressions across x/... and protocol/....
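
The per-mode results reported under Testing suggest a simple check on the per-provider selection deltas. The sketch below is one plausible reading of those checks, written in Go for illustration only: the real verification lives in a shell script, and the type mode, the function checkMode, and the exact pass conditions are assumptions, not taken from scripts/test/verify_latency_cutoff.sh.

```go
package main

import (
	"errors"
	"fmt"
)

// mode names mirror the three modes listed in the PR; the type is hypothetical.
type mode string

const (
	modeCutoff   mode = "cutoff"
	modeDisabled mode = "regression-disabled"
	modeFallback mode = "regression-fallback"
)

// checkMode validates selection-count deltas (after minus before) per provider.
// slow is the provider whose backend was slowed down.
// The pass conditions are assumed, not copied from the actual script.
func checkMode(m mode, deltas map[string]int, slow string) error {
	switch m {
	case modeCutoff:
		// The slow provider must receive nothing; someone else must receive traffic.
		if deltas[slow] != 0 {
			return fmt.Errorf("slow provider %s was selected %d times", slow, deltas[slow])
		}
		for p, d := range deltas {
			if p != slow && d > 0 {
				return nil
			}
		}
		return errors.New("no fast provider was selected")
	case modeDisabled:
		// With the cutoff off, the slow provider should still be picked.
		if deltas[slow] <= 0 {
			return fmt.Errorf("slow provider %s was never selected", slow)
		}
		return nil
	case modeFallback:
		// All providers are slow, so the full pool must keep receiving traffic.
		for p, d := range deltas {
			if d <= 0 {
				return fmt.Errorf("provider %s was never selected", p)
			}
		}
		return nil
	}
	return fmt.Errorf("unknown mode %q", m)
}

func main() {
	// Deltas reported in the PR for the cutoff mode.
	err := checkMode(modeCutoff, map[string]int{"p1": 46, "p2": 54, "p3": 0}, "p3")
	fmt.Println(err) // prints "<nil>"
}
```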