feat(sync): request-driven pre_confirmed polling #3694
Claude finished @RafaelGranza's task in 5m 2s
PR Review
Reviewed against
Important / worth a closer look
Nits
Looks good
Inline comments posted on the specific lines.
	preLatestCacheSize = 10
	// Per-fetch cap so a hung feeder can't hold the running guard for the
	// feeder client's full retry budget (~20s). TODO: consider exposing as flag.
	preConfirmedFetchTimeout = 2 * time.Second
Hardcoded for now. Worth exposing as a flag, or fine to keep at 2s?
	numCallsPreConfirmed atomic.Uint32
	numCallsPending      atomic.Uint32
This is to avoid race conditions during tests
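For readers unfamiliar with the pattern, here is a minimal sketch of why atomic counters help in a test double. The `fakeFeeder` type and `countConcurrentCalls` helper are hypothetical and only illustrate the idea; the two field names are the only parts taken from the diff.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// fakeFeeder is a hypothetical test double. Only the two counter field
// names mirror the diff; everything else is illustrative.
type fakeFeeder struct {
	numCallsPreConfirmed atomic.Uint32
	numCallsPending      atomic.Uint32
}

// PreConfirmed records one call. Add is safe to call from many goroutines,
// so the race detector stays quiet and no increments are lost.
func (f *fakeFeeder) PreConfirmed() {
	f.numCallsPreConfirmed.Add(1)
}

// countConcurrentCalls fires n concurrent calls and returns the final count.
func countConcurrentCalls(n int) uint32 {
	var f fakeFeeder
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			f.PreConfirmed()
		}()
	}
	wg.Wait()
	return f.numCallsPreConfirmed.Load()
}

func main() {
	fmt.Println(countConcurrentCalls(100)) // prints 100
}
```

With a plain `uint32` field and `f.n++`, the same test would be a data race under `go test -race`.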
	preLatestCacheSize = 10
	// Per-fetch cap so a hung feeder can't hold the running guard for the
	// feeder client's full retry budget (~20s). TODO: consider exposing as flag.
	preConfirmedFetchTimeout = 2 * time.Second
Question — 2s timeout vs single feeder attempt.
The feeder client (clients/feeder/feeder.go) sets HTTP timeouts that grow under load (mediumGrowThreshold / IncreaseTimeout). With preConfirmedPollInterval defaulting to 500ms, the cap of 2s is fine for the steady-state case, but a single HTTP attempt that's already paying an increased timeout (say >1s) plus the backoff before the retry will not complete inside 2s. The comment frames this as "can't hold the running guard for the feeder client's full retry budget (~20s)" — but in practice the cap also cancels any retry path, which means under a slow feeder we'll silently never recover within a single fetchPreConfirmed call. The next trigger/tick will restart the attempt from scratch.
That may be the intent (you'd rather retry-from-scratch on the next 500ms tick than wait), but worth being explicit in the comment, and worth confirming this is acceptable on a saturated/sluggish feeder. Two thoughts:
- Consider tying the timeout to max(preConfirmedPollInterval*N, floor) so the cap scales with the configured cadence; a node configured with --preconfirmed-poll-interval=5s probably doesn't want a 2s cap.
- Optionally surface as a flag (the existing TODO hints at this).
IMO, this is NOT a problem, since the ticker will keep trying if the request failed.
Codecov Report❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## thiagodeev/feeder-preconfirmed-optimization #3694 +/- ##
============================================================================
Coverage 76.51% 76.51%
============================================================================
Files 402 402
Lines 36952 36940 -12
============================================================================
- Hits 28273 28264 -9
+ Misses 6697 6685 -12
- Partials 1982 1991 +9
Claude finished @RafaelGranza's task in 3m 13s
PR Review (incremental, only the new commit since prior review)
Only one new commit since the prior review:
New finding (this push)
Nit
Re: author responses on prior findings
Looks good
Claude finished @RafaelGranza's task in 3m 3s
PR Review (incremental, only the new commit since prior review)
Only one new commit since the prior review:
New findings
None. The synctest rewrite is sound and directly addresses the flake nit from the prior review. Tracing the scenario through the bubble:
Two minor things to note (not new — context for the rewrite):
Re: prior-review items still open
Looks good
Claude finished @RafaelGranza's task in 2m 48s
PR Review (incremental, only the new commits since prior review)
Two new commits since the last incremental review (which covered through
New findings
Nit
That's the only new code-level finding this push. The changes in
Re: prior-review items
Minor observations on the rename
Looks good
Claude finished @RafaelGranza's task in 32s
PR Review (incremental, only the new commit since prior review)
Only one new commit since the last incremental review (which covered through
New findings
None. This commit does exactly one thing: it removes the stale four-line
✅ Resolves prior nit: stale rename doc comment.
Status of prior-review items (for reference)
Nothing blocks here. The PR is in good shape.
	if s.preConfirmedFetching.Load() {
		return
	}
Mmm, I don't like this very much. You're trying to avoid multiple requests but I suspect there might be a better way.
-	preConfirmedPollIntervalUsage = "Sets how frequently pre_confirmed block will be updated" +
-		"(0s will disable fetching of pre_confirmed block)."
+	preConfirmedPollIntervalUsage = "Fallback ticker interval for pre_confirmed polling when no " +
+		"RPC traffic is driving refreshes. (0s disables polling.)"
The original description is better, please switch it back
	preLatestCacheSize = 10
	// Per-fetch cap scales with poll interval so slower configs get
	// proportionally longer budgets. Floor avoids pathologically short caps.
	preConfirmedFetchTimeoutMultiplier = 4
We have to work on your text. I have full context on this PR, and this comment still seems unnecessarily hard to understand.
Why not something like:
// Fetch timeout = max(pollInterval * multiplier, floor).
func (s *Synchronizer) preConfirmedFetchTimeout() time.Duration {
	return max(s.preConfirmedPollInterval*preConfirmedFetchTimeoutMultiplier, preConfirmedFetchTimeoutFloor)
}
Additionally, your constants are defined here; why not define them only here and put the comment here too?
Summary
- PreConfirmed() now triggers a fresh pre_confirmed fetch instead of waiting for the next ticker.
- --preconfirmed-poll-interval ticker is now a fallback when there is no RPC traffic to drive fetches.

Benchmark
Numbers come from BenchmarkPreConfirmedUpdateFrequency: a 5-second simulation where one RPC call hits PreConfirmed() every 50ms (mimicking one or more clients dictating a higher request frequency).

Run:
go test -bench=PreConfirmedUpdateFrequency -run=^$ ./sync/

The running guard caught 44-78% of triggers as duplicates, so the request-driven path does not hammer the feeder.