apollo_dashboard: widen l1_message_no_successes alert window to 15m #14318

Open
ShahakShama wants to merge 1 commit into main-v0.14.3 from shahak/l1-message-no-successes-alert-15m

@ShahakShama

Collaborator

Goal: reduce false-positive pages on the L1 message scraper alert.

Since each sequencer moved to a single L1 provider, the
`l1_message_no_successes` alert (no successful scrapes within the
window) fires spuriously: a brief gap from the sole provider trips it
even though scraping recovers shortly after. Widening the evaluation
window from 5m to 15m tolerates these short gaps.

Change: bump the `increase(l1_message_scraper_success_count[...])`
range from 5m to 15m in the alert definition, and regenerate the
matching expression in dev_grafana_alerts.json.
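
For readers without the diff open, here is a rough sketch of what the range bump amounts to. The full rule is not quoted in this conversation, so the `< 1` comparison and the bare metric selector below are assumptions, and `widen_range` is a hypothetical helper, not something in the repo; the real expression may carry label matchers or an aggregation.

```python
import re

METRIC = "l1_message_scraper_success_count"


def widen_range(expr: str, metric: str, new_window: str) -> str:
    """Replace the range selector attached to `metric` in a PromQL expression."""
    # Match `metric[<anything>]` and keep the metric name as group 1.
    pattern = re.compile(rf"({re.escape(metric)})\[[^\]]+\]")
    return pattern.sub(rf"\1[{new_window}]", expr)


# Assumed shape of the old rule; only the 5m range is confirmed by this PR.
old_expr = f"increase({METRIC}[5m]) < 1"
new_expr = widen_range(old_expr, METRIC, "15m")
print(new_expr)
```

Only the range selector changes; the comparison and everything else in the expression are left as they were.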

Follow-up (not in this PR): revert to 5m once a fallback mechanism
that switches back to a specific provider per sequencer is in place.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@ShahakShama ShahakShama requested a review from dan-starkware June 3, 2026 12:20
@cursor

cursor Bot commented Jun 3, 2026

PR Summary

Low Risk
Observability-only tuning of an existing alert threshold/window; no runtime or security behavior changes.

Overview
Widens the l1_message_no_successes alert so it only fires when there are fewer than one successful L1 message scrape over 15 minutes instead of 5 minutes. The Prometheus increase(l1_message_scraper_success_count[...]) range is updated in l1_handlers.rs and the matching rule in dev_grafana_alerts.json.

This is meant to cut false-positive pages after sequencers moved to a single L1 provider, where short provider blips can look like “no successes” in a 5m window even when scraping recovers quickly.
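
A toy simulation may help illustrate the blip scenario described above. It is a simplification under stated assumptions: the timeline is hypothetical (one success per minute with a roughly 7 minute provider gap), the rule is evaluated once a minute, and the window check just counts successes rather than reproducing how Prometheus `increase()` extrapolates over samples. Any `for` duration on the real alert is not modeled.

```python
def alert_fires(success_times, eval_time, window):
    """True if no success timestamp falls in (eval_time - window, eval_time]."""
    return not any(eval_time - window < t <= eval_time for t in success_times)


# Hypothetical timeline in seconds: one success every 60s, except for a gap
# between t=1200 and t=1620 where the sole provider returned nothing.
success_times = [t for t in range(0, 3600, 60) if not (1200 < t < 1620)]

# Evaluate once a minute, starting after a full 15m of history exists.
eval_times = range(900, 3600, 60)
fired_5m = [t for t in eval_times if alert_fires(success_times, t, 5 * 60)]
fired_15m = [t for t in eval_times if alert_fires(success_times, t, 15 * 60)]

print("5m window fires at:", fired_5m)
print("15m window fires at:", fired_15m)
```

In this sketch the 5m window trips during the gap while the 15m window stays quiet. The flip side is that a genuine outage would also take up to 15 minutes to be detected.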

Reviewed by Cursor Bugbot for commit d04044d. Bugbot is set up for automated code reviews on this repo.

ShahakShama commented Jun 3, 2026

Collaborator Author

This stack of pull requests is managed by Graphite.

@ShahakShama ShahakShama force-pushed the shahak/l1-message-no-successes-alert-15m branch from b7d888d to d04044d Compare June 3, 2026 12:21

@dan-starkware dan-starkware left a comment

Collaborator

:lgtm:

@dan-starkware reviewed 2 files and all commit messages, and made 1 comment.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on ShahakShama).
