starknet_transaction_prover: /health returns 503 when service is saturated #14171

Open
avi-starkware wants to merge 1 commit into avi/prover-v3/panic-counter from avi/prover-v3/saturation-health

Conversation

@avi-starkware

Collaborator

Adds `SaturationMonitor` (shared by `ProvingRpcServerImpl` and
`HealthLayer`) that tracks whether the concurrency semaphore has been
continuously rejecting proving requests. Once that has held for the
configured window (`health_max_saturated_ms`, default 10s), `/health`
returns 503 with an opaque body so load balancers can drain the pod
before in-flight requests start failing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@cursor

cursor Bot commented May 24, 2026


PR Summary

Medium Risk
Changes load-balancer health semantics for pod draining; proving path only adds monitor hooks on existing reject/accept paths, but mis-tuned thresholds could drain pods early or late.

Overview
Adds a saturation-aware /health endpoint so orchestrators can drain pods when the prover stays at its concurrency limit, instead of /health always returning 200.

A shared SaturationMonitor records when prove_transaction starts rejecting on the semaphore (mark_rejected) and clears that window on a successful permit (mark_accepted). HealthLayer is wired with that monitor and health_max_saturated_ms (default 10s, overridable via config / --health-max-saturated-ms / HEALTH_MAX_SATURATED_MS); after continuous rejections for that duration, GET /health returns 503 with an opaque {"status":"unhealthy","reason":"saturated"} body. Startup passes a configured HealthLayer through HTTP and HTTPS middleware stacks into ProvingRpcServerImpl, with unit tests for the monitor and health behavior.
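The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the field `saturated_since`, the `health_response` helper, the explicit `now` parameters, the `>=` comparison, and the 200 body are all assumptions made for the example. Only the names `SaturationMonitor`, `mark_rejected`, `mark_accepted`, and the 503 body come from the summary.

```rust
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Hypothetical sketch of a saturation monitor; the real struct's fields
/// and method signatures are not shown in this conversation.
#[derive(Debug, Default)]
pub struct SaturationMonitor {
    /// Start of the current run of uninterrupted rejections, if any.
    saturated_since: Mutex<Option<Instant>>,
}

impl SaturationMonitor {
    pub fn new() -> Self {
        Self::default()
    }

    /// Called when the concurrency semaphore rejects a request.
    /// Only the first rejection of a run records a timestamp.
    pub fn mark_rejected(&self, now: Instant) {
        let mut since = self.saturated_since.lock().unwrap();
        if since.is_none() {
            *since = Some(now);
        }
    }

    /// Called when a request obtains a permit; clears the rejection window.
    pub fn mark_accepted(&self) {
        *self.saturated_since.lock().unwrap() = None;
    }

    /// True once rejections have gone uninterrupted for `max_saturated`.
    pub fn is_saturated(&self, now: Instant, max_saturated: Duration) -> bool {
        match *self.saturated_since.lock().unwrap() {
            Some(start) => now.saturating_duration_since(start) >= max_saturated,
            None => false,
        }
    }
}

/// Maps monitor state to an HTTP status and body for GET /health.
/// The 200 body is an assumption; the 503 body is quoted from the summary.
pub fn health_response(
    monitor: &SaturationMonitor,
    now: Instant,
    max_saturated: Duration,
) -> (u16, &'static str) {
    if monitor.is_saturated(now, max_saturated) {
        (503, r#"{"status":"unhealthy","reason":"saturated"}"#)
    } else {
        (200, r#"{"status":"ok"}"#)
    }
}
```

Passing `now` explicitly keeps the sketch deterministic under test. Note one simplification: a rejection followed by no traffic at all leaves the window open here, which may or may not match what the PR does.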

Reviewed by Cursor Bugbot for commit 5cd174d.

@avi-starkware avi-starkware force-pushed the avi/prover-v3/panic-counter branch from cbd1def to e503ebd Compare May 24, 2026 16:48
@avi-starkware avi-starkware force-pushed the avi/prover-v3/saturation-health branch from 318c9c2 to 53381dd Compare May 24, 2026 16:48
@avi-starkware avi-starkware force-pushed the avi/prover-v3/panic-counter branch from e503ebd to db503b7 Compare May 26, 2026 08:43
@avi-starkware avi-starkware force-pushed the avi/prover-v3/saturation-health branch 2 times, most recently from d477f5e to ef3cf0b Compare May 26, 2026 12:16
@avi-starkware avi-starkware force-pushed the avi/prover-v3/panic-counter branch from 1da27e9 to ac98d86 Compare May 26, 2026 12:17
@avi-starkware avi-starkware force-pushed the avi/prover-v3/saturation-health branch from ef3cf0b to eb8da8d Compare May 26, 2026 12:17
@avi-starkware avi-starkware force-pushed the avi/prover-v3/panic-counter branch from ac98d86 to e4bbbdc Compare May 26, 2026 12:58
@avi-starkware avi-starkware force-pushed the avi/prover-v3/saturation-health branch from eb8da8d to e084131 Compare May 26, 2026 12:58
@avi-starkware avi-starkware force-pushed the avi/prover-v3/panic-counter branch from e4bbbdc to 06bb59e Compare May 26, 2026 16:14
@avi-starkware avi-starkware force-pushed the avi/prover-v3/saturation-health branch 2 times, most recently from 171e482 to 158a680 Compare May 26, 2026 16:47
@avi-starkware avi-starkware force-pushed the avi/prover-v3/panic-counter branch from 06bb59e to 0b2c8cc Compare May 26, 2026 16:47
@avi-starkware avi-starkware force-pushed the avi/prover-v3/saturation-health branch from 158a680 to b385d86 Compare May 26, 2026 16:59
@avi-starkware avi-starkware force-pushed the avi/prover-v3/panic-counter branch from 0b2c8cc to 4b1caba Compare May 26, 2026 16:59
@avi-starkware avi-starkware force-pushed the avi/prover-v3/saturation-health branch from b385d86 to a462e96 Compare May 27, 2026 10:01
@avi-starkware avi-starkware force-pushed the avi/prover-v3/panic-counter branch from 4b1caba to 05ed9b4 Compare May 27, 2026 10:01
@avi-starkware avi-starkware force-pushed the avi/prover-v3/saturation-health branch from a462e96 to a83176f Compare May 27, 2026 10:35
@avi-starkware avi-starkware force-pushed the avi/prover-v3/panic-counter branch from 05ed9b4 to 72918b7 Compare May 27, 2026 10:35
@avi-starkware avi-starkware force-pushed the avi/prover-v3/saturation-health branch from a83176f to b4c05a6 Compare May 27, 2026 12:55
@avi-starkware avi-starkware force-pushed the avi/prover-v3/panic-counter branch 2 times, most recently from 74f4f46 to 728f22c Compare May 27, 2026 13:11
@avi-starkware avi-starkware force-pushed the avi/prover-v3/saturation-health branch from b4c05a6 to 2739271 Compare May 27, 2026 13:11
@avi-starkware avi-starkware force-pushed the avi/prover-v3/panic-counter branch from 728f22c to 966f499 Compare May 27, 2026 14:04
@avi-starkware avi-starkware force-pushed the avi/prover-v3/saturation-health branch from 2739271 to 89534f1 Compare May 27, 2026 14:04
@avi-starkware avi-starkware force-pushed the avi/prover-v3/saturation-health branch from 89534f1 to d3f1139 Compare May 27, 2026 14:20

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit d3f1139.

Comment thread crates/starknet_transaction_prover/src/server/rpc_impl.rs
@avi-starkware avi-starkware force-pushed the avi/prover-v3/panic-counter branch from 646fb1e to a77477b Compare May 31, 2026 10:23
@avi-starkware avi-starkware force-pushed the avi/prover-v3/saturation-health branch 2 times, most recently from d15dc19 to def7ea4 Compare June 1, 2026 08:17
@avi-starkware avi-starkware force-pushed the avi/prover-v3/panic-counter branch from a77477b to 8017e9e Compare June 1, 2026 08:17
@avi-starkware avi-starkware force-pushed the avi/prover-v3/saturation-health branch from def7ea4 to b321f22 Compare June 1, 2026 11:18
@avi-starkware avi-starkware force-pushed the avi/prover-v3/panic-counter branch from 8017e9e to 23ed570 Compare June 1, 2026 11:18
…rated

Adds `SaturationMonitor` (shared by `ProvingRpcServerImpl` and
`HealthLayer`) that tracks whether the concurrency semaphore has been
continuously rejecting proving requests. Once that has held for the
configured window (`health_max_saturated_ms`, default 10s), `/health`
returns 503 with an opaque body so load balancers can drain the pod
before in-flight requests start failing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@avi-starkware avi-starkware force-pushed the avi/prover-v3/panic-counter branch from 23ed570 to b7a8e8e Compare June 7, 2026 10:11
@avi-starkware avi-starkware force-pushed the avi/prover-v3/saturation-health branch from b321f22 to 5cd174d Compare June 7, 2026 10:11

2 participants