
starknet_transaction_prover: proving-job duration + outcome metrics #14168

Open
avi-starkware wants to merge 1 commit into avi/prover-v3/metrics from avi/prover-v3/job-metrics

Conversation

@avi-starkware

Collaborator

Adds Prometheus counters / histograms recorded by VirtualSnosProver for
each proving job: total count by outcome (success, validation_error,
internal_error, l1_provider_error) and end-to-end duration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@cursor

cursor Bot commented May 24, 2026


PR Summary

Low Risk
Observability-only changes around existing proving flow; no auth, data, or proof logic changes beyond recording metrics after each request.

Overview
Adds Prometheus observability for each prove_transaction call: end-to-end latency, per-outcome counters, and sub-step histograms for virtual OS run and Stwo proving.

prove_transaction now delegates to prove_transaction_inner and always records prover_prove_transaction_duration_seconds (success or failure) and prover_prove_transaction_outcome_total{outcome} with a fixed label set (success, failure_validation, failure_blocked, failure_runner, failure_output_parse, failure_proving). VirtualSnosProverError::metric_outcome() maps error variants to those labels to keep cardinality bounded. run_and_prove additionally emits prover_os_run_duration_seconds and prover_stwo_prove_duration_seconds. Metric names and outcome strings live in server::metrics.
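The wrapper pattern described in the summary can be sketched roughly as follows. This is an illustration only, not the PR's code: `ProverError` is a hypothetical stand-in for `VirtualSnosProverError` (whose real variants are not shown on this page), `Recorder` is an in-memory substitute for the Prometheus recorder, and the closure stands in for `prove_transaction_inner`. Only the outcome label strings are taken from the summary above.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for `VirtualSnosProverError`; variant names are assumed.
#[derive(Debug)]
enum ProverError {
    Validation,
    Blocked,
    Runner,
    OutputParse,
    Proving,
}

impl ProverError {
    /// Maps each variant to a fixed label so the `outcome` label set stays bounded.
    fn metric_outcome(&self) -> &'static str {
        match self {
            ProverError::Validation => "failure_validation",
            ProverError::Blocked => "failure_blocked",
            ProverError::Runner => "failure_runner",
            ProverError::OutputParse => "failure_output_parse",
            ProverError::Proving => "failure_proving",
        }
    }
}

/// In-memory stand-in for a metrics recorder (assumption, for illustration).
#[derive(Default)]
struct Recorder {
    outcomes: HashMap<&'static str, u64>,
    durations: Vec<Duration>,
}

/// Runs `inner`, then records duration and outcome on both success and failure.
fn prove_transaction<T, F>(recorder: &mut Recorder, inner: F) -> Result<T, ProverError>
where
    F: FnOnce() -> Result<T, ProverError>,
{
    let start = Instant::now();
    let result = inner();
    // Recorded before returning, so error paths are observed too.
    recorder.durations.push(start.elapsed());
    let outcome = match &result {
        Ok(_) => "success",
        Err(e) => e.metric_outcome(),
    };
    *recorder.outcomes.entry(outcome).or_insert(0) += 1;
    result
}
```

Because the label is derived from the error variant rather than from the error message, the number of distinct `outcome` values cannot grow with the inputs.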

Reviewed by Cursor Bugbot for commit 931d783.


@avi-starkware avi-starkware force-pushed the avi/prover-v3/job-metrics branch from 7d179cb to a0e7299 Compare May 24, 2026 16:48
@avi-starkware avi-starkware force-pushed the avi/prover-v3/metrics branch 2 times, most recently from 5eb413f to 186e4cf Compare May 26, 2026 08:43
@avi-starkware avi-starkware force-pushed the avi/prover-v3/job-metrics branch from a0e7299 to 00671e6 Compare May 26, 2026 08:43
@avi-starkware avi-starkware force-pushed the avi/prover-v3/metrics branch from 186e4cf to 9959caa Compare May 26, 2026 12:16
@avi-starkware avi-starkware force-pushed the avi/prover-v3/job-metrics branch from 00671e6 to 68683f0 Compare May 26, 2026 12:16
@avi-starkware avi-starkware force-pushed the avi/prover-v3/metrics branch from 9959caa to 1733122 Compare May 26, 2026 12:17
@avi-starkware avi-starkware force-pushed the avi/prover-v3/job-metrics branch 2 times, most recently from 00f7551 to b51be4a Compare May 26, 2026 12:58
@avi-starkware avi-starkware force-pushed the avi/prover-v3/metrics branch from 1733122 to 4403df0 Compare May 26, 2026 12:58
@avi-starkware avi-starkware force-pushed the avi/prover-v3/job-metrics branch from b51be4a to 396774b Compare May 26, 2026 16:14
@avi-starkware avi-starkware force-pushed the avi/prover-v3/metrics branch 2 times, most recently from 3e19b68 to f196a34 Compare May 26, 2026 16:47
@avi-starkware avi-starkware force-pushed the avi/prover-v3/job-metrics branch 3 times, most recently from fb01d37 to f1b98e4 Compare May 27, 2026 10:01
@avi-starkware avi-starkware force-pushed the avi/prover-v3/metrics branch 2 times, most recently from 7c12cba to aa40e3d Compare May 27, 2026 10:35
@avi-starkware avi-starkware force-pushed the avi/prover-v3/job-metrics branch 2 times, most recently from 9898133 to 6974938 Compare May 27, 2026 12:55
@avi-starkware avi-starkware force-pushed the avi/prover-v3/metrics branch from aa40e3d to 3790635 Compare May 27, 2026 12:56
@avi-starkware avi-starkware force-pushed the avi/prover-v3/job-metrics branch from 6974938 to d0b3654 Compare May 27, 2026 13:11
@avi-starkware avi-starkware force-pushed the avi/prover-v3/metrics branch 2 times, most recently from 6834e33 to c055b5f Compare May 27, 2026 14:04
@avi-starkware avi-starkware force-pushed the avi/prover-v3/job-metrics branch from d0b3654 to d84bd92 Compare May 27, 2026 14:04
@avi-starkware avi-starkware force-pushed the avi/prover-v3/metrics branch from c055b5f to b9b156a Compare May 27, 2026 14:20
@avi-starkware avi-starkware force-pushed the avi/prover-v3/job-metrics branch 2 times, most recently from b2da23d to ac34a57 Compare May 31, 2026 10:23
@avi-starkware avi-starkware force-pushed the avi/prover-v3/metrics branch from b9b156a to 69e6de4 Compare May 31, 2026 10:23
@avi-starkware avi-starkware force-pushed the avi/prover-v3/job-metrics branch from ac34a57 to 97f32a2 Compare June 1, 2026 08:17
@avi-starkware avi-starkware force-pushed the avi/prover-v3/metrics branch 2 times, most recently from 8abd666 to bca1e04 Compare June 1, 2026 11:18
@avi-starkware avi-starkware force-pushed the avi/prover-v3/job-metrics branch from 97f32a2 to 15f1abf Compare June 1, 2026 11:18
Adds Prometheus counters / histograms recorded by `VirtualSnosProver` for
each proving job: total count by outcome (`success`, `validation_error`,
`internal_error`, `l1_provider_error`) and end-to-end duration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@avi-starkware avi-starkware force-pushed the avi/prover-v3/metrics branch from bca1e04 to bc665d6 Compare June 7, 2026 10:11
@avi-starkware avi-starkware force-pushed the avi/prover-v3/job-metrics branch from 15f1abf to 931d783 Compare June 7, 2026 10:11

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


let prove_duration = prove_start.elapsed();
metrics::histogram!(metric_names::STWO_PROVE_DURATION_SECONDS)
    .record(prove_duration.as_secs_f64());
info!(prove_duration_ms = %prove_duration.as_millis(), "Proving completed");


Sub-step histograms skip failures

Medium Severity

In run_and_prove, OS_RUN_DURATION_SECONDS and STWO_PROVE_DURATION_SECONDS are recorded only after run_virtual_os and prove_virtual_snos_run succeed. When either step returns an error via ?, the elapsed time for that step is never observed, so failure-path latency is missing from sub-step histograms while the end-to-end histogram still includes it.
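One possible way to address this, sketched under assumptions: time each step through a small helper that records the elapsed duration before the result is propagated with `?`. The `timed` helper, the `Vec<Duration>` sinks (standing in for the two histograms), and the simplified `run_and_prove` signature are all hypothetical and are not taken from the PR.

```rust
use std::time::{Duration, Instant};

/// Runs `step` and pushes its elapsed time into `sink` whether it succeeded
/// or failed, then hands the result back so the caller can still use `?`.
fn timed<T, E, F>(sink: &mut Vec<Duration>, step: F) -> Result<T, E>
where
    F: FnOnce() -> Result<T, E>,
{
    let start = Instant::now();
    let result = step();
    sink.push(start.elapsed()); // observed on the error path as well
    result
}

/// Simplified stand-in for `run_and_prove`: `fail_os` simulates an OS-run error.
fn run_and_prove(
    os_run_durations: &mut Vec<Duration>,
    stwo_prove_durations: &mut Vec<Duration>,
    fail_os: bool,
) -> Result<u32, String> {
    let os_output = timed(os_run_durations, || {
        if fail_os {
            Err("os run failed".to_string())
        } else {
            Ok(1u32)
        }
    })?;
    // Only reached when the OS run succeeded.
    let proof = timed(stwo_prove_durations, || Ok::<u32, String>(os_output + 1))?;
    Ok(proof)
}
```

With this shape, a failed OS run still contributes one sample to the OS-run sink, while the proving sink receives a sample only when proving was actually attempted. Whether failure latencies should share a histogram with successes, or be separated by a label, is a design choice the comment above leaves open.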

