Add_run_and_prove_recursive_tree#348
Conversation
Adds a new crate (crates/stwo_run_and_prove_recursive_tree) that folds an entire applicative recursive proof tree above N leaf STWO proofs into a single root proof in one binary invocation. Used by SHARP's GPS prover when the parent CairoJob's spec is `StwoInBinaryRecursiveTree` (gated by `enable_in_binary_recursive_tree`), to skip O(n) per-node bootloader round-trips through the GPS pipeline. Crate (876 LOC across lib.rs/main.rs/tests.rs): - `stwo_run_and_prove_recursive_tree()` performs layered 2-to-1 reductions: at each layer pairs of children are verified by a `no-builtin-simulation` simple-bootloader run (Cairo1Executable verifier tasks); odd entries are carried unchanged to the next layer. Outputs the single root proof, its flat program output, its root fact_topologies, and a nested `CompositePackedOutput` JSON for the on-chain unpacker bootloader. - `LeafInput` carries each leaf's proof_path, fact_topology, outputs, Python-computed per-leaf stats (`n_non_recursive_jobs` etc.), and a pre-computed `PackedOutput` so the binary doesn't have to know about whether each leaf is recursive — Python's `packed_output_from_data` handles that and the binary propagates it verbatim. - `LayerEntry.fact_topologies` mirrors what `stwo_run_and_prove` would have written (per-task entries from the bootloader hint), kept on the layer entry for the root disk write only — children's per-task fact_topologies in `CompositePackedOutput` come from the per-layer fact_topologies file written by the bootloader, not from a fabricated trivial value. - `single_page=true` is set on every reduce_pair's bootloader input. cairo-program-runner-lib (`hints/types.rs`): - Adds `Serialize` to `PackedOutput`/`CompositePackedOutput` with `#[serde(tag = "type")]` discriminator matching Python's marshmallow `PackedOutputSchema` (`PlainPackedOutput`/`CompositePackedOutput` rename pairs). - Adds `felt_decimal_vec::serialize` so `CompositePackedOutput.outputs` emits decimal strings (Python's `marshmallow.fields.Integer` chokes on hex and on numbers exceeding the JSON safe-integer range). Wiring: - Cargo workspace registers the new crate. - CI workflow uploads `target/release/stwo_run_and_prove_recursive_tree` to the GCS bucket `stwo_run_and_prove_recursive_tree`.
Codecov Report❌ Patch coverage is
❌ Your patch check has failed because the patch coverage (15.84%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #348 +/- ##
==========================================
- Coverage 53.16% 51.65% -1.51%
==========================================
Files 35 37 +2
Lines 5295 5606 +311
==========================================
+ Hits 2815 2896 +81
- Misses 2480 2710 +230
🚀 New features to boost your workflow:
|
noam-starkware
left a comment
There was a problem hiding this comment.
@noam-starkware made 25 comments.
Reviewable status: 0 of 9 files reviewed, 24 unresolved discussions (waiting on noaov1 and YairVaknin-starkware).
a discussion (no related file):
Please add doc strings to functions that don't have one.
crates/stwo_run_and_prove_recursive_tree/src/tests.rs line 1 at r1 (raw file):
use std::path::PathBuf;
You mentioned you're writing more tests, can you elaborate? Both what unit tests you're adding, and are you planning on adding a bigger flow or e2e test?
crates/stwo_run_and_prove_recursive_tree/src/lib.rs line 1 at r1 (raw file):
//! In-binary recursive proof tree builder.
I read online about comment styles and I saw that //! is sometimes used for crate level doc, but seeing as we don't use it anywhere else in the repo, don't you think we should use /// here for conformity?
crates/stwo_run_and_prove_recursive_tree/src/lib.rs line 11 at r1 (raw file):
//! per-leaf pieces. //! //! The layered reduction is purely sequential: each layer's two-to-one bootloader call consumes
Please specify that the topology of the proof tree is that of a complete tree.
crates/stwo_run_and_prove_recursive_tree/src/lib.rs line 88 at r1 (raw file):
/// Hash function the verifier program should use when hashing the verified program. /// One of "pedersen", "poseidon", "blake". (Currently `verifier_task_for_child` hardcodes /// `"blake"` matching Python's `get_recursive_program_hash_function()`; this field is
Is there a way to add some unit test that ensures that sharp and this lib use the same hash function? Could also be a on the sharp side if easier.
Code quote:
/// One of "pedersen", "poseidon", "blake". (Currently `verifier_task_for_child` hardcodes
/// `"blake"` matching Python's `get_recursive_program_hash_function()`; this field iscrates/stwo_run_and_prove_recursive_tree/src/lib.rs line 97 at r1 (raw file):
/// the full nested Composite tree describing the leaf's own aggregation. The Rust binary /// propagates this verbatim as the LayerEntry's `packed_output` so that the resulting /// root `PackedOutput` matches what the standard layered flow would have produced.
You refer to this elsewhere as legacy flow, please pick a consistent term.
Code quote:
standard layered flowcrates/stwo_run_and_prove_recursive_tree/src/lib.rs line 105 at r1 (raw file):
/// For non-recursive leaves these equal the raw fact-topology values; for recursive leaves /// Python reads the cumulative totals from `CairoJobReceived` (which already aggregated the /// entire sub-tree), so passing them here is the only way to get the correct aggregation.
Delete.
Code quote:
so passing them here is the only way to get the correct aggregationcrates/stwo_run_and_prove_recursive_tree/src/lib.rs line 109 at r1 (raw file):
pub total_non_recursive_output_size: u64, pub total_n_pages: u64, pub total_fact_tree_structures_len: u64,
These fields reappear in the RecursiveJobData (along with outputs). Is there a (simple, nice) way to avoid this duplication?
Code quote:
pub n_non_recursive_jobs: u64,
pub total_non_recursive_output_size: u64,
pub total_n_pages: u64,
pub total_fact_tree_structures_len: u64,crates/stwo_run_and_prove_recursive_tree/src/lib.rs line 116 at r1 (raw file):
/// that the root entry's `RecursiveJobData` is the correct sum across all leaves underneath it. #[derive(Debug, Clone, Default, Serialize)] pub struct RecursiveJobData {
Why doesn't this struct contain the fact topologies dict?
IIUC - you have the data to construct it in the LayerEntry objects (and you get it in LeafEntry for the leaves), and you construct it recursively during the tree building phase. Can't you put it here and keep the same struct as in the main repo?
More IIUC - I know that the dict is mapped according to cairo job ids which you obviously don't have here, but maybe you can create some form of identification based on the tree node?
What is the benefit of having the fact topology in the LayerEntry instead of here?
crates/stwo_run_and_prove_recursive_tree/src/lib.rs line 158 at r1 (raw file):
/// In-memory representation of a single tree node during reduction. At layer 0 these wrap the /// leaves; at higher layers each entry is the result of folding two children. struct LayerEntry {
I don't understand this name. If it represents a single tree node, why not call it TreeNode or something similar? When I first skimmed this PR I assumed this struct represented something for a whole layer in the tree...
Code quote:
LayerEntrycrates/stwo_run_and_prove_recursive_tree/src/lib.rs line 169 at r1 (raw file):
fact_topologies: Vec<FactTopology>, outputs: Vec<Felt252>, aggregated: RecursiveJobData,
Is this a good name for this field? I feel like it's confusing (especially given our current use of the word aggregator). Why not simplyrecursive_job_data?
Code quote:
aggregatedcrates/stwo_run_and_prove_recursive_tree/src/lib.rs line 229 at r1 (raw file):
info!( n_leaves = leaves.len(), leaf_train_ids = ?leaves.iter().map(|l| l.train_id).collect::<Vec<_>>(),
leaf.
Code quote:
lcrates/stwo_run_and_prove_recursive_tree/src/lib.rs line 229 at r1 (raw file):
info!( n_leaves = leaves.len(), leaf_train_ids = ?leaves.iter().map(|l| l.train_id).collect::<Vec<_>>(),
leaves_train_ids.
Code quote:
leaf_train_idscrates/stwo_run_and_prove_recursive_tree/src/lib.rs line 236 at r1 (raw file):
// Dropped (and removed from disk) when this function returns; the only files that outlive // the invocation are the four root-layer outputs the caller asked for. let scratch = TempDir::new()?;
How about adding the option to save temp files in case of failure for debugging? Obviously a TODO and can be postponed to post-MVP.
crates/stwo_run_and_prove_recursive_tree/src/lib.rs line 269 at r1 (raw file):
.collect::<Result<Vec<_>, RecursiveTreeError>>()?; let mut layer_idx: usize = 0;
Any reason not start indexing at 1 and avoid the addition in the call to reduce_pair?
Code quote:
let mut layer_idx: usize = 0crates/stwo_run_and_prove_recursive_tree/src/lib.rs line 315 at r1 (raw file):
let root = current_layer .pop() .expect("layer.len() == 1 after the loop");
Consider replacing with something more informative: vector should contain a single root entry after tree building process or something like that.
Code quote:
"layer.len() == 1 after the loop"crates/stwo_run_and_prove_recursive_tree/src/lib.rs line 345 at r1 (raw file):
let _span = span!(Level::INFO, "reduce_pair", layer_idx, pair_idx).entered(); let pair_dir = scratch_dir.join(format!("layer_{layer_idx:03}_pair_{pair_idx:03}"));
I know this seems excessive, but I think you should add a check that the number of leaves is below 1k to ensure no collisions.
crates/stwo_run_and_prove_recursive_tree/src/lib.rs line 470 at r1 (raw file):
} fn read_outputs_file(path: &PathBuf) -> Result<Vec<Felt252>, RecursiveTreeError> {
Add unit tests for this function.
crates/stwo_run_and_prove_recursive_tree/src/lib.rs line 524 at r1 (raw file):
/// JSON's safe-integer range. Used by `RecursiveJobData.outputs` for the (Serialize-only) debug /// dump from `main.rs`. mod felt_decimal_vec {
Is there any differene between this function and the one defined in the program runner lib? If not, can you use one shared function (maybe in some util file)?
crates/stwo_run_and_prove_recursive_tree/Cargo.toml line 9 at r1 (raw file):
description = "Fold an entire applicative recursive proof tree above its leaves into a single STWO root proof, in one binary invocation." # Explicit [[bin]] so the produced binary file name uses underscores; this is what Python's
Nice!
.github/workflows/upload_artifacts_workflow.yml line 6 at r1 (raw file):
push: branches: - yairv/add_run_and_prove_recursive_tree
Blocking till reverted to main
crates/cairo-program-runner-lib/src/hints/types.rs line 599 at r1 (raw file):
/// Serializes `Vec<Felt252>` as decimal strings so Python's `marshmallow.fields.Integer` can /// parse values larger than JSON's safe integer range without precision loss.
Why does the JSON have a valid range? Can you give a more thorough doc here?
crates/cairo-program-runner-lib/src/hints/types.rs line 600 at r1 (raw file):
/// Serializes `Vec<Felt252>` as decimal strings so Python's `marshmallow.fields.Integer` can /// parse values larger than JSON's safe integer range without precision loss. mod felt_decimal_vec {
Please add unit test for this (especially for the large values mentioned in the doc).
crates/cairo-program-runner-lib/src/hints/types.rs line 601 at r1 (raw file):
/// parse values larger than JSON's safe integer range without precision loss. mod felt_decimal_vec { use cairo_vm::Felt252;
I think the convention throughout the crate is to add imports at the top level. Any reason to do this here inside the module?
crates/stwo_run_and_prove_recursive_tree/src/main.rs line 118 at r1 (raw file):
#[derive(serde::Deserialize)] struct LeavesFile {
Struct seems redundant just for its below. Is this necessary for the serde trait or any other reason to keep it?
Adds a new crate (crates/stwo_run_and_prove_recursive_tree) that folds an entire applicative recursive proof tree above N leaf STWO proofs into a single root proof in one binary invocation. Used by SHARP's GPS prover when the parent CairoJob's spec is
StwoInBinaryRecursiveTree(gated byenable_in_binary_recursive_tree), to skip O(n) per-node bootloader round-trips through the GPS pipeline.Crate (876 LOC across lib.rs/main.rs/tests.rs):
stwo_run_and_prove_recursive_tree()performs layered 2-to-1 reductions: at each layer pairs of children are verified by ano-builtin-simulationsimple-bootloader run (Cairo1Executable verifier tasks); odd entries are carried unchanged to the next layer. Outputs the single root proof, its flat program output, its root fact_topologies, and a nestedCompositePackedOutputJSON for the on-chain unpacker bootloader.LeafInputcarries each leaf's proof_path, fact_topology, outputs, Python-computed per-leaf stats (n_non_recursive_jobsetc.), and a pre-computedPackedOutputso the binary doesn't have to know about whether each leaf is recursive — Python'spacked_output_from_datahandles that and the binary propagates it verbatim.LayerEntry.fact_topologiesmirrors whatstwo_run_and_provewould have written (per-task entries from the bootloader hint), kept on the layer entry for the root disk write only — children's per-task fact_topologies inCompositePackedOutputcome from the per-layer fact_topologies file written by the bootloader, not from a fabricated trivial value.single_page=trueis set on every reduce_pair's bootloader input.cairo-program-runner-lib (
hints/types.rs):SerializetoPackedOutput/CompositePackedOutputwith#[serde(tag = "type")]discriminator matching Python's marshmallowPackedOutputSchema(PlainPackedOutput/CompositePackedOutputrename pairs).felt_decimal_vec::serializesoCompositePackedOutput.outputsemits decimal strings (Python'smarshmallow.fields.Integerchokes on hex and on numbers exceeding the JSON safe-integer range).Wiring:
target/release/stwo_run_and_prove_recursive_treeto the GCS bucketstwo_run_and_prove_recursive_tree.Type
Description
Breaking changes?
This change is