New public statements design#514
Conversation
robknight
left a comment
There was a problem hiding this comment.
Looks good overall!
There's an off-by-one which needs to be fixed but the other issues are less important, and even if they're valid we could choose to tidy them up later.
| } else { | ||
| 0 | ||
| } + self.statements.iter().filter(|(public, _)| *public).count(); | ||
| if public_count > 2 << BASE_PARAMS.max_depth_public_statements_mt { |
There was a problem hiding this comment.
2 << BASE_PARAMS.max_depth_public_statements_mt is 2048, as we're shifting from 2^1 to 2^11. The 2 should be 1, or use 2.pow(BASE_PARAMS.max_depth_public_statements_mt) (I think the latter is slightly easier to read).
This appears in a few other places.
There was a problem hiding this comment.
oh wow, this was bad, thanks for catching this!
| if self.public_statements.len() > self.params.max_public_statements { | ||
|
|
||
| let public_count = if self.extend_input_pod0_public_statements { | ||
| self.input_pods[0].public_statements.len() |
There was a problem hiding this comment.
This will panic if no input POD is provided, which might be reasonable but could also be caught earlier, e.g. in reveal or prove, or we could prevent the flag being set if there's no input POD.
There was a problem hiding this comment.
good point, I'll add a check in the extend_input_pod0_public_statements method, so that if the flag is true, the input pods are not empty.
| @@ -930,6 +930,8 @@ pub struct BaseParams { | |||
| /// Number of public statements to hash to calculate the public inputs. Must be equal or | |||
| /// greater than `max_public_statements`. | |||
| pub num_public_statements_hash: usize, | |||
There was a problem hiding this comment.
This seems to be unused now.
| #[derive(Serialize, Deserialize)] | ||
| struct Data { | ||
| public_statements: Vec<Statement>, | ||
| // public_statements_mt: Array, |
There was a problem hiding this comment.
This commented line can be removed
| let (_, proof) = empty_pod.pub_raw_statements_mt().prove(0)?; | ||
| let pad_raw_st = empty_pod.pub_raw_statements()[0].clone(); | ||
| InputPodOpenStatement { | ||
| pod_index: params.max_input_pods - 1, |
There was a problem hiding this comment.
I think this means "we have no input statements, therefore the penultimate POD must be empty". This makes sense when max_input_pods is 2: even if we import no statements from it, we might be extending POD 0's statement tree, but POD 1 should be empty in a case where we are not importing any statements.
I wonder if it breaks or is insufficient with values other than 2 for max_input_pods.
It's also possible that the user might have added a second input POD without importing any statements from it, and what they will see here is a proof failure rather than an error message at the frontend level.
There was a problem hiding this comment.
You're right, there are a few cases where this would break. I think a better option is the following:
- if there's a user-defined input pod at index 0, use one of it's statements as padding.
- otherwise use the statement 0 from the empty pod, which is used as padding and will be at index 0.
There was a problem hiding this comment.
although we could have an input pod with an empty statements tree, then there's nothing to open 😢
There was a problem hiding this comment.
I'll add an assert safeguard in the backend MainPod to guarantee that we catch early input pods with empty statements trees (input pods with empty statement trees don't make any sense)
| siblings: data.proof_siblings.clone(), | ||
| }; | ||
| verify_merkle_proof_existence_circuit(builder, &proof); | ||
| let is_intro = builder.not(pod.is_main); |
There was a problem hiding this comment.
Previously we used to verify that each statement coming from an intro POD was either None or a blank intro statement, but we don't check that here. I'm not sure that it's exploitable but it does seem like a looser check.
There was a problem hiding this comment.
I forgot to document this in the PR message. I've added:
- Change how we distinguish MainPod from IntroPod. An IntroPod was previously detected by inspecting the first statement and looking for an intro statement with blank verifier data hash. Now this would require an opening, so I've extended the public inputs to include a field that signals when the pod is main. If the pod is main, it's verifier data hash must be in the VDSet; so no intro can lie about it. If it claims to be intro, when we open a statement we validate that it's indeed intro and place its verifier data hash inside.
Now when opening, if the pod was self-declared as intro, the normalization step will do:
- If statement is None, return None
- otherwise, replace predicate by Intro(vd_hash)
This means that if an intro pod outputs statements that are not None, and are not Intro(blank) they'll just get replaced by Intro(vd_hash). If an end user doesn't recognize that vd_hash they can treat those statements as garbage.
If the intro pod declares itself as main, then the check that its vd_hash is in the VDSet will fail.
| Self::check_replace_value_with_entry(entries, st_in, st_out)? | ||
| } | ||
| (Self::OpenInputStatement(data), st) => { | ||
| let st_hash = st.hash(); |
There was a problem hiding this comment.
There is a problem with statements from input PODs here. An intro POD's Merkle tree contains hash(raw_statement), because the intro POD can't commit to its own vd_hash.
We can check the statement type by matching on Statement::Intro, but we don't know if we are opening a statement from an intro POD (in which case we get a raw statement) or from a MainPod which is republishing an intro statement (in which case we get a normalized statement).
In my "get episode 1 working" branch I had this code:
let raw_hash = data.raw_statement.hash();
let mt_ok = MerkleTree::verify(data.sts_root, &data.proof, &key, &raw_hash.raw()).is_ok();
let st_matches_raw = match (st, &data.raw_statement) {
// Intro statements in raw form (EMPTY_HASH in IntroPredicateRef.verifier_data_hash):
// accept `st` as the normalized version (with the input pod's vd_hash).
(Statement::Intro(st_ir, st_args), Statement::Intro(raw_ir, raw_args))
if raw_ir.verifier_data_hash == EMPTY_HASH =>
{
st_ir.name == raw_ir.name
&& st_ir.args_len == raw_ir.args_len
&& st_args == raw_args
}
// All other cases (chained MainPods storing already-normalized intro statements,
// non-intro statements) require exact equality.
(a, b) => a == b,
};
mt_ok && st_matches_rawThere was a problem hiding this comment.
Thanks for pointing this out, the implementation of the OpenInputStatement check in the middleware was incorrect, as it was not handing the raw statement from an intro input pod correctly. I've fixed it in 85c137d
The circuit version was correct:
pod2/src/backends/plonky2/circuits/mainpod/mod.rs
Lines 368 to 391 in ec044b2
As you can see it uses the raw statement hash to verify the merkle proof, and then it normalizes it to put it into the table that will later be queried in the OpenInput operation.
|
I've made |
Redesign how public statements are handled.
Previously public statements were a hashed list of consecutive statements. We had an area of the circuit to open all the public statements of input pods, and an area to copy statements to be public in the current pod.
Now the public statements live in a tree of parametrizable depth. We remove the area of input pod public statements and the area of current pod public statements. Instead we introduce an operation to open a public statement from an input pod (which consumes a regular statement slot), which is needed when we want to use an input statement. And each statement slot can become a public statement of the current pod, with a limit on the total by a parameter field.
Both the opening input pod statement and the making a statement public use tables, so the number of times such operations can be made is lower than the total number of statements.
Moreover, a pod can start from an empty tree of statements and only output public statements generated in that pod, or it can extend the tree of statements from the first input pod, which allows us to carry statements forward in a recursive chain of pods at no cost.
Aside from this change
CustomPredicateVerifyQueryTarget::sizests_hashin the serialize/deserialize operations, because that value is better derived from the list of statements for a simpler verification flow.self_statementtoraw_statementbecause we no longer have the concept of self in a pod (we only have it in a custom predicate module). Theraw_statementis used to build the statement tree, and for introduction pods it has an empty placeholder instead of their verifier data hash that needs to be replaced by a normalization step.Params:
Circuit size: 2^16 with 389 free gates
Gate count: