Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

updating verify proof range to handle empty proof keys #1901

Open
wants to merge 29 commits into
base: main
Choose a base branch
from

Conversation

rianhughes
Copy link
Contributor

@rianhughes rianhughes commented Jun 7, 2024

This PR implements VerifyRangeProof.

It builds on an old PR (1873) which didn't handle empty proof keys, hence the name of this PR.

closes #1895
replaces / closes: #1873

Copy link

codecov bot commented Jun 7, 2024

Codecov Report

Attention: Patch coverage is 71.50538% with 106 lines in your changes missing coverage. Please review.

Project coverage is 75.26%. Comparing base (d3c4dd0) to head (1004a9e).

Files Patch % Lines
core/trie/proof.go 67.35% 31 Missing and 32 partials ⚠️
core/trie/trie.go 77.85% 16 Missing and 15 partials ⚠️
core/trie/node.go 69.23% 7 Missing and 5 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1901      +/-   ##
==========================================
- Coverage   75.48%   75.26%   -0.22%     
==========================================
  Files          97       97              
  Lines        8341     8655     +314     
==========================================
+ Hits         6296     6514     +218     
- Misses       1517     1566      +49     
- Partials      528      575      +47     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@rianhughes rianhughes changed the base branch from main to rianhughes/proof-range June 7, 2024 17:04
@rianhughes rianhughes changed the base branch from rianhughes/proof-range to main June 7, 2024 17:05
@rianhughes rianhughes marked this pull request as ready for review June 12, 2024 07:57
@rianhughes rianhughes mentioned this pull request Jun 12, 2024
@rianhughes rianhughes force-pushed the rianhughes/proof-range-test branch 2 times, most recently from db9bf2c to 66e7d38 Compare June 12, 2024 08:25
@rianhughes rianhughes marked this pull request as draft June 12, 2024 08:40
@rianhughes rianhughes marked this pull request as ready for review June 12, 2024 09:16
@rianhughes rianhughes requested a review from kirugan June 12, 2024 09:17
Comment on lines 341 to 344
// It will contain the hashes of the children along the path, but only the key of the children
// along the path. The final node must contain the hash of the leaf if the leaf is set.
// It will not contain the leaf node even if it is set.
func ProofToPath(proofNodes []ProofNode, leafKey *Key, hashF hashFunc) ([]StorageNode, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code is much more readable 🎉 Great job!

Now, I cannot understand what is written in the fn's comment.
So we are given proofNodes and constructing storageNodes following along the path to the single leaf node specified by given leafKey.
These storageNodes will have hashes of the children and keys? - not get this "but only the key of the children along the path".

Also: "The final node must contain the hash of the leaf ..." - which node is the final node, root or the nodes at the bottom of the trie 🤔

And: "It will not contain the leaf node even if it is set."
Hmm, the leaf is set because we have the leafKey right? Also the leaf node refers to the storageNodes we are constructing here.

Can you please try to improve this comment?

Copy link
Contributor Author

@rianhughes rianhughes Jun 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These storageNodes will have hashes of the children and keys? - not get this "but only the key of the children along the path".

The storageNodes will have the hashes of the children, but only the key of the child along the path outlined by the proof.

Also: "The final node must contain the hash of the leaf ..." - which node is the final node, root or the nodes at the bottom of the trie 🤔

Ah sorry, this is a confusing sentence. I was just trying to comment that we don't include the leaf node in the generated storage nodes. But if the path outlined by the proof leads the whole way to the leaf, and the leaf is set, then we should include it's hash. (This is kind of redundant given the above sentence)

Hmm, the leaf is set because we have the leafKey right?

Nope. The Key that the proof node points to may not be set, but we can still have the key.

Also the leaf node refers to the storageNodes we are constructing here.

Yup

Copy link
Contributor Author

@rianhughes rianhughes Jun 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the comment

parentBinary := parent.Binary
return parentBinary.LeftHash, parentBinary.RightHash
} else {
parentBinary := proofNodes[parentInd+1].Binary
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need ro check that proofNodes[parentInd+1].Binary != nil as we did with the previous node?
also parentInd+1 < len(proofNodes) 🤔

Maybe we should assign the parent and have a single return:

parent := &proofNodes[parentInd]
if parent.Binary == nil {
    parent := &proofNodes[parentInd+1]
}
return parent.Binary.LeftHash, parent.Binary.RightHash

Copy link
Contributor Author

@rianhughes rianhughes Jun 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need ro check that proofNodes[parentInd+1].Binary != nil as we did with the previous node?

Nope, if proofNode[i] is an edge then we know proofNode[i+1] is Binary (it's not possible for an edge node to follow an edge node).

Maybe we should assign the parent and have a single return

Yup, this is much nicer!

parentInd+1 < len(proofNodes)

Good point. Let me think about that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parentInd+1 < len(proofNodes)

I added a check for this

@rianhughes rianhughes requested a review from pnowosie June 12, 2024 13:01
// and therefore it's hash won't match the expected root.
// ref: https://github.com/ethereum/go-ethereum/blob/v1.14.3/trie/proof.go#L484
func VerifyRangeProof(root *felt.Felt, keys, values []*felt.Felt, proofKeys [2]*Key, proofValues [2]*felt.Felt,
proofs [2][]ProofNode, hash hashFunc,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is proofKeys and proofValues? What is the difference from proofs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proofs are the set of nodes from the root to the boundary / proof keys and values

Copy link
Contributor Author

@rianhughes rianhughes Jun 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I explicitly labelled the proof keys and values for clarity (I find this is a bit easier to understand than placing the proof key as the first element in keys iff the proof key has actually been set - like seems to be done in geth)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still not quite sure... In geth code, the firstKey parameter I'm guessing is there because the left proof should be generated based on the start key query (not actual first leaf that exist). Thats a little nuances.. No idea where proofKeys and proofValues come from. Remember, only one proof array, which in geth case is added like here https://github.com/ethereum/go-ethereum/blob/master/eth/protocols/snap/sync.go#L2816. The proof is like a bundle of nodes data, no idea which node is located where (the key). You only know the state root, and iterate from there to recalculate the path, well, at least nethermind do it that way..

Copy link
Contributor Author

@rianhughes rianhughes Jun 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the left proof should be generated based on the start key query (not actual first leaf that exist)

Currently the left proof key would just correspond to the start key query. And the last proofKey would correspond to the end key query. Similarly with the proof Values (which can be nil if they don't exist)

Remember, only one proof array, ..

Yup, but that should be a simple matter of merging the proof nodes into a single array, and then splitting them again, right? I've opened an issue for this

Copy link
Contributor Author

@rianhughes rianhughes Jun 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, regarding there only being a single proof, geth does store the proof nodes in a key-value database, but I believe there are two boundary proofs, which you can see here. I've kept them separate for clarity.

Afaiu, when there are key-value pairs to the left and right of the given range we need two boundary proofs. This is because we need to compute the correct shape of the sub-trie, and store the relevant hashes so that we can recompute the root hash for that range / sub trie. Having only a single boundary proof would no allow us to do this.

@rianhughes rianhughes requested a review from asdacap June 14, 2024 11:18
}

leftHashB := n.LeftHash.Bytes()
wrote, err = buf.Write(leftHashB[:])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an encoding change. Is this intended?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I figured it made sense to store the LeftHash and RightHash fields in the node type

@rianhughes
Copy link
Contributor Author

@asdacap, @pnowosie in geth, the VerifyRangeProof executes the verification logic, but also logic to determine if there are more leaves to the right (see here). Imo it would be neater to separate these out into two different functions, hence why I didn't insert it in this implementation of VerifyRangeProof. What are your thoughts?

@asdacap
Copy link
Contributor

asdacap commented Jun 27, 2024

You can separate them internally if you want. But at high level, I think its easier to combine them, otherwise, need to pass the keys of proof explicitly, which is implementation details IMO.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

VerifyProofRange non-set and inner nodes
3 participants