Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

triedb/pathdb: track flat state changes in pathdb (snapshot integration pt 2) #30643

Merged
merged 7 commits into from
Nov 29, 2024

Conversation

rjl493456442
Copy link
Member

@rjl493456442 rjl493456442 commented Oct 21, 2024

This pull request ports some changes from the main state snapshot integration one, specifically introducing the flat state tracking in pathdb.

Note, the tracked flat state changes are only held in memory and won't be persisted in the disk. Meanwhile, the correspoding state retrieval in persistent state is also not supported yet. The states management in disk is more complicated and will be implemented in a separate pull request.

Part 1: #30752

triedb/database/database.go Outdated Show resolved Hide resolved
triedb/pathdb/disklayer.go Show resolved Hide resolved
@holiman holiman changed the title triedb/pathdb: track flat state changes in pathdb triedb/pathdb: track flat state changes in pathdb (snapshot integration pt 2) Oct 23, 2024
triedb/pathdb/states.go Outdated Show resolved Hide resolved
triedb/pathdb/states.go Outdated Show resolved Hide resolved
triedb/pathdb/states.go Outdated Show resolved Hide resolved
triedb/pathdb/states.go Outdated Show resolved Hide resolved
triedb/pathdb/states.go Outdated Show resolved Hide resolved
triedb/pathdb/states.go Outdated Show resolved Hide resolved
triedb/pathdb/states.go Outdated Show resolved Hide resolved
}

// encode serializes the content of state set into the provided writer.
func (s *stateSet) encode(w io.Writer) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We typically do not use rlp like this, encoding four times with this type of manual steps. How come you don't use something more auto-generated?
Is there some optimization at play here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not about optimization at all (also performance is not important here).

It's more like a hand-written encode rules. In StateSet, it has several maps and maps are not suitable/supported for RLP encoding.

Therefore, these code tells RLP encoder how to pack the data.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it make sense to define a stateSetMarshalling type then? And, in this encode method, conver the fields to the stateSetMarshalling and then just rlp-encode that struct?
Rigth now it seems you are encoding fields back to back, with no envelope between stateSet items ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes for sure, I can have a try!

triedb/pathdb/states.go Outdated Show resolved Hide resolved
triedb/pathdb/states.go Outdated Show resolved Hide resolved
triedb/pathdb/states.go Outdated Show resolved Hide resolved
return b
}

// revert is the reverse operation of commit. It also merges the provided states
// and trie nodes into the buffer. The key difference is that the provided state
// set should reverse the changes made by the most recent state transition.
func (b *buffer) revert(db ethdb.KeyValueReader, nodes map[common.Hash]map[string]*trienode.Node) error {
func (b *buffer) revert(db ethdb.KeyValueReader, nodes map[common.Hash]map[string]*trienode.Node, accounts map[common.Hash][]byte, storages map[common.Hash]map[common.Hash][]byte) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the semantics about revert a bit confusing. The difference between merge and revert seems mostly to be an expectation on the direction of change.
But for some reason, nodeset.revert and nodeset.merge are quite dissimilar....

So, essentially, the nodes here represent a bunch of trienodes, by path, which were removed at an earlier state progression. And now we want to put them back as they were?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nodes here represent a bunch of trienodes, by path, with the original value, which were modified at an earlier state progression.

At an earlier state progression,

if the node was deleted, then the value in the nodes here will be the original value and it will be reset back here;
if the node was created, then the value in the nodes here will be the null and it will be deleted here
if the node was modified, then the value in the nodes here will be the original value and it will be reset back

All in all, revert will reset the mutated nodes back to their original value.

But for some reason, nodeset.revert and nodeset.merge are quite dissimilar....

Yeah, essentially, nodeset.merge aggregates two nodeset together, overwrite the node value with the later one
node.revert undoes the last state mutation by resetting the node value back

triedb/pathdb/states.go Outdated Show resolved Hide resolved
triedb/pathdb/difflayer.go Outdated Show resolved Hide resolved
triedb/pathdb/difflayer.go Outdated Show resolved Hide resolved
triedb/pathdb/states.go Outdated Show resolved Hide resolved
triedb/pathdb/states.go Show resolved Hide resolved
triedb/pathdb/states.go Outdated Show resolved Hide resolved
triedb/pathdb/states.go Show resolved Hide resolved
triedb/pathdb/states.go Show resolved Hide resolved
triedb/pathdb/states.go Show resolved Hide resolved
triedb/pathdb/states.go Show resolved Hide resolved
}

// account returns the account data associated with the specified address hash.
func (s *stateSet) account(hash common.Hash) ([]byte, bool) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite get this (same for storage below). You do not use the readlock in these methods. Do you assume that the caller handles the readlocking? If so, it seems pretty inconsistent.
They way you do it might be totally fine, but you should document somewhere what assumptions exist, regarding the locking model.
E.g. "It is assumed that calls to account/storage happen via disklayer/difflayer and are already readlocked. The mu is used only to prevent foo bar gazonk... "

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you're more or less doing the same locking as already existing code. E.g. merge happens when the disklayer does commit, and that only happens when the disklayer writelock is held. You can probably ignore my comments about locking

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the stateSet is an internal structure, it can only be held in diffLayer, or the buffer inside of diskLayer.

The stateSet itself doesn't need lock (it's lock free). Whenever the stateSet is modified (bottom-most diffLayer is merged into buffer), the original diskLayer will be marked as stale first, to prevent following access to the original stateSet. Then the new stateSet after mutation will be linked with new diskLayer.

In another word, the associated stateSet can be regarded as immutable (read-only) if the corresponding diffLayer or diskLayer is "unstale".

@@ -244,7 +332,7 @@ func (dl *diskLayer) revert(h *history) (*diskLayer, error) {
// needs to be reverted is not yet flushed and cached in node
// buffer, otherwise, manipulate persistent state directly.
if !dl.buffer.empty() {
err := dl.buffer.revert(dl.db.diskdb, nodes)
err := dl.buffer.revert(dl.db.diskdb, nodes, accounts, storages)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This

If the node
	// buffer is not empty, it means that the state transition that
	// needs to be reverted is not yet flushed and cached in node
	// buffer, otherwise, manipulate persistent state directly.

So, if that happens... A few lines up, we did this:

	// Apply the reverse state changes upon the current state. This must
	// be done before holding the lock in order to access state in "this"
	// layer.
	nodes, err := apply(dl.db, h.meta.parent, h.meta.root, h.accounts, h.storages)

Using the dl.db as the nodereader. But some changes were not already flushed to that nodereader. Does that matter? Or are those two separate things?

Copy link
Member Author

@rjl493456442 rjl493456442 Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whenever a state transition occurs, the state mutation set will be added as a new layer in the pathdb. If there are too many in-memory layers, some of the bottom-most layers will be aggregated and may eventually be flushed to disk. The aggregated or flushed layers might not be visible if they are overwritten by other layers.

For example, if there is a layer x in the buffer (diskLayer) and layer y is aggregated into buffer, then x is no longer visible. Similar, if the current persistent state is x, this state will be invisible if state y is flushed into disk.

In the revert function, we aim to revert the disk layer to previous state. The situations can be categorized into two scenarios:

  • If the buffer of the disk layer contains some layers, we simply undo the aggregation of the last layer in the buffer
  • If the buffer is empty, we undo the persistent state flush of the last layer

The mutation set for the last layer can be constructed using the supplied history object. The construction takes the current disk layer and the history object as inputs, producing the mutation set of the last state transition as the output.

Specifically, the dl.db and h.meta.root donates the current state of disk layer, which is always available!


In summary, we can always construct a mutation set for revert by having the disk layer and a corresponding history object. The constructed mutation set can undo the last state change of disk layer, no matter the state change is cached in the buffer, or applied in the disk.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So either we revert the last thing on disk (easy path) or we need to revert part of the buffer before writing it on disk? Both will be accomplished via the reverse diffs? So the assumption is that the reverse diffs are always available and never merged, but written to disk one by one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So either we revert the last thing on disk (easy path) or we need to revert part of the buffer before writing it on disk?

Yes

Both will be accomplished via the reverse diffs

Yes, no matter the state transition being reverted is in the disk, or it's in the buffer, the relevant state set is all constructed from the reverse diff.

So the assumption is that the reverse diffs are always available and never merged

No and Yes. The reverse diffs might be pruned if it's very old (we keep 90K latest reverse diffs by default and users can specify the number of reverse diffs to keep by themselves). If the reverse diff is not available, then the state revert will be rejected.

The reverse diffs are always stored individually, one by one.

// hash in the slim data format.
//
// Note the returned account is not a copy, please don't modify it.
func (dl *diffLayer) account(hash common.Hash, depth int) ([]byte, error) {
Copy link

@will-2012 will-2012 Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hit := dl.diffed.ContainsHash(accountBloomHash(hash))
if !hit {
hit = dl.diffed.ContainsHash(destructBloomHash(hash))
}
var origin *diskLayer
if !hit {
origin = dl.origin // extract origin while holding the lock
}

If without bloomfilter, will it cause negative impact on performance?

Copy link
Contributor

@holiman holiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

This pull request ports some changes from the main state snapshot
integration one, specifically introducing the flat state tracking
in pathdb.

Note, the tracked flat state changes are only held in memory and
won't be persisted in the disk. Meanwhile, the correspoding state
retrieval in persistent state is also not supported yet. The states
management in disk is more complicated and will be implemented in
a separate pull request.
@fjl fjl removed the status:triage label Nov 28, 2024
Copy link
Member

@MariusVanDerWijden MariusVanDerWijden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

}
delta += len(blob) - len(data)
slots[storageHash] = blob
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So previously an empty blob would result in slots[hash] = nil, while it will be []byte now.
I'm pretty sure its okay, but I just wanted to double check

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we don't really distinguish the nil and []byte{} here.

@rjl493456442 rjl493456442 added this to the 1.14.13 milestone Nov 29, 2024
@rjl493456442 rjl493456442 merged commit 05148d9 into ethereum:master Nov 29, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants