Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core/state: add code to state reader #30808

Closed
wants to merge 5 commits into from
Closed

Conversation

s1na
Copy link
Contributor

@s1na s1na commented Nov 25, 2024

Contract code is technically also a part of the state. This PR adds code to the state reader interface. We also add code size because code size is separately cached in the cachingDB (although I think the overhead of computing the code size on the consumer end should not be too bad).

The motivation is #30441 where we are overriding the state reader interface to hook into state read events (account load, storage load and code load). You can see how that is done here: s1na@a22b30b#diff-abb7c3ba4a7ea95d325d17a5c93b0ebde2c5b43e67f2bc3eb0f76a0c5fc7a8cfR335-R337

// ContractCode returns the code associated with a particular account.
//
// - Returns an empty code if it does not exist
// - It can return an error to indicate code doesn't exist
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks consistency to the rest of the interface. I have done this to avoid any semantics change in this PR.

I am a bit puzzled why the statedb is recording "not found" errors for code but not for storage or account.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It says "empty code if it does not exist". Pls clarify.
It says "can return an error to indicate code doesn't exist". Well, does it, or not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry clarified. It will return an error if code is not found.

core/state/reader.go Outdated Show resolved Hide resolved
for _, reader := range r.readers {
// Skip state reader as it doesn't provide contract code and
// always returns an error.
if _, ok := reader.(*trieReader); ok {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a hack: snapshot reader can never return code. So we skip it here and only focus on trie reader.

  • If the state reader returns empty result and no error the multi reader can confuse it for empty code.
  • If the state reader returns error it can cause multi reader to always return an error and confuse consumer of the Reader.

Comment on lines 53 to 54
// - It returns an error to indicate code doesn't exist
// - The returned code is safe to modify after the call
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want the docs to be really clear on semantics here. Here, you say that an error is returned if code doesn't exist. So, does it return errors if you ask for the contract code of anything which has no code? Or only if the account does not exist in the trie?

I mean, there are more points that could be clarified too:
How about asking for the code of a contract which is currently marked as selfdestructed? How about during initcode execution, or just after creation?

Also, the implementations (snapshot) seem to return error just because they are unable to provide the code. That seems like a spec-flaw. The API doesn't allow the implementor to signal "I'm unable to do this", thus forcing it to signal "the code doesn't exist", which is not necessarily the case. An API-wart, IMO

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not beautiful, but you could e.g. add another error, so the underlying code can signal NotSupportedErr or something

Copy link
Contributor Author

@s1na s1na Nov 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want the docs to be really clear on semantics here. Here, you say that an error is returned if code doesn't exist. So, does it return errors if you ask for the contract code of anything which has no code? Or only if the account does not exist in the trie?

I'm not sure how to formulate this better. What happens is we query disk by code hash. Address is not used at all. I believe it is kept in the (cachingDB) interface because of verkle.

So the semantic is return value only if code with that hash is found and it has size > 0.

How about asking for the code of a contract which is currently marked as selfdestructed? How about during initcode execution, or just after creation?

The reader doesn't care about evm semantics as it consults disk directly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we should not have address in there... ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the implementations (snapshot) seem to return error just because they are unable to provide the code. That seems like a spec-flaw. The API doesn't allow the implementor to signal "I'm unable to do this", thus forcing it to signal "the code doesn't exist", which is not necessarily the case. An API-wart, IMO

Right. I have now added a method "SupportsCodeQuery() bool" to the interface which should return true if a reader allows you to fetch code. I'm not particularly fond of the name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we should not have address in there... ?

Summoning @rjl493456442 as I tried to stay consistent with state.Database

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The address was added for verkle. In verkle, the contract code is tight with the contract.

@rjl493456442
Copy link
Member

One thing I am not happy with this implementation is: for all reader implementations, now they are tight with code reader.

Theoretically, we can divide reader into several different parts: state reader (account reader, storage reader) and code reader. The existing readers (state reader and trie reader) are actually the implementations of account reader and storage reader.

Please give me a bit more time to figure out the proper interface definition here. Ideally, it would be:

type Reader interface {
    AccountReader
    StorageReader
    CodeReader
}

And we can somehow combine different types of reader together and assemble them as a unified Reader.

@s1na
Copy link
Contributor Author

s1na commented Nov 28, 2024

Closing in favor of #30816

@s1na s1na closed this Nov 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants