[Geyser Plugin] Provide account state diffs instead of entire account (discussion) #35487

Closed
bmuddha opened this issue Apr 3, 2024 · 1 comment
Labels
community Community contribution

Comments


bmuddha commented Apr 3, 2024

Problem

Currently, the Geyser plugin interface provides implementers with the full account state, which may or may not have changed since it was last received. While this is suitable for most use cases, it is wildly inefficient in general, as it wastes a lot of bandwidth, storage, and processing resources.

There would be a huge performance benefit if, instead of the entire account state, the Geyser plugin interface supplied only diffs between the previous (finalized) and the recently changed account state. For the most common use of Geyser plugins, streaming accounts to external storage, this would greatly reduce the size of each update, which in turn reduces bandwidth consumption and the processing required to serialize and deserialize data, among other benefits.

Proposed Solution

In its current implementation, the plugin's update_account method is invoked from accountsdb (several levels up the call stack). If we can somehow get access to BankRc, it should be possible to load the previous version of the account (e.g. with the load_account_with_fixed_root method). There may be other, more efficient ways to obtain the previous account state that I'm not aware of.

From there it should be pretty straightforward to obtain the diff between the previous and the recent state.
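As a rough illustration of that last step, here is a minimal sketch of diffing the data bytes of two account versions. The names `DataPatch` and `diff_data` are hypothetical and not part of the Geyser interface or any Solana crate. The sketch also assumes the account does not shrink; a real design would need to carry the new length as well.

```rust
/// One modified region of account data: where it starts and the new bytes.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
pub struct DataPatch {
    pub offset: usize,
    pub bytes: Vec<u8>,
}

/// Compare two versions of account data and return the changed regions.
/// Assumption: if `new` is longer than `prev`, the extra tail is emitted as
/// one patch; truncation (a shorter `new`) is not represented here.
pub fn diff_data(prev: &[u8], new: &[u8]) -> Vec<DataPatch> {
    let mut patches = Vec::new();
    let common = prev.len().min(new.len());
    let mut i = 0;
    while i < common {
        if prev[i] != new[i] {
            // Extend the patch over the whole run of differing bytes.
            let start = i;
            while i < common && prev[i] != new[i] {
                i += 1;
            }
            patches.push(DataPatch {
                offset: start,
                bytes: new[start..i].to_vec(),
            });
        } else {
            i += 1;
        }
    }
    // Bytes appended past the end of the previous version.
    if new.len() > common {
        patches.push(DataPatch {
            offset: common,
            bytes: new[common..].to_vec(),
        });
    }
    patches
}
```

A byte-by-byte scan like this is linear in the account size, which is the overhead the proposal would add on the validator side.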

The diff could be as simple as a tag for each modified field (some predefined number) mapped to the value of that field, for example:
Field tags:

  1. lamports
  2. owner
  3. executable
  4. rent_epoch
  5. data

If only the lamports field (tag 1) has changed, the whole diff consists of 9 bytes: 1 byte for the tag and 8 bytes for the value of the lamports field.
The data field (tag 5) can be represented as a list of offset:len:bytes triples, which can be used to reconstruct the modifications if the previous account state is available at the destination.
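To make the encoding concrete, here is a minimal sketch of a tag-plus-value layout using the field tags listed above. The `AccountDiff` type, the little-endian byte order, the 4-byte offset and length widths, and the triple count that follows the data tag are all assumptions made for illustration. None of them comes from the Geyser interface.

```rust
// Field tags from the list above.
const TAG_LAMPORTS: u8 = 1;
const TAG_OWNER: u8 = 2;
const TAG_EXECUTABLE: u8 = 3;
const TAG_RENT_EPOCH: u8 = 4;
const TAG_DATA: u8 = 5;

/// Only the fields that changed; `None` (or an empty `data`) means unchanged.
/// (Hypothetical type, for illustration only.)
#[derive(Default)]
pub struct AccountDiff {
    pub lamports: Option<u64>,
    pub owner: Option<[u8; 32]>,
    pub executable: Option<bool>,
    pub rent_epoch: Option<u64>,
    /// (offset, bytes) pairs; the length is written out when encoding.
    pub data: Vec<(u32, Vec<u8>)>,
}

impl AccountDiff {
    /// Serialize as a sequence of `tag || value` entries.
    pub fn encode(&self) -> Vec<u8> {
        let mut out = Vec::new();
        if let Some(v) = self.lamports {
            out.push(TAG_LAMPORTS);
            out.extend_from_slice(&v.to_le_bytes());
        }
        if let Some(v) = self.owner {
            out.push(TAG_OWNER);
            out.extend_from_slice(&v);
        }
        if let Some(v) = self.executable {
            out.push(TAG_EXECUTABLE);
            out.push(v as u8);
        }
        if let Some(v) = self.rent_epoch {
            out.push(TAG_RENT_EPOCH);
            out.extend_from_slice(&v.to_le_bytes());
        }
        if !self.data.is_empty() {
            out.push(TAG_DATA);
            // Assumed framing: number of triples, then offset:len:bytes each.
            out.extend_from_slice(&(self.data.len() as u32).to_le_bytes());
            for (offset, bytes) in &self.data {
                out.extend_from_slice(&offset.to_le_bytes());
                out.extend_from_slice(&(bytes.len() as u32).to_le_bytes());
                out.extend_from_slice(bytes);
            }
        }
        out
    }
}
```

With this layout, a diff in which only `lamports` is set encodes to 9 bytes, matching the figure above.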

Snapshot accounts and newly created accounts should be provided fully.
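On the receiving side, a consumer would need to handle both full accounts and diffs. The following is a minimal sketch under assumed names (`DataUpdate`, `apply_update`, and a plain `HashMap` standing in for external storage). It covers only the data field and assumes patches arrive in order.

```rust
use std::collections::HashMap;

/// Either the complete data (snapshot or newly created account)
/// or a list of (offset, bytes) patches against the stored version.
/// (Hypothetical type, for illustration only.)
pub enum DataUpdate {
    Full(Vec<u8>),
    Patches(Vec<(usize, Vec<u8>)>),
}

/// Apply an update to the stored copy of an account's data.
/// Returns an error if a patch arrives for an account with no stored state.
pub fn apply_update(
    store: &mut HashMap<[u8; 32], Vec<u8>>,
    key: [u8; 32],
    update: DataUpdate,
) -> Result<(), String> {
    match update {
        DataUpdate::Full(data) => {
            store.insert(key, data);
            Ok(())
        }
        DataUpdate::Patches(patches) => {
            let data = store
                .get_mut(&key)
                .ok_or_else(|| "no previous state for this account".to_string())?;
            for (offset, bytes) in patches {
                let end = offset + bytes.len();
                // Grow the buffer if a patch writes past the current end.
                if end > data.len() {
                    data.resize(end, 0);
                }
                data[offset..end].copy_from_slice(&bytes);
            }
            Ok(())
        }
    }
}
```

The error branch shows why full state is still needed for snapshot and new accounts: a patch is meaningless without a base version to apply it to.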

The only problem I see with this approach is the additional overhead of calculating the diff, but it should be somewhat offset by the fact that plugin implementations will have less work to do.

I'm not an expert in Solana internals, so I might have missed some important details that would make such a solution impossible to implement. Any constructive criticism is more than welcome.

@bmuddha bmuddha added the community Community contribution label Apr 3, 2024
Contributor

github-actions bot commented Apr 3, 2024

This repository is no longer in use. Please re-open this issue in the agave repo: https://github.com/anza-xyz/agave

@github-actions github-actions bot closed this as completed Apr 3, 2024