Big plan for partial replication #14

Open
3 of 10 tasks
staltz opened this issue Jan 5, 2023 · 2 comments


@staltz
Member

staltz commented Jan 5, 2023

(Poor title for this issue, and I'm not sure which repo to put this in, but this repo seems to be the highest level)

Context

Problems

Consider what happens to messages in the log (avoid duplicates, avoid ssb-ebt replication bugs and incorrect state) when:

  1. Switching from partial replication to full replication
    • Due to hops change: e.g. when a peer used to be at hops 2 for you (and you had configured indexed feed replication for hops 2) but they change to hops 1 (which is configured to replicate the main feed) because you followed them
    • Due to subset replication: e.g. when you fetched and appended a metafeed/announce msg (from the main feed) via ssb-subset-rpc but after that replicated the full main feed for that peer
  2. Switching from full replication to partial replication
    • Due to upgrading to metafeeds: e.g. when a peer adopts a new version of the JS stack that has metafeeds, your peer will detect that and switch replication from full to partial (e.g. using indexed feeds)
  3. Indexed feeds plus full replication
    • If a pub wants to fully replicate the main feed AND replicate indexed-feed msgs, then ssb-ebt needs to be changed so that indexed replication DOESN'T do addOOO.
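
To make cases 1 and 2 concrete, here is a minimal sketch of how a scheduler could map hops to a replication mode and notice when a peer crosses from one mode to the other. All names here (`ReplicationMode`, `HopsConfig`, `modeForHops`, `classifyTransition`) are hypothetical and are not part of ssb-replication-scheduler or ssb-ebt. The thresholds (full at hops 1, partial at hops 2) are taken from the example above, and treating negative hops as "do not replicate" is an assumption.

```typescript
// Hypothetical sketch, not the ssb-replication-scheduler API.
type ReplicationMode = 'full' | 'partial' | 'none';

interface HopsConfig {
  fullUpToHops: number;    // replicate the main feed up to this distance
  partialUpToHops: number; // replicate only indexed feeds up to this distance
}

// Decide how a peer should be replicated, given its distance from us.
function modeForHops(hops: number, cfg: HopsConfig): ReplicationMode {
  if (hops < 0) return 'none'; // assumption: negative hops means "do not replicate"
  if (hops <= cfg.fullUpToHops) return 'full';
  if (hops <= cfg.partialUpToHops) return 'partial';
  return 'none';
}

type Transition = 'unchanged' | 'partial-to-full' | 'full-to-partial' | 'other';

// Name the two switches listed under "Problems" so they can be handled explicitly.
function classifyTransition(prev: ReplicationMode, next: ReplicationMode): Transition {
  if (prev === next) return 'unchanged';
  if (prev === 'partial' && next === 'full') return 'partial-to-full';
  if (prev === 'full' && next === 'partial') return 'full-to-partial';
  return 'other';
}

// Example: a peer at hops 2 becomes hops 1 because we followed them.
const hopsCfg: HopsConfig = { fullUpToHops: 1, partialUpToHops: 2 };
const modeBefore = modeForHops(2, hopsCfg);
const modeAfter = modeForHops(1, hopsCfg);
const hopsTransition = classifyTransition(modeBefore, modeAfter);
```

The point of classifying the transition is only to give each switch a single place where cleanup or state reset could happen; what that cleanup should be is exactly what this issue is trying to work out.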

cc @arj03 FYI

Tasks

@arj03
Member

arj03 commented Jan 5, 2023

I think these two points are performance improvements (given that we do the get check in ebt):

  • ssb-db2: New API to synchronously get the state (base.getLatest() is not that, it does a leveldb read)
  • ssb-ebt: Update indexed.js to refuse payloads in case state exists for that feed, and this must improve benchmarks

And therefore they might be done at the end, if really needed. Because if we start with relatively small indexes (about, contact), then the performance of the current approach might be ok. I think actually one of the biggest wins might be that we don't replicate and (try to) decrypt private messages.
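
For reference, a rough sketch of what those two points could look like together: an in-memory map that answers "what is the latest sequence for this feed?" synchronously, and a guard that refuses indexed payloads when state already exists for the main feed. This is one reading of the two tasks, not the ssb-db2 or ssb-ebt API; `LatestSequenceCache` and `shouldAcceptIndexedPayload` are hypothetical names, and the feed IDs are placeholders.

```typescript
// Hypothetical sketch, not the ssb-db2 or ssb-ebt API.
class LatestSequenceCache {
  private readonly latest = new Map<string, number>();

  // Call this whenever a message is appended to the log.
  record(feedId: string, sequence: number): void {
    const current = this.latest.get(feedId) ?? 0;
    if (sequence > current) this.latest.set(feedId, sequence);
  }

  // Synchronous lookup, no disk read involved.
  getSync(feedId: string): number | undefined {
    return this.latest.get(feedId);
  }
}

// Refuse indexed payloads when we already hold state for that main feed.
function shouldAcceptIndexedPayload(cache: LatestSequenceCache, mainFeedId: string): boolean {
  return cache.getSync(mainFeedId) === undefined;
}

const seqCache = new LatestSequenceCache();
seqCache.record('@alice', 42);
const acceptAlice = shouldAcceptIndexedPayload(seqCache, '@alice'); // state exists: refuse
const acceptBob = shouldAcceptIndexedPayload(seqCache, '@bob'); // no state: accept
```

Whether keeping such a map in memory is worth it depends on the benchmarks, which is why it may make sense to leave this for the end.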

And for the task "ssb-replication-scheduler: tests for all the use cases mentioned in 'Problems' above", there is a test in ssbc/ssb-ebt#72. It still might be good to test this here in some "overall" setting.

Another thing is that we might want to revive the benchmarks we did in the other grant to compare before and after partial replication.

@arj03
Member

arj03 commented Jan 5, 2023

Another thing, there is a task for:

ssb-replication-scheduler: delete the main feed when switching from full replication to metafeeds and indexed feeds replication

But there is no task for the other direction, going from partial (indexes) to full replication. The tests should show there is a problem in any case.
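
As an illustration of what such a test could assert, here is a small helper that scans a log for messages that were appended more than once, which is the "avoid duplicates" concern from "Problems". This is only a sketch: `LogEntry` and `findDuplicateKeys` are hypothetical, the message keys are placeholders, and a real test would read from the actual log rather than from an array.

```typescript
// Hypothetical sketch of a test helper; not part of any ssb module.
interface LogEntry {
  key: string;      // message key (placeholder values below)
  author: string;   // feed ID of the author
  sequence: number; // sequence number within the author's feed
}

// Return every message key that appears more than once in the log.
function findDuplicateKeys(log: LogEntry[]): string[] {
  const seen = new Set<string>();
  const dupes = new Set<string>();
  for (const entry of log) {
    if (seen.has(entry.key)) dupes.add(entry.key);
    else seen.add(entry.key);
  }
  return Array.from(dupes);
}

// Simulated log: one message arrived first via partial replication,
// then again when the full main feed was replicated.
const simulatedLog: LogEntry[] = [
  { key: '%announce', author: '@alice', sequence: 3 }, // fetched out of order
  { key: '%first', author: '@alice', sequence: 1 },
  { key: '%second', author: '@alice', sequence: 2 },
  { key: '%announce', author: '@alice', sequence: 3 }, // appended again
];
const duplicateKeys = findDuplicateKeys(simulatedLog);
```

A test for either switching direction could run the switch and then assert that this list is empty.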
