Parallel copy ability? #444

Open
mzealey opened this issue Oct 31, 2023 · 1 comment


mzealey commented Oct 31, 2023

From reading the code and running pglogical in production, it looks like the initial data sync only ever uses a single worker process:

copy_tables_data(char *sub_name, const char *origin_dsn,

Is it possible to make this run in parallel when syncing multiple tables, e.g. up to max_logical_replication_workers, as PostgreSQL logical replication does?

@ApproximateIdentity

I've run into this need as well and am experimenting with the idea of creating one replication set and one subscription per table (or possibly, say, 10 groups of tables). I've tested this, and the initial copies definitely do all run in parallel. I'm pretty new to pglogical, as well as to replication in Postgres in general, but I believe this approach is okay with two caveats: (1) the primary needs to be able to handle however many copies you run at once, and (2) once the initial copy is done, you end up with that many subscriptions still running afterwards. As I understand it, the table filtering is done against the WAL on the primary, so the total outgoing data should be basically the same in the end, just split into multiple streams. That part seems okay. My main concern is whether 10 workers on the primary, each filtering the WAL, will use too many resources. I expect my server can handle it, but that is the main thing I'm watching.

But be warned that I'm only experimenting with this. Also, in my specific case I only need to get the new replica up to date so that we can switch over afterwards (it's a version migration), which means I will be able to remove these subscriptions once that is done and they won't need to hang around forever. If you can't do that, this approach might not be quite as ideal.
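
For anyone who wants to try the per-group approach described above, here is a rough sketch of how the SQL could be generated. It is only an illustration, not something taken from the pglogical code base: the helper names (`group_tables`, `build_statements`), the `copy_group` prefix, and the round-robin grouping are all made up for this example. It assumes the provider and subscriber nodes have already been created with `pglogical.create_node`, and it relies on `pglogical.create_subscription` synchronizing data by default.

```python
from typing import Dict, List


def group_tables(tables: List[str], n_groups: int) -> List[List[str]]:
    """Split tables round-robin into at most n_groups non-empty groups."""
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    groups = [tables[i::n_groups] for i in range(n_groups)]
    return [g for g in groups if g]


def quote_literal(value: str) -> str:
    """Quote a string as a SQL literal (doubles embedded single quotes)."""
    return "'" + value.replace("'", "''") + "'"


def build_statements(
    tables: List[str],
    n_groups: int,
    provider_dsn: str,
    prefix: str = "copy_group",  # hypothetical naming scheme
) -> Dict[str, List[str]]:
    """Return SQL to run on the provider and on the subscriber.

    One replication set and one subscription are created per group, so
    each group's initial copy is handled by its own subscription.
    """
    provider: List[str] = []
    subscriber: List[str] = []
    for idx, group in enumerate(group_tables(tables, n_groups)):
        set_name = f"{prefix}_{idx}"
        # Provider side: a replication set holding this group's tables.
        provider.append(
            f"SELECT pglogical.create_replication_set({quote_literal(set_name)});"
        )
        for table in group:
            provider.append(
                "SELECT pglogical.replication_set_add_table("
                f"{quote_literal(set_name)}, {quote_literal(table)});"
            )
        # Subscriber side: one subscription bound to that single set.
        subscriber.append(
            "SELECT pglogical.create_subscription("
            f"subscription_name := {quote_literal('sub_' + set_name)}, "
            f"provider_dsn := {quote_literal(provider_dsn)}, "
            f"replication_sets := ARRAY[{quote_literal(set_name)}]);"
        )
    return {"provider": provider, "subscriber": subscriber}


if __name__ == "__main__":
    stmts = build_statements(
        ["public.orders", "public.items", "public.users"],
        n_groups=2,
        provider_dsn="host=primary dbname=app",  # placeholder DSN
    )
    for side, lines in stmts.items():
        print(f"-- run on {side}")
        print("\n".join(lines))
```

Round-robin grouping is the simplest choice; grouping by table size would probably balance the copies better. Each subscription also keeps its own replication slot on the primary after the copy finishes, which is the cost mentioned in caveat (2).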
