Use multithreaded zstd compression #8217
Comments
Multithreading is planned for after borg 2. The problem with approaches like the compressor implementing multithreading internally is that borg typically has a chunk size of 2MiB (and even that only applies if the file is larger than that, but a lot of files are smaller). This already relatively small chunk then has to be split into e.g. 4 pieces (making each piece even smaller), and 4 threads have to be started and terminated. So there is a lot of overhead and only a little data to compress per thread. Considering that compression is only needed for new data and a lot of data usually doesn't change (and thus doesn't need compression), it is usually not a big concern from the 2nd backup onward. Exceptions are users with a ton of new data each day, and first-time backups. A better approach is to have borg itself implement multithreading (and pipelining), so that the chunks don't need to be split further into smaller pieces. But that is a very big change, and other big changes (see master branch) have higher priority.
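To make the "parallelize across chunks instead of inside a chunk" idea concrete, here is a minimal sketch. It is not borg code: the fixed-size splitter is a hypothetical stand-in for borg's chunker, `zlib` from the standard library stands in for zstd, and the names `split_into_chunks` and `compress_chunks` are made up for illustration. Each task compresses one whole chunk, so no chunk is cut into smaller pieces.

```python
import os
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 2 * 1024 * 1024  # the typical 2MiB chunk size mentioned above


def split_into_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Fixed-size split; only a stand-in for a real chunker (assumption)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def compress_chunks(chunks: list[bytes], workers: int = 4) -> list[bytes]:
    """Compress whole chunks in parallel, one chunk per task.

    zlib is used here only because it ships with Python; it is a
    stand-in for zstd. It releases the GIL while compressing large
    buffers, so the worker threads can run on separate cores.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with chunks.
        return list(pool.map(lambda c: zlib.compress(c, 6), chunks))


if __name__ == "__main__":
    data = os.urandom(1024) * (8 * 1024)  # 8MiB of repetitive sample data
    chunks = split_into_chunks(data)
    compressed = compress_chunks(chunks)
    print(len(chunks), "chunks,", sum(map(len, compressed)), "bytes compressed")
```

The thread pool is created once and reused for all chunks, which avoids the per-chunk thread start and stop cost described above.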
Thanks. Those are mainly talking about multithreading of Borg itself, while I was only considering enabling the multithread flag of the zstd library used. This should not need many changes, as it is more or less a variable to set. Edit: Didn't see your first message. Full multithreading in borg could definitely improve things, but it is a much larger task, so it's understandably not a priority for Borg 1.x.
@Forza-tng I somehow suspect (for the reasons I stated in my post above) that just giving that flag to zstd isn't going to improve things much. Can you do a practical experiment, implementing that change and comparing performance? Also, we don't use pyzstd; we call the libzstd API directly via Cython, see compress.pyx.
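A rough way to get a first number before touching compress.pyx might look like the sketch below. It uses pyzstd (the binding linked in the issue, not what borg uses) purely as a convenient way to set zstd's `nbWorkers` parameter from Python; the helper name `time_compress` and the 2MiB sample size are assumptions for illustration. Results from this harness would only hint at what a change in borg's Cython code could achieve.

```python
import time

try:
    import pyzstd  # third-party binding; NOT what borg uses internally
except ImportError:
    pyzstd = None

CHUNK_SIZE = 2 * 1024 * 1024  # assumed typical chunk size


def time_compress(chunk: bytes, level: int, workers: int, repeat: int = 5):
    """Return the best-of-`repeat` time in seconds for compressing one chunk.

    Returns None when pyzstd is not installed.
    """
    if pyzstd is None:
        return None
    option = {
        pyzstd.CParameter.compressionLevel: level,
        pyzstd.CParameter.nbWorkers: workers,  # 0 means single-threaded mode
    }
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        pyzstd.compress(chunk, option)
        best = min(best, time.perf_counter() - start)
    return best
```

Calling `time_compress(chunk, 10, 0)` and `time_compress(chunk, 10, 4)` on the same chunk-sized buffer of realistic data gives the two numbers to compare.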
Have you checked borgbackup docs, FAQ, and open GitHub issues?
YES
Is this a BUG / ISSUE report or a QUESTION?
ISSUE / QUESTION
System information. For client/server mode post info for both machines.
Your borg version (borg -V).
Borg 1.2.8
Operating system (distribution) and version.
Gentoo Linux
Hardware / network configuration, and filesystems used.
amd64, btrfs, fiber 1gbit/s. Remote storage via ssh.
How much data is handled by borg?
~1TB
Full borg commandline that led to the problem (leave away excludes and passwords)
borg create --compression zstd,10
Describe the problem you're observing.
When using too high a compression level, the Borg process gets pinned at 100% CPU on one core.
zstd supports multithreading which can greatly improve its performance.
zstd -T1
shows roughly the same bandwidth I see with Borg, which leads me to believe that the MT option is not enabled. https://pyzstd.readthedocs.io/en/latest/#mt-compression
Another improvement, if not already used, is to use the
--long
option. It allows zstd to use a bigger window for higher gains.
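For reference, the library-level counterpart of `--long` can be sketched with pyzstd as below. This is an illustration only, not borg's code path: the helper name `compress_long` is made up, and the default `window_log=27` mirrors the zstd CLI default for `--long`. Since each chunk is compressed on its own, a window larger than the chunk has no additional data to reference, so any gain on chunk-sized inputs would need to be measured.

```python
try:
    import pyzstd  # third-party binding; NOT what borg uses internally
except ImportError:
    pyzstd = None


def compress_long(data: bytes, level: int = 10, window_log: int = 27):
    """Compress with long-distance matching enabled.

    Returns None when pyzstd is not installed.
    """
    if pyzstd is None:
        return None
    option = {
        pyzstd.CParameter.compressionLevel: level,
        pyzstd.CParameter.enableLongDistanceMatching: 1,
        pyzstd.CParameter.windowLog: window_log,  # window of 2**window_log bytes
    }
    return pyzstd.compress(data, option)
```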