rsync is very slow to obtain the file list of an sshfs-mounted path, while ls and find are fast (and even speed up a subsequent rsync run) #634

Open
mgutt opened this issue Aug 22, 2024 · 0 comments

Steps to reproduce
I have a $LOCAL_DIR that contains ~12k files, and a $MNT_DIR, which is an SFTP server mounted through sshfs that already contains the same files:

sshfs $USER@$HOST:/ "$MNT_DIR" -o IdentityFile=$idfile

Now I execute rsync as follows:

time (
  rsync -v --recursive --times --itemize-changes --exclude '*~' $LOCAL_DIR "$MNT_DIR"
)

Result:

sending incremental file list

sent 237,213 bytes  received 24 bytes  2,711.28 bytes/sec
total size is 8,576,148,859  speedup is 36,150.13

real    1m27.419s
user    0m0.159s
sys     0m0.760s

That is around 90 seconds, which is extremely slow given that most of the files are inside a single directory.
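
To put the number in perspective, here is a rough back-of-the-envelope sketch. It assumes exactly 12,000 files (the report only says ~12k), and the helper names `per_file_ms` and `elapsed_from_rate` are made up for illustration. It shows the average time spent per file and cross-checks the wall-clock time against the byte rate that rsync printed.

```shell
#!/bin/sh
# Hypothetical helper: average milliseconds per file for a run of SECONDS over COUNT files.
per_file_ms() {
  awk -v s="$1" -v n="$2" 'BEGIN { printf "%.2f\n", s / n * 1000 }'
}

# Hypothetical helper: elapsed seconds implied by rsync's summary line
# (bytes sent + bytes received, divided by the reported bytes/sec).
elapsed_from_rate() {
  awk -v sent="$1" -v recv="$2" -v rate="$3" 'BEGIN { printf "%.1f\n", (sent + recv) / rate }'
}

per_file_ms 87.419 12000             # prints 7.28
elapsed_from_rate 237213 24 2711.28  # prints 87.5
```

Roughly 7 ms per file is in the range of one network round trip per file, and the byte rate reported by rsync is consistent with the measured wall-clock time.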

Then, by accident, I ran this:

time (
  ls "$MNT_DIR/2024/KW32" >/dev/null # contains most of the files
  rsync -v --recursive --times --itemize-changes --exclude '*~' $LOCAL_DIR "$MNT_DIR"
)

Result:

sending incremental file list

sent 237,213 bytes  received 24 bytes  158,158.00 bytes/sec
total size is 8,576,148,859  speedup is 36,150.13

real    0m1.814s
user    0m0.119s
sys     0m0.241s

So running ls, or even find, beforehand massively speeds up rsync's retrieval of the file list.
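
As a stopgap, the accidental discovery can be turned into a deliberate step. This is a minimal sketch, assuming that enumerating the mount with find is what makes the following rsync run fast, as observed above. `warm_dir_cache` is a hypothetical helper name, not something rsync or sshfs provides.

```shell
#!/bin/sh
# Hypothetical helper: walk a directory tree once so that every directory is read.
# Assumption: on an sshfs mount this fills whatever cache the later rsync run benefits from.
# Prints the number of entries visited (the root itself included).
warm_dir_cache() {
  # find has to read each directory below "$1" to enumerate it; the names are only counted.
  echo $(( $(find "$1" | wc -l) ))
}
```

It could be used as `warm_dir_cache "$MNT_DIR" >/dev/null` immediately before the rsync command. This only works around the symptom and does not explain why rsync alone is slow.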

The result with --debug=all (without the ls command):

...
recv_files(.)
recv_files(2024) # it showed this line around 90 seconds before it printed the next line
recv_generator(2024/KW32,7)
recv_generator(2024/KW32/filename.pdf,8)
2024/KW32/filename.pdf is uptodate
recv_generator(2024/KW32/filename.png,9)
2024/KW32/filename.png is uptodate
...

Conclusion
It seems that ls and find obtain the file list of an sshfs-mounted path much more efficiently than rsync, and that they also store this information in an sshfs cache or something similar. rsync is able to use this cache, but it is not able to fill it: even two rsync executions in a row do not speed up the second one.
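
One way to test the cache hypothesis without involving rsync is to compare two access patterns on the mount: listing a directory versus checking each file by name without listing. The sketch below is an assumption-laden diagnostic, not a description of what rsync does internally. `stat_by_name` is a hypothetical helper that takes the file names from the local copy and tests for the same names on the other side.

```shell
#!/bin/sh
# Hypothetical helper: for every non-hidden entry in SRC, check whether an entry with
# the same name exists in DST. DST itself is never listed, so each check is a
# separate lookup by name. Prints how many names were found in DST.
stat_by_name() {
  src=$1
  dst=$2
  n=0
  for f in "$src"/*; do
    [ -e "$f" ] || continue          # skip the unexpanded pattern if SRC is empty
    if [ -e "$dst/${f##*/}" ]; then  # existence test by name on the DST side
      n=$((n + 1))
    fi
  done
  echo "$n"
}
```

On a freshly mounted $MNT_DIR, timing `stat_by_name` with the local and the mounted copy of 2024/KW32 as arguments (the exact paths depend on the local layout) and comparing it with the time of `ls "$MNT_DIR/2024/KW32" >/dev/null` would show whether per-name lookups alone reproduce the 90-second delay.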

As ls and find are both fast at obtaining the file list, I assume this is a bug in rsync.

Versions

$ uname -a
Linux servername 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:14:24 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

$ cat /etc/*-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04.4 LTS"

$ sshfs -V
SSHFS version 3.7.1
FUSE library version 3.10.5
using FUSE kernel interface version 7.31
fusermount3 version: 3.10.5

$ rsync -V
rsync  version 3.2.7  protocol version 31

$ find --version
find (GNU findutils) 4.8.0