I'm trying to transfer files between 1TB and 6TB in size across a link with high speed, but also high latency due to the distance between the physical servers. Once or twice a week there's a network issue and the connection is lost. Restarting the transfer takes a VERY long time -- one transfer of 3.3TB failed at the 550GB mark and is taking more than an hour to resume.
My best guess is that, with the maximum checksum size of 128k and a file size of 550GB, the back-and-forth chit-chat of checksums is the cause of the slowness. I'm presuming that dramatically increasing the checksum size (to somewhere in the 1MB-1GB range) would improve this, even at a slightly higher cost in re-transmitting some of the data.
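For a sense of scale, here is a rough back-of-the-envelope sketch in Python. The ~20 bytes per block (4-byte rolling checksum plus a 16-byte strong checksum) is an assumption for illustration; the real per-block overhead depends on protocol version and options, but the order of magnitude is the point:

```python
import math

# Back-of-the-envelope estimate of the resume cost at a 128 KiB checksum size.
# Assumption for illustration: ~20 bytes per block (4-byte rolling checksum
# plus a 16-byte strong checksum); real overhead varies with protocol/options.
FILE_OFFSET = 550 * 10**9        # 550 GB already on the receiver
BLOCK_SIZE = 128 * 1024          # current 128 KiB maximum
PER_BLOCK_BYTES = 4 + 16

blocks = math.ceil(FILE_OFFSET / BLOCK_SIZE)
print(f"{blocks:,} checksums to compute and exchange")        # ~4.2 million
print(f"~{blocks * PER_BLOCK_BYTES / 1e6:.0f} MB of checksum traffic")
```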
BONUS POINTS:
I have two ideas for improving this:
1. By default, select a checksum size equal to 1% of the file size (rounded up or down as necessary), so that no more than 100 checksums would be exchanged; or
2. A dynamic checksum size that starts at 1MB, doubles with each successful checksum (up to, say, 1GB), and is halved with each failure. This would dramatically reduce the number of checksums exchanged between client and server (a rough sketch of both ideas follows below).
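To make the two proposals concrete, here is a minimal sketch of just the size-selection logic, written in Python rather than rsync's C and making no claim about how it would hook into the actual protocol; the 1%, 1MB, 1GB, doubling and halving figures come straight from the ideas above:

```python
import math

# Hypothetical sketches of the two checksum-size heuristics proposed above.
# Illustration only -- not rsync code; all constants come from the proposal.

def one_percent_block_size(file_size: int) -> int:
    """Idea 1: checksum size ~= 1% of the file, so at most ~100 checksums."""
    return math.ceil(file_size / 100)

class DynamicBlockSize:
    """Idea 2: start at 1MB, double after each successful checksum,
    halve after each failure, clamped to the 1MB-1GB range."""
    MIN = 1 * 1024**2    # 1 MB
    MAX = 1 * 1024**3    # 1 GB

    def __init__(self) -> None:
        self.size = self.MIN

    def on_success(self) -> None:
        self.size = min(self.MAX, self.size * 2)

    def on_failure(self) -> None:
        self.size = max(self.MIN, self.size // 2)

# For the 3.3TB transfer above, idea 1 gives ~33GB blocks, i.e. ~100 checksums
# instead of the ~25 million that 128 KiB blocks would need for the whole file.
print(one_percent_block_size(int(3.3e12)))
```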