FTPS upload of large file (800 GB) using TLS 1.3 gets slower and slower after ~4.5h and 360 GB #13097

YvesFoltys · 2024-03-11T13:46:21Z

I did this

Uploaded a file of >800 GB to a server

# ls -l testfile
-rw-r-----    1 root     swsupt   887270539264 Nov 19 07:50 testfile

# curl -T testfile --netrc --insecure --ssl-reqd ftp://1.2.3.4//targetdir/testfile

The upload started as expected, but after ~4.5 h and ~360 GB the transfer rate drops from ~23 MB/s to 500 KB/s and keeps getting slower (~200 KB/s after 5 h).

With the TLS version limited to 1.2, the upload completes in ~10 h at a constant 23 MB/s:
# curl --tlsv1.2 --tls-max 1.2 -T testfile --netrc --insecure --ssl-reqd ftp://1.2.3.4//targetdir/testfile

Using the plain ftps client, the upload also completes in ~10 h at a constant 23 MB/s:
ftp -s 1.2.3.4

The server runs on IBM i 7.3.0 410. According to IBM i support, this issue may be caused by the Curl FTP client not responding to TLSv1.3 rekey requests.
Looking through the known issues, it may also be related to https://curl.se/docs/knownbugs.html#FTPS_upload_data_loss_with_TLS_1
I can't say what happens if the slow upload completes since it would take ages to let it run till completion.

I expected the following

The upload should complete without loss in speed using TLS 1.3

curl/libcurl version

curl -V

curl 8.5.0 (powerpc-ibm-aix7.1.5.0) libcurl/8.5.0 OpenSSL/1.1.1v zlib/1.2.13 libssh2/1.10.0 nghttp2/1.58.0 OpenLDAP/2.5.16
Release-Date: 2023-12-06
Protocols: dict file ftp ftps gopher gophers http https imap imaps ldap ldaps mqtt pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: alt-svc AsynchDNS GSS-API HSTS HTTP2 HTTPS-proxy IPv6 Kerberos Largefile libz NTLM SPNEGO SSL threadsafe UnixSockets

operating system

AIX 7200-05-07-2346


icing · 2024-03-11T14:18:27Z

Do you have a server we can test this against?

bagder · 2024-03-11T14:27:23Z

this issue may be caused by the Curl FTP client not responding to TLSv1.3 rekey requests

Unlikely. Such a connection would be closed. But also, why would it not handle rekeying? If that is even used here.

Also: is there anything that argues against this instead being an issue in the server end?

YvesFoltys · 2024-03-11T15:05:57Z

Do you have a server we can test this against?

Unfortunately I don't have a public server

Also: is there anything that argues against this instead being an issue in the server end?

I have to admit that "ftp -s" working was proof to me that the server side works. I just started a new session and captured an iptrace, only to find that "ftp -s" uses TLS v1.2, which also works for curl...
I will try to find another way to test TLS v1.3 without curl being involved, to check whether that works.

YvesFoltys · 2024-03-11T15:27:07Z

Is this strictly internal / on a LAN? Can you be 100% confident that it's not the hosting company / ISP / bandwidth provider bottlenecking for fear of a DDoS attack or something?

Yes, this is internal only. And yes, since we manage the infrastructure as well and the same file between the same systems works with TLS v1.2, I'm certain that no switch, router or firewall causes the problems

YvesFoltys · 2024-03-27T08:20:12Z

I ran several tests from different systems targeting the same server:

OS       | Tool        | TLS | result
---------------------------------------------------------------------
AIX      | curl 7.61.1 | 1.3 | slows down
AIX      | curl 7.61.1 | 1.2 | works
AIX      | curl 8.6.0  | 1.3 | slows down
AIX      | ftps        | 1.2 | works
RHEL 8.9 | curl 7.61.1 | 1.3 | slows down
RHEL 8.9 | lftp        | 1.3 | works
RHEL 9.3 | curl 7.76.1 | 1.3 | aborts with "Connection reset by peer, errno 104"
RHEL 9.3 | curl 8.6.0  | 1.3 | slows down

So as far as I can tell, uploading via FTPS with TLS 1.3 works fine in general (lftp completes). Therefore I would rule out a server-side issue.
I found the first result on RHEL 9.3 (Connection reset by peer) interesting. I tested the upload 3 times and the error always occurred at 362 GB after ~270 min (+/- 2 min). That is pretty much the point at which the other curl uploads using TLS 1.3 started to slow down. I don't know why that error occurred, but merely updating curl changed the result back to the upload slowing down.
It seems to me that the same issue occurred but was handled differently by the different versions.
Hopefully that helps in pinning down the problem. I'm currently out of ideas about what else to test.

BrianInglis · 2024-03-30T19:06:06Z

Could you run the test the other way round - to any of the other systems?

YvesFoltys · 2024-04-04T13:08:32Z

Unfortunately I don't have the possibility to run it the other way around. But I tested again with AIX and curl 8.6 as client against another server running a more recent IBM i version (7.5 vs. 7.3). Although the upload speed overall was better due to other infrastructure between the systems, the transfer rate again dropped after 362 GB:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 44  826G    0     0   44  364G      0   109M  2:08:50  0:56:45  1:12:05 2386k

So this seems to be related to the amount of data rather than the time.
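
As a rough cross-check of that conclusion, the figures reported in this thread can be converted to elapsed time and uploaded bytes. This is a minimal sketch, assuming curl's progress meter uses 1024-based units (so "G" is GiB and "M" is MiB) and reading the ~23 MB/s of the first run the same way; the function names are illustrative only.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def minutes_to_reach(total_gib, rate_mib_per_s):
    """Minutes needed to upload total_gib at a constant rate."""
    return total_gib * GIB / (rate_mib_per_s * MIB) / 60


def gib_uploaded(avg_rate_mib_per_s, elapsed_s):
    """GiB uploaded after elapsed_s seconds at the given average rate."""
    return avg_rate_mib_per_s * MIB * elapsed_s / GIB


# First report: ~23 MB/s until roughly 362 GB had been sent.
first_run_minutes = minutes_to_reach(362, 23)

# Run against IBM i 7.5: 109M average upload speed after 0:56:45.
second_run_gib = gib_uploaded(109, 56 * 60 + 45)

print(f"first run reaches 362 GiB after ~{first_run_minutes:.0f} min")
print(f"second run had uploaded ~{second_run_gib:.0f} GiB at 0:56:45")
```

Under these assumptions the first run needs roughly 269 minutes to reach 362 GiB, in line with the ~270 min reported earlier, while the faster run gets to about the same amount (close to the 364G shown in the progress line) in under an hour. Both numbers are rounded from the progress meter, so they are only approximate.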

bagder · 2024-04-12T08:12:58Z

Obviously there is no code or condition anywhere in curl that does anything different after some specific amount of bytes transferred or time spent.

My wild guess is that something is done on the TLS layer after some specific time and that triggers a different code path or something in OpenSSL that makes it run slower.

It would be interesting to know if curl built with another TLS library or even a current OpenSSL version would behave differently.

BrianInglis · 2024-04-13T07:13:27Z

Could this be due to TLS session key renegotiation after some byte or time limit?
This could depend on the underlying stacks.
Are both ends running the same TLS/SSL stacks, what versions are those stacks, and what are those TLS session key renegotiation byte or time limits?
It may be time to look at a run with -v, --verbose or some --trace... option(s) that do not log 400GB+ of data!
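
For reference on such byte limits: RFC 8446 (TLS 1.3), section 5.5, says that with AES-GCM up to 2^24.5 full-size records may be encrypted under one set of keys, and a record carries at most 2^14 bytes of plaintext. The sketch below only works out what that amounts to. It rests on assumptions this thread does not confirm: that the connection negotiated an AES-GCM cipher suite, that records are sent full-size, and that either TLS stack acts on this limit. It is arithmetic, not a diagnosis.

```python
# RFC 8446 section 5.5: about 2**24.5 full-size records for AES-GCM.
AES_GCM_RECORD_LIMIT = 2 ** 24.5

# RFC 8446 section 5.1: at most 2**14 bytes of plaintext per record.
MAX_PLAINTEXT_PER_RECORD = 2 ** 14


def bytes_under_one_key(records=AES_GCM_RECORD_LIMIT,
                        record_size=MAX_PLAINTEXT_PER_RECORD):
    """Plaintext bytes that fit under one key at the given record limit."""
    return records * record_size


limit_bytes = bytes_under_one_key()
limit_gib = limit_bytes / 1024 ** 3

print(f"AES-GCM limit with full-size records: ~{limit_gib:.1f} GiB")
```

This comes to roughly 362 GiB, which is in the same range as the ~362 GB at which the uploads above slowed down or were reset. That may be a coincidence, but it suggests the negotiated cipher suite and any KeyUpdate messages would be worth looking for in a verbose log.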

bagder added TLS FTP performance labels Mar 11, 2024
