Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

-T - always uploads with "Transfer-Encoding: chunked" #12171

Open
emanuele6 opened this issue Oct 20, 2023 · 3 comments · May be fixed by #12178
Open

-T - always uploads with "Transfer-Encoding: chunked" #12171

emanuele6 opened this issue Oct 20, 2023 · 3 comments · May be fixed by #12178

Comments

@emanuele6
Copy link
Contributor

emanuele6 commented Oct 20, 2023

I did this

curl -T - localhost:40000 < ~/.bashrc
$ nc -l 40000
PUT / HTTP/1.1
Host: localhost:40000
User-Agent: curl/8.4.0
Accept: */*
Transfer-Encoding: chunked
Expect: 100-continue

30e
[...]

I expected the following

$ nc -l 40000
PUT / HTTP/1.1
Host: localhost:40000
User-Agent: curl/8.4.0
Accept: */*
Content-Length: 782

[...]

curl should realise that stdin is a regular file, and that it can do a non-chunked upload, like it would do if you used -T ~/.bashrc.
curl already does that when you use @- or <- with -F/-d and similar, but not when you use -T. (EDIT: Actually, that is not true; I was misremembering; curl only knows the content length for stdin with -F/-d because it always loads the entire file into memory before sending it)

The problem is that curl -T never stats stdin to figure if it is a regular file when you use - as argument.
On Linux and BSDs, a possible workround is -T /dev/stdin (with --request-target /); since a path was provided, curl will stat it and realise it is a regular file, but it would be nicer if curl would just do that on its own.

I wanted to create a patch to fix this bug when I first learned about it, but the code that deals with stat-ing the file for upload (in src/tool_operate.c) is doing something special on #define __VMS systems to figure out the real upload size of regular files that involes opening the file, and reading all its data to count bytes. I didn't know why it did that, or how to do that for stdin if the system is VMS (maybe using lseek to go to the start of stdin), and anyway I would not be able to test the code I write for VMS on VMS, so I gave up trying to write the patch myself to avoid accidentally breaking the VMS port.

I remembered about this bug today, and noticed no one created an issue for it.

Ref: #11396 (comment)

curl/libcurl version

curl 8.4.0

curl 8.4.0 (x86_64-pc-linux-gnu) libcurl/8.4.0 OpenSSL/3.1.3 zlib/1.3 brotli/1.1.0 zstd/1.5.5 libidn2/2.3.4 libpsl/0.21.2 (+libidn2/2.3.4) libssh2/1.11.0 nghttp2/1.57.0
Release-Date: 2023-10-11
Protocols: dict file ftp ftps gopher gophers http https imap imaps mqtt pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: alt-svc AsynchDNS brotli GSS-API HSTS HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile libz NTLM PSL SPNEGO SSL threadsafe TLS-SRP UnixSockets zst

operating system

Arch Linux

Linux t420 6.1.57-1-lts #1 SMP PREEMPT_DYNAMIC Wed, 11 Oct 2023 05:10:50 +0000 x86_64 GNU/Linux
emanuele6 added a commit to emanuele6/curl that referenced this issue Oct 21, 2023
curl will now also compute the content-length of the transfer if stdin
is the file to upload and stdin is a regular file, using its file size.

Since, while being a regular file, stdin could not have its offset at
the start of the file, curl will now also get the current offset into
the upload file's file descriptor and use (filesize - offset) as
content-length for transfer instead of just using the full filesize.
This also fixes a bug on BSDs where open("/dev/fd/N") behaves like
dup(N), so, if N is a file descriptor to a regular file, the file offset
of the file descriptor returned by open() may not have been at the start
of the file despite curl's previous assumption.

Since I don't know anything about VMS systems, I left the behaviour for
VMS unchanged; on VMS, curl will still perform a chunked transfer if the
upload file is stdin.

Fixes curl#12171
Fixes curl#12177
emanuele6 added a commit to emanuele6/curl that referenced this issue Oct 21, 2023
curl will now also compute the content-length of the transfer if stdin
is the file to upload and stdin is a regular file, using its file size.

Since, while being a regular file, stdin could not have its offset at
the start of the file, curl will now also get the current offset into
the upload file's file descriptor and use (filesize - offset) as
content-length for transfer instead of just using the full filesize.
This also fixes a bug on BSDs where open("/dev/fd/N") behaves like
dup(N), so, if N is a file descriptor to a regular file, the file offset
of the file descriptor returned by open() may not have been at the start
of the file despite curl's previous assumption.

Since I don't know anything about VMS systems, I left the behaviour for
VMS unchanged; on VMS, curl will still perform a chunked transfer if the
upload file is stdin.

Fixes curl#12171
Fixes curl#12177
@bagder
Copy link
Member

bagder commented Oct 21, 2023

I'm not following this.

curl uses chunked for -T from stdin because it does not know the size beforehand. How is that a bug?

@emanuele6
Copy link
Contributor Author

emanuele6 commented Oct 21, 2023

It does not know its size only because it is not checking it, and there is no reason why it couldn't check it if stdin is a regular file.

If you run for example curl -T - localhost:40000 < ~/.bashrc, stdin is a file descriptor to a regular file, ~/.bashrc, so it could get its size using fstat() as it already does after open()ing a file when you do curl -T /path/to/file, and do an unchunked transfer, but it doesn't check and use the file size as content-length when you do -T -.

@bagder
Copy link
Member

bagder commented Oct 21, 2023

stdin is a file descriptor to a regular file

Wow, I had no idea it works like this...! 😮

emanuele6 added a commit to emanuele6/curl that referenced this issue Oct 24, 2023
curl will now also compute the content-length of the transfer if stdin
is the file to upload and stdin is a regular file, using its file size.

Since, while being a regular file, stdin could not have its offset at
the start of the file, curl will now also get the current offset into
the upload file's file descriptor and use (filesize - offset) as
content-length for transfer instead of just using the full filesize.
This also fixes a bug on BSDs where open("/dev/fd/N") behaves like
dup(N), so, if N is a file descriptor to a regular file, the file offset
of the file descriptor returned by open() may not have been at the start
of the file despite curl's previous assumption.

Since I don't know anything about VMS systems, I left the behaviour for
VMS unchanged; on VMS, curl will still perform a chunked transfer if the
upload file is stdin.

I updated tests that were testing chunked transfers for stdin to use
<stdin pipe="yes"> since now curl only performs a chunked transfer by
default if the upload file (stdin or not) is not a regular file.

Fixes curl#12171
Fixes curl#12177
emanuele6 added a commit to emanuele6/curl that referenced this issue Oct 24, 2023
curl will now also compute the content-length of the transfer if stdin
is the file to upload and stdin is a regular file, using its file size.

Since, while being a regular file, stdin could not have its offset at
the start of the file, curl will now also get the current offset into
the upload file's file descriptor and use (filesize - offset) as
content-length for transfer instead of just using the full filesize.
This also fixes a bug on BSDs where open("/dev/fd/N") behaves like
dup(N), so, if N is a file descriptor to a regular file, the file offset
of the file descriptor returned by open() may not have been at the start
of the file despite curl's previous assumption.

Since I don't know anything about VMS systems, I left the behaviour for
VMS unchanged; on VMS, curl will still perform a chunked transfer if the
upload file is stdin.

I updated tests that were testing chunked transfers for stdin to use
<stdin pipe="yes"> since now curl only performs a chunked transfer by
default if the upload file (stdin or not) is not a regular file.

Fixes curl#12171
Fixes curl#12177
emanuele6 added a commit to emanuele6/curl that referenced this issue Oct 24, 2023
curl will now also compute the content-length of the transfer if stdin
is the file to upload and stdin is a regular file, using its file size.

Since, while being a regular file, stdin could not have its offset at
the start of the file, curl will now also get the current offset into
the upload file's file descriptor and use (filesize - offset) as
content-length for transfer instead of just using the full filesize.
This also fixes a bug on BSDs where open("/dev/fd/N") behaves like
dup(N), so, if N is a file descriptor to a regular file, the file offset
of the file descriptor returned by open() may not have been at the start
of the file despite curl's previous assumption.

Since I don't know anything about VMS systems, I left the behaviour for
VMS unchanged; on VMS, curl will still perform a chunked transfer if the
upload file is stdin.

I updated tests that were testing chunked transfers for stdin to use
<stdin pipe="yes"> since now curl only performs a chunked transfer by
default if the upload file (stdin or not) is not a regular file.

Fixes curl#12171
Fixes curl#12177
emanuele6 added a commit to emanuele6/curl that referenced this issue Oct 24, 2023
curl will now also compute the content-length of the transfer if stdin
is the file to upload and stdin is a regular file, using its file size.

Since, while being a regular file, stdin could not have its offset at
the start of the file, curl will now also get the current offset into
the upload file's file descriptor and use (filesize - offset) as
content-length for transfer instead of just using the full filesize.
This also fixes a bug on BSDs where open("/dev/fd/N") behaves like
dup(N), so, if N is a file descriptor to a regular file, the file offset
of the file descriptor returned by open() may not have been at the start
of the file despite curl's previous assumption.

Since I don't know anything about VMS systems, I left the behaviour for
VMS unchanged; on VMS, curl will still perform a chunked transfer if the
upload file is stdin.

I updated tests that were testing chunked transfers for stdin to use
<stdin pipe="yes"> since now curl only performs a chunked transfer by
default if the upload file (stdin or not) is not a regular file.
And I added a new test (258) that verifies that stdin is uploaded with
content-length (not transfer-encoding:chunked) if it is a regular file.

Fixes curl#12171
Fixes curl#12177
emanuele6 added a commit to emanuele6/curl that referenced this issue Oct 24, 2023
curl will now also compute the content-length of the transfer if stdin
is the file to upload and stdin is a regular file, using its file size.

Since, while being a regular file, stdin could not have its offset at
the start of the file, curl will now also get the current offset into
the upload file's file descriptor and use (filesize - offset) as
content-length for transfer instead of just using the full filesize.
This also fixes a bug on BSDs where open("/dev/fd/N") behaves like
dup(N), so, if N is a file descriptor to a regular file, the file offset
of the file descriptor returned by open() may not have been at the start
of the file despite curl's previous assumption.

Since I don't know anything about VMS systems, I left the behaviour for
VMS unchanged; on VMS, curl will still perform a chunked transfer if the
upload file is stdin.

I updated tests that were testing chunked transfers for stdin to use
<stdin pipe="yes"> since now curl only performs a chunked transfer by
default if the upload file (stdin or not) is not a regular file.
And I added a new test (268) that verifies that stdin is uploaded with
content-length (not transfer-encoding:chunked) if it is a regular file.

Fixes curl#12171
Fixes curl#12177
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

2 participants