-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Incorrect (too long) header from server causes Twisted Agent to fail #8570
Comments
The problem with accepting arbitrarily long headers is that you can still DoS a client by having it continue to read from a socket forever, never processing the response, never timing out. This makes me inclined to "wontfix" this issue, but... I'm not entirely sure how other HTTP clients handle this. What do browsers do? cURL? wget? If other clients typically ignore headers over their max length like this then let's make the change. |
curl handles this without trimming headers (but my fish shell crashes immediately after request so I'm not entirely sure curl is not leaking anything, maybe it causes some segfaults). Google-chrome reads full headers no problem. wget refuses to read headers and retries printing following error:
I think wget approach is best here in that it prints instructive error about impossibility of handling too long headers. I'm not entirely sure it makes sense to retry though. wget probably assumes that those headers are a result of misconfigured server, this assumption might not be true. |
I made some comments regarding this on a review for #9295, Firefox and Safari on iOS also handle the headers just fine. While I agree that allowing arbitrarily long headers is a dangerous, and am not advocating for that by default, it does seem like better communication is preferable to a won't fix. Rationale: The decision to disallow headers after a certain length is a policy. Whether this policy is good or bad (fwiw I'd say it is good as it defaults to safe) is less important than the fact that it is a Twisted specific opinion. With any policy decision there are at least two options:
To that end, I'd propose one or both of two paths:
That said, #2 has implications outside the scope of this ticket and deserves its own ticket/design discussion, so I'd vote for option 1 to address this ticket. Behavior doesn't change, but at least a user knows why. Thoughts? -- Note that I'm willing to put the work in for option 1, I'm not just throwing it out there, but I'd prefer to know there was some agreement with the work. |
I am happy to review option #11338. I remember that I look at passing a message via the exception handling, and it was not straightforward with the LineReceiver will just close the connection without any message and the loseConnection does not accept an argument https://github.com/twisted/twisted/blob/twisted-17.9.0/src/twisted/protocols/basic.py#L638 One way in which I was trying to fix it is to pass the reference to protocol to the HTTPClientParse so that the parser will set a I saw this pattern in some other places in Twisted, but I would have preferred to be able to transmit the reason just via an exception For option #2. If you want the IAgent way, I guess that for headers lenght you will have to go to Agent -> HTTPConnectonPool -> HTTPClientFactory -> HTTP11ClientProtocol -> HTTPClientParser But just to get headers length, maybe is easier to pass the options via the request object https://github.com/twisted/twisted/blob/twisted-17.9.0/src/twisted/web/_newclient.py#L1430 If you have a real need for any of the above options, I am happy to review the work. Otherwise I also think that this should be closed as won't fix. If in the future someone while have a real need for this, a new ticket and PR can be created and we can review that work. |
Just an update to this one. With the recent Twisted version, the error is now:
Which is expected... when the line is too long, for now Twisted just silently closed the connection. I think is best to keep this open and close it once we can have something like
|
@adiroiban thanks for tracking this |
In Scrapy we have an old issue about this which comes back now and again as people are affected while crawling specific websites. In scrapy/scrapy#5911 I tried to address this with the public API, but things got too messy, and eventually went with monkey-patching. Before we merge that change, we would like to know if you are open to either increasing the default maximum length (e.g from 2**14 to 2**16), or option 2 as discussed here, i.e. provide a public API to change this value. |
I don't know what to say about At http://browsercookielimits.iain.guru/ I see the limit is 4KB per header and the RFC https://datatracker.ietf.org/doc/html/rfc6265#section-6.1 I only see minimum limits. |
Those limits are specifically about the
|
I expect that browser have some limits... they just might be very big ... otherwise you can end up with out of memory errors. Here are the current limits
twisted/src/twisted/web/http.py Lines 2216 to 2217 in c0c6440
we also have maximum line length , that can be increased for HTTP. twisted/src/twisted/protocols/basic.py Line 509 in c0c6440
Right now, both server and client HTTP part share the same HTTP headers parsing code. I think that we can increase the limits. I would go with a conservative value of
What do you think ? |
Sounds good to me. |
Adrian, if you have the time for a PR, I am happy to review it. |
If Twisted Agent receives HTTP header with value that exceeds size limits defined in protocol it will fail and crash.
Expected behavior: if server sends incorrect header client should not fail but simply ignore this header.
To reproduce, create following Twisted server sending header with 100 000 characters in value:
Twisted client trying to reach above server:
Output is:
Searchable metadata
The text was updated successfully, but these errors were encountered: