
Teach http::client::make_request() to timeout #2304

Closed

Conversation

Contributor

@xemul xemul commented Jun 26, 2024

This PR adds a send_timeout parameter to the client::make_request() method. When set, the client will not send the request after the given time-point; instead, it resolves with a timeout_error. This parameter does not affect how long the client waits for the server to respond; it only makes sure that the request is sent within the given time, and everything that happens after that can last as long as it wants (we can add a reply_timeout later if we need one).

Now the justification for this timeout. The make_request() method takes its request parameter by value and requires it to be fully set up, including query arguments, headers and the body writer (or the body itself). Before sending the request, the client may wait for a connection to become available -- it can be a newly established connection, or some preexisting connection that's currently serving some other request. This wait may turn out to be "too long" in some cases.

One example of what "too long" means is S3 request signing. Signed requests include, among other headers, one carrying the date and time when the request was signed, and the server imposes a 15-minute timeout on it. If a request is signed but arrives later than that, it's rejected. 15 minutes seems long enough, but sometimes it's not.

A very rare, but still real-life case: each request establishes a new TCP connection because the server actively closes the existing ones. TCP connect() may take more than 1 minute, so 15+ requests in the queue can be enough to invalidate the signature of the tail ones.

Another risky case is Scylla uploading sstables. This happens in 5 MB chunks, and sending a 5 MB body to real S3 sometimes takes seconds, which increases the chance of the connection being closed by the server. With high parallelism this again may invalidate the signature of the last requests in the queue.

This PR allows the sender to bail out of making the request early enough to let the caller re-sign the request and send it again.

closes: #1845 (it was another attempt to overcome this, but this solution looks more generic)
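For illustration, a usage sketch based on the test code quoted later in this thread; the name of the last parameter and its exact final form were still under discussion, so treat this as illustrative only:

```cpp
// Hypothetical call site: give up with a timeout error if the request cannot
// be handed to a connection within 10 minutes (well before the 15-minute
// signature window expires). Host and path are placeholders.
auto fut = cln.make_request(http::request::make("GET", "s3.example.com", "/bucket/object"),
    [] (const http::reply& rep, input_stream<char>&& body) {
        return make_ready_future<>();   // handle the response here
    },
    http::reply::status_type::ok,
    http::experimental::client::clock_type::now() + std::chrono::minutes(10));
```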

And propagate it down the call stack to the interesting places. Next
patch will make the code obey the provided timeout.

Signed-off-by: Pavel Emelyanov <[email protected]>
@xemul xemul requested a review from nyh June 26, 2024 16:23
@xemul xemul force-pushed the br-http-client-timeout-make-request branch from f9aa0e3 to 90bad1f on June 26, 2024 17:29
Contributor Author

xemul commented Jun 26, 2024

upd:

  • fix test leaking a future (and failing debug run)

There are two places that can wait for an undefined amount of time -- the
one that waits for the connection pool to release a connection, and the one
that establishes a new connection.

The former is simple, as it uses a condition variable, so just wait() on
it with the timeout and return a timeout error if it fired.

The latter is trickier. New connections come from the factory, and it's not
guaranteed that a factory obeys the timeout (e.g. see scylladb#2302). So instead
of relying on the factory, check whether the connected_socket appeared soon
enough and return a timeout error if it didn't.

Signed-off-by: Pavel Emelyanov <[email protected]>
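A simplified sketch of the approach described above (not the actual Seastar code; member names follow the diff hunks quoted later in this review, and condition_variable::wait() is assumed to accept an absolute deadline and fail with condition_variable_timed_out):

```cpp
// Both waiting points check the same absolute deadline ("timeout").
future<client::connection_ptr> client::get_connection(clock_type::time_point timeout) {
    if (_nr_connections >= _max_connections) {
        // Waiting for the pool: wait() on the condition variable with the
        // deadline and translate its timeout into the client's timeout error.
        return _wait_con.wait(timeout).then([this, timeout] {
            return get_connection(timeout);
        }).handle_exception_type([] (const condition_variable_timed_out&) {
            return make_exception_future<connection_ptr>(httpd::timeout_error());
        });
    }
    // Making a new connection: the factory is not guaranteed to respect the
    // deadline, so check the clock once it returns.
    return make_connection().then([timeout] (connection_ptr con) {
        if (clock_type::now() >= timeout) {
            return make_exception_future<connection_ptr>(httpd::timeout_error());
        }
        return make_ready_future<connection_ptr>(std::move(con));
    });
}
```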
There are several tests that call client::make_request() with some
simple pre-defined request that's supposed to fail. The next patches will
want to do the same, so prepare a helper function in advance.

Signed-off-by: Pavel Emelyanov <[email protected]>
Two cases to check -- that a factory that times out is handled, and that a
pooled connection wait that times out is handled.

Signed-off-by: Pavel Emelyanov <[email protected]>
@xemul xemul force-pushed the br-http-client-timeout-make-request branch from 90bad1f to 8eedc90 on June 26, 2024 18:12
Contributor Author

xemul commented Jun 26, 2024

upd:

  • fix (maybe, I cannot run this build locally) --with-modules compilation

Contributor

nyh commented Jun 27, 2024

A very rare, but still real-life case: each request establishes a new TCP connection because the server actively closes the existing ones. TCP connect() may take more than 1 minute, so 15+ requests in the queue can be enough to invalidate the signature of the tail ones.

I don't understand this scenario at all :-( First, why would TCP connect() take "more than 1 minute"? This is not a "real life case". Second, if you need to make 15 requests, why would you do them one after another instead of in parallel? I remember when NCSA Mosaic introduced the feature of sending multiple requests (I think it was 4?) to the same server instead of sending them one by one. I think it was 1994 :-)

I think these hypothetical S3 examples are not helping convince reviewers like me that this (and similar) patches are worthwhile. First, because the Seastar HTTP client is not specifically an S3 client. Second, because you aren't explaining why these hypothetical examples actually happen with S3 (or any other HTTP server).

I think a more convincing argument would be that TCP connect() is not inherently limited in time, so it can take a millisecond or retry for 24 hours - which is why every connect() implementation needs the ability to specify a timeout. So Seastar needs a connect_timeout option too.

It makes sense for a connect_timeout to be separate from a total timeout, because the total timeout will usually be proportional to the amount of data and to the network's bandwidth (you can't expect downloading 1GB over a modem to take the same amount of time as downloading 1KB over a LAN), but the connect_timeout is much less dependent on these things and can usually be set to a constant.

Another risky case is Scylla uploading sstables. This happens in 5 MB chunks, and sending a 5 MB body to real S3 sometimes takes seconds, which increases the chance of the connection being closed by the server. With high parallelism this again may invalidate the signature of the last requests in the queue.

Again, the example doesn't make sense to me. If the caller has "high parallelism", they expect all these requests to be done in parallel, not be queued. If complete requests are queued for more than 15 minutes, where do we get the memory for this huge queue?

Anyway, as I said above, I agree that a "connection_timeout" option for a function making an HTTP request makes a lot of sense. I'm just not convinced by the specific examples :-)

Contributor Author

xemul commented Jun 27, 2024

First, why would TCP connect() take "more than 1 minute"?

IIRC, with default Linux sysctls, SYNs are sent with 4 linear timeouts of 1s plus 6 exponential timeouts, which gives 4s + 64s = 68s to connect.

Second, if you need to make 15 requests why would you do them one after another instead of in parallel?

Yes, you're right, we can do them in parallel. I wasn't very clear, let me rephrase.

Client has "max_connections" parameter which user can configure, but it's not guaranteed that the real parallelism will be less-or-equal to that number. I think that's the key point here.

E.g. if max_connections is, say, 10, and the client is asked to do 20 requests, then 10 will be executing and 10 will be in the queue. So my point is -- if there are enough requests in the queue and the ones running in parallel are slow enough, then the last one times out. 15 requests in the queue with the others doing a slow connect is just one example. Another example was below -- some requests in the queue with the others doing slow reads from S3.

I think a more convincing argument would be that TCP connect() is not inherently limited in time, so it can take a millisecond or retry for 24 hours - which is why every connect() implementation needs the ability to specify a timeout.

Erm... The default Linux sysctl configuration doesn't let connect() hang without end. What do you mean by "is not inherently limited in time"?

If the caller has "high parallelism", they expect all these requests to be done in parallel, not be queued.

Well, it might expect that, but since the client's connection pool is limited, it's not guaranteed that all the requests run in parallel. Some will be queued waiting for a connection to become free.

If complete requests are queued for more than 15 minutes, where do we get the memory for this huge queue?

It's not that much memory. E.g. if you do a lot of GETs to serve sstable reads from S3 storage, then each GET request is IIRC ~100 bytes, so even 100 requests in the queue is just 10 KB.

Contributor

@nyh nyh left a comment


In general, I'm in favor of the feature you are adding, but I have many comments about the implementation that I think need to be fixed.

@@ -275,12 +277,13 @@ public:
* \param req -- request to be sent
* \param handle -- the response handler
* \param expected -- the optional expected reply status code, default is std::nullopt
* \param send_timeout -- time point at which make_request will stop waiting for transport
Contributor

We need to be clearer and less fuzzy about what this timeout limits. I would call it a "connect timeout" (not "send_timeout" - I think this name is wrong). I would explicitly say it is a timeout for making the initial connection - beyond establishing the connection, it does not limit the time the request takes - and name this variable "connect_timeout".

Contributor Author

I'm fine with renaming it, but I'd like to emphasize that "connect timeout" doesn't fit well either. As you noticed below, this timeout also applies to the condition variable wait() that waits for an existing connection (from the pool) to become available, and that doesn't imply connecting by any means. From the code's perspective, this timeout separates the time the client spends obtaining a connection (a new one or one from the pool) from the process of sending the request head. So it's sort of a "request head sending timeout", but I have no better idea for its name than that.

Contributor

It is a timeout waiting for a connection - not necessarily a new connection!

This distinction can be explained in text. But the name "send_timeout" suggests it's a timeout until the entire request is sent. This is not what you are doing. Moreover, if the request is very large (e.g., uploading a file), it is very far from what you're doing.

Contributor Author

It is a timeout waiting for a connection - not necessarily a new connection!

🤣 that's a 100% perfect explanation, all the more so because the object the client waits for is indeed class connection. I'll rename it.

@@ -252,15 +252,15 @@ future<client::connection_ptr> client::get_connection() {
}

if (_nr_connections >= _max_connections) {
return _wait_con.wait().then([this] {
return get_connection();
return _wait_con.wait().then([this, timeout] {
Contributor

Why doesn't _wait_con.wait() also obey the "timeout"?

In fact, you explicitly gave this example in your commit message: some highly parallel application where requests sit in a queue for 15 minutes before they even start, i.e. before get_connection() finishes waiting for the connection. So we need to add the 15-minute limit to this wait as well.

Contributor

Oh, I see you fix this in the next patch. Good.
For me, this sort of patch splitting (with no hint in the commit message that you did that) just makes the reviewer's job harder.

Contributor Author

I disagree with the "with no hint" part :D Here's the phrase from this commit's description:

Next patch will make the code obey the provided timeout.

Contributor

I didn't understand this. I thought I saw some of the things already obeying the timeout.

* Client-side exception to report that making request timed out
* prior to communicating to server
*/
class timeout_error : public base_exception {
Contributor

We already have a class with exactly the same name (but different definition) in rpc_types.hh, and the very similar name timed_out_error in include/seastar/core/timed_out_error.hh. All this is very confusing for users of the library.

I think the name of the HTTP exception type should closely follow the name of the HTTP status it sets. I would call it request_timeout_exception. Similar to the not_found_exception and bad_request_exception we already have.

Contributor Author

My bad, I looked at the last exception in this header, which is unexpected_status_error, and named the new class with the _error suffix. Will rename.
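For reference, a sketch of what the renamed exception could look like, following the existing not_found_exception / bad_request_exception pattern (the base_exception constructor arguments here are an assumption, not the actual patch):

```cpp
// Hypothetical shape of the rename nyh suggests; the class in this PR is
// still called timeout_error at this point.
class request_timeout_exception : public base_exception {
public:
    request_timeout_exception()
        : base_exception("Request timed out", reply::status_type::request_timeout) {}
};
```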

@@ -169,6 +170,7 @@ class client {
public:
using reply_handler = noncopyable_function<future<>(const reply&, input_stream<char>&& body)>;
using retry_requests = bool_class<struct retry_requests_tag>;
using clock_type = lowres_clock;
Contributor

I don't like this name "clock_type". It's very unclear what this clock is used for. Future developers may start using it for the wrong things (e.g., if you want to measure a latency histogram, you may not want to use lowres_clock to do this).
I think you should name it "timeout_clock" instead.

Contributor Author

Fair point. I followed the rpc_clock_type "naming convention" for no real reason

return put_connection(std::move(con)).then_wrapped([] (auto f) {
    f.ignore_ready_future();
    return make_exception_future<connection_ptr>(httpd::timeout_error());
});
Contributor

I don't understand why we need this super-rare special case. If you set a 1 minute connection timeout, and the connection was made in 59.999 seconds but then when you want to start sending it's already 60.001 seconds - why do we need to treat it specially? Remember that if we start sending in 59.999, then the timeout is no longer relevant - we could be sending 1 TB of data and the whole thing will succeed.

So I would simply remove this piece of code. If the make_connection() said it finished by the timeout, that's fine - we don't need to second-guess its decision here.

Contributor Author

The factory doesn't know this deadline, and it can easily take much more than 59.999 seconds to produce a new connection. There are two reasons for that.

First -- the call to make a new connection can happen after the client tried serving the request over a pooled one (#1883). E.g. -- the timeout is 1 minute, the client waits 30 seconds for a pooled connection, then starts sending data over it, and after 15 seconds the server closes its end. The client asks for a new connection and it takes 1 minute to establish it. Boom -- 105 seconds have passed.

Second -- we may equip factory::make() with a timeout parameter too, but connect() is not always abortable nowadays (#2302). It's a BUG, but the http client still has the means to handle it.

Contributor

The example you gave looks wrong to me. Remember that the timeout is an absolute time, not "1 minute". So if the client already waited 30 seconds for a pooled connection and then asks for a new connection, this new connection will have the same deadline and will only have 30 more seconds to complete, not one minute.

I don't know what you mean by connect() not being abortable - do you mean it doesn't obey the timeout? Then what's the point of this whole patch? (I also asked this regarding the test!)

Contributor

Ok, I'm starting to think that this specific if() that I didn't like is actually the main part of your patch. Apparently, you really care about the case where the connection attempt takes more than the given amount of time (and you even said in responses to my questions, but not in the cover letter, that you can't control the timeout of a real system connect()). All you care about is that if the connection attempt takes more than the given timeout, the request will not be sent, because it is already considered stale.

So this patch is apparently about not sending stale requests, more than about the ability to time out. I now understand that you consider it fine to wait 10 minutes instead of the 1-minute timeout, as long as after those 10 minutes we don't send the request.

Contributor Author

Yes. The thing is that the client doesn't establish new connections itself; it calls connection_factory::make() and doesn't have any control over it. Of course, we can equip connection_factory::make() with a timeout argument and assume that it doesn't ignore it (the whole client is still experimental, so we're relatively safe changing the API like that), but should the client fulfill its API contract and not send the request even if the factory didn't fulfill its own and missed the timeout?

return seastar::async([sk = std::move(ar.connection)] () mutable {
    input_stream<char> in = sk.input();
    read_simple_http_request(in);
    sleep(std::chrono::seconds(1)).get();
Contributor

Nitpick: although "1 second should be enough for everyone", we always regret this later... Can this be some sort of condition variable or something, to tell this server to just hang forever until released, and not for 1 second?

Contributor Author

Yes, good point. Maybe in this place we can do it.

// ... this should hang waiting for cached connection
auto f2 = cln.make_request(http::request::make("GET", "test", "/test"), [] (const auto& rep, auto&& in) {
    return make_exception_future<>(std::runtime_error("Shouldn't happen"));
}, http::reply::status_type::ok, http::experimental::client::clock_type::now() + std::chrono::milliseconds(100));
Contributor

Again, you are assuming that if the server hangs the previous request for 1 second, then a timeout of 100ms will not be enough to make a new connection. But this is the stuff flaky tests are made of. It would be better to use a condition variable instead of these numbers, or, if that's not possible, I think this needs to be more than a 1-second sleep.

Contributor Author

The client doesn't attempt to make a new connection; max_connections here is 1, so it's waiting for the previous one to return to the pool. I think you're right and we can go with an explicit server kick.
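A sketch of the "explicit server kick" idea, assuming a seastar::promise<> owned by the test body (the release_server name is made up for illustration):

```cpp
seastar::promise<> release_server;   // resolved by the test to let the server proceed

auto server = seastar::async([sk = std::move(ar.connection), &release_server] () mutable {
    input_stream<char> in = sk.input();
    read_simple_http_request(in);
    // Hold the only pooled connection until the test explicitly releases it,
    // instead of sleeping for a fixed 1 second.
    release_server.get_future().get();
});

// ... issue the second make_request() with a short timeout and check that it
// fails with the timeout error ...

release_server.set_value();   // let the server finish so the test can clean up
```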

server_started.get_future().get();
// ... this should hang waiting for cached connection
auto f2 = cln.make_request(http::request::make("GET", "test", "/test"), [] (const auto& rep, auto&& in) {
    return make_exception_future<>(std::runtime_error("Shouldn't happen"));
Contributor

Again, I don't understand why this "shouldn't happen" thing is needed. BOOST_REQUIRE_EXCEPTION already requires an exception, and if there is no exception, it will fail. So you don't need to replace the missing exception with a silly "shouldn't happen" exception.

Contributor Author

It's just because I need to return some future from this lambda. It could just as well be make_ready_future<>(). On the other hand, if something goes wrong and the server sends the response and this place gets erroneously called... I don't know, an exceptional future from here is a fuse against silly mistakes in the future. Methinks.

@@ -954,6 +954,74 @@ SEASTAR_TEST_CASE(test_client_response_parse_error) {
});
}

SEASTAR_TEST_CASE(test_client_request_send_timeout_cached_conn) {
    return seastar::async([] {
        loopback_connection_factory lcf(1);
Contributor

It took me a while to understand why this test is about a "cached connection".
Now I understand: The difference between this test and the next test is that this test waits for the existing requests to finish and release the concurrency limit - i.e., it waits on a condvar - while the next test really starts a TCP connect() and waits on that.

Contributor Author

Yes

explicit delayed_factory(loopback_connection_factory& f) : loopback_http_factory(f) {}
virtual future<connected_socket> make() override {
    return sleep(std::chrono::seconds(1)).then([this] {
        return loopback_http_factory::make();
Contributor

If I understand correctly, what you're doing here isn't causing the TCP connect to time out, but making the connection factory add an artificial delay. So you are NOT testing the ability of this code to time out a very long connect() attempt, but rather testing the extra "check now() after the factory returned" which I said in the previous test you should remove.
So I think this test is wrong and you are not testing the really important feature - the ability to break a TCP connect().

Contributor Author

Yes, that's correct, but this patch is not about breaking TCP connect(), because that doesn't always work (#2302), and the client doesn't mess with it at all; it uses an external factory to generate new connections.

Contributor Author

... and the tests here use test-only loopback sockets, which are not TCP but an in-memory shared queue behind the connected_socket API; they don't have a connect() timeout/abort. Even if they had, it wouldn't be TCP connect().

Contributor Author

While we're here, @nyh, I invite you to review a test that checks the ability to break TCP connect() -- #2301

Contributor

@xemul I thought the whole purpose of this patch was to allow timing out connect() (either an actual Linux connect() or something from a pool). I'm starting to understand this was not the point of this PR.
