Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework HTTP message forwarning #1955

Draft
wants to merge 24 commits into
base: master
Choose a base branch
from
Draft

Rework HTTP message forwarning #1955

wants to merge 24 commits into from

Conversation

const-t
Copy link
Contributor

@const-t const-t commented Aug 16, 2023

HTTP part of #1103, HTTP2 part will be implemented in separated PR.

fw/http.h Outdated Show resolved Hide resolved
@const-t const-t marked this pull request as ready for review August 16, 2023 12:47
@const-t const-t marked this pull request as draft August 16, 2023 12:49
fw/http.h Outdated Show resolved Hide resolved
@const-t const-t marked this pull request as ready for review August 16, 2023 14:32
fw/http_msg.c Outdated Show resolved Hide resolved
fw/http_sess.c Outdated Show resolved Hide resolved
fw/http_sess.c Outdated Show resolved Hide resolved
fw/http.c Outdated Show resolved Hide resolved
fw/http_msg.c Outdated Show resolved Hide resolved
Copy link
Contributor

@EvgeniiMekhanik EvgeniiMekhanik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need several small fixes

fw/http.c Outdated Show resolved Hide resolved
Copy link
Contributor

@krizhanovsky krizhanovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just couple of comments

fw/http.h Outdated Show resolved Hide resolved
fw/http.h Outdated Show resolved Hide resolved
fw/http.c Outdated Show resolved Hide resolved
fw/http.c Show resolved Hide resolved
fw/http.c Outdated Show resolved Hide resolved
fw/http.c Show resolved Hide resolved
Copy link
Contributor

@krizhanovsky krizhanovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pull request is an impressive amount of work, but I think there are still things to optimize, maybe we just need some discussions - please have a look onto my comments and let's discuss them.

I reviewed https://tempesta-tech.com/knowledge-base/Modify-HTTP-Messages/ regarding headers appending and it seems now we don't have ,-appending and there is no actual logic in the code for appending. Again - let's discuss.

fw/cache.c Outdated
Comment on lines 2583 to 2584
const bool h2 = TFW_MSG_H2(req), tls = TFW_CONN_TLS(req->conn);
const bool sh_frag = h2 ? false : (true & tls);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this the same as

Suggested change
const bool h2 = TFW_MSG_H2(req), tls = TFW_CONN_TLS(req->conn);
const bool sh_frag = h2 ? false : (true & tls);
const bool sh_frag = !TFW_MSG_H2(req) && TFW_CONN_TLS(req->conn);

?

fw/http_parser.c Show resolved Hide resolved

it->hdrs_len += h_app.len;
return tfw_strcat(req->pool, orig_hdr, &h_app);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now append is used for the -ENOENT check only, i.e. it does not affect any logic, so __h2_req_hdrs() should not accept append.

Moreover, now it seems nobody actually use TfwHdrModsDesc->append, so the whole logic in http.c and http_msg.c could be simplified by removing the flag processing. tfw_http_msg_hdr_xfrm() is also called in only one place with append = 0.

In my understanding now the wiki page is incorrect in (@RomanBelozerov could you also please confirm with tests?): for resp_hdr_add Cache-Control "no-cache" it seems now we get

Cache-Control: no-store
Cache-Control: no-cache

Copy link
Contributor Author

@const-t const-t May 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checking append before calling __h2_req_hdrs() is reasonable and I will do it. But I can't remove TfwHdrModsDesc->append, this flag indicates type of operation: resp_hdr_add or resp_header_set. If append == 0 we use resp_header_set otherwise resp_hdr_add. append means: Header must be appended to message. We can choose another name for the flag if it confusing.

tfw_http_msg_hdr_xfrm_str() has a little bit of garbage code, I'll rewrite it.

fw/http_msg.h Outdated Show resolved Hide resolved
const TfwStr *c, *end;
unsigned int room, skb_room, n_copy, rlen, off, acc = 0;
TfwMsgIter *it = &hm->iter;
TfwPool* pool = hm->pool;

BUG_ON(it->skb->len > SS_SKB_MAX_DATA_LEN);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the function is called a lot of times, so it should be well optimized. I think it makes sense to make if (unlikely(skb_headlen(it->skb))) since even if we have a lot of such skbs, then only the first call hits the condition and all following calls go further. The inner if can eliminate unliekly - it's no sense to have 2 nested unlikely statements.

Also probably it makes sense to add if (likely(TFW_STR_PLAIN(s))) to TFW_STR_FOR_EACH_CHUNK_INIT() and unlikely(room == 0) in the while loop at the below.

fw/http_msg.c Outdated
@@ -1336,32 +1287,30 @@ tfw_http_msg_expand_data(TfwMsgIter *it, struct sk_buff **skb_head,
return 0;
}

static int
tfw_http_msg_alloc_from_pool(TfwHttpTransIter *mit, TfwPool* pool, size_t size)
static char *
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function returns an address in memory, not an array of chars, so I'd propose to make it void *

fw/http.c Outdated
Comment on lines 2532 to 2534
if (req->cleanup) {
__tfw_http_msg_cleanup(req->cleanup);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (req->cleanup) {
__tfw_http_msg_cleanup(req->cleanup);
}
if (req->cleanup)
__tfw_http_msg_cleanup(req->cleanup);

fw/http.c Outdated
req->cleanup = tfw_pool_alloc(hm->pool, sizeof(TfwHttpMsgCleanup));
if (unlikely(!req->cleanup))
return -ENOMEM;
memset(req->cleanup, 0, sizeof(TfwHttpMsgCleanup));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have pages_sz to check whether we have any pages in the data structures, so why do we do memset() instead of just to zeroize pages_sz and maybe skb_head?

if (!cache)
r = tfw_http_msg_expand_from_pool((TfwHttpMsg *)resp, &hdr);
else
r = tfw_http_msg_expand_data(&resp->iter, skb_head, &hdr, NULL);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to add a TODO #634 comment. The function is relatively heavyweight operating with skbs and probably it makes sense to reconstruct all the headers from the cache also into TfwPool allocated area.

Please review #634 (comment) and the whole issue - maybe you have more thoughts after the pull request.

}

if (tfw_http_hdr_sub(hid, hdr, h_mods))
continue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we make the headers substitution here? The function is just traverses the list of the headers to be substituted, but doesn't do anything useful. The same for tfw_http_adjust_resp(). If it looks too big, then let's add a comment in the code and in #634 to do it there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Headers substitution splitted into two parts: first tfw_http_hdr_sub() where we just skip header and don't add it to message. Second tfw_h1_add_loc_hdrs() where we adding all headers. We don't do it in tfw_http_hdr_sub() because in this case we need add additional if condition into tfw_h1_add_loc_hdrs() to skip headers if append == 0 - resp/req_header_set case.

Name tfw_http_hdr_sub() looks little bit confusing, I will rename it.

@const-t const-t marked this pull request as draft April 9, 2024 07:16
even if first had space. We should allocate
another skb only for tls traffic, because
plaint http is zero copy. However, for tls
headers can be encrypted inplace, but body
not, thus we allocate new skb to avoid copy.

There is some performance gain for plain http.

Current master:
wrk -d 10 -c 1000 -t 2 http://debian
Running 10s test @ http://debian
  2 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    13.18ms    7.44ms  49.58ms   83.85%
    Req/Sec    22.40k     1.09k   24.85k    75.52%
  444655 requests in 10.04s, 35.54GB read
Requests/sec:  44283.16
Transfer/sec:      3.54GB

Current commit:
wrk -d 10 -c 1000 -t 2 http://debian
Running 10s test @ http://debian
  2 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    10.16ms  611.25us  22.86ms   87.25%
    Req/Sec    25.45k     2.19k   29.13k    64.58%
  507000 requests in 10.08s, 40.52GB read
Requests/sec:  50316.61
Transfer/sec:      4.02GB
The main goal of this patch is get rid off skb fragmentation that
cause errors during message adjusting. Because `TfwStr` stores pointer
to skb that owns data we have problems when that data moves to another
skb, it happens when we add data to highly fragmented skb and last
fragments move to another skb, that breaks `skb` field of `TfwStr`
which stores pointer to moved data.

I have decided to mirgate to approach that same with used in HTTP2
message transformation. Unlink skbs that contains headers, except the
last one skb, this skb bocomes new response `head`, then copy
necessery headers to `head` using `TfwPool`. During copy adjust
headers.

Response for PURGE request:

Now we just copy response headers to new skb
skipping response body if it exists, otherwise
we just remove headers from current skb using
`tfw_http_msg_cutoff_headers()` and use it as skb_head.

This approach used because
`tfw_http_msg_cutoff_headers()` already is complex
adding more logic to this function overcomplicate
its. Also this function used for both HTTP1 and HTTP2
protocols if we add logic related to dropping the body
of "PURGE responses" it will make an overhead for
each HTTP2 response.
const-t added 20 commits May 22, 2024 12:35
`TfwMsgIter Iter` moved from TfwHttpTransIter to TfwHttpMsg because it
will be used in response/request transformation and adjusting.
Same as response patch.

`TfwHttpMsgCleanup *cleanup` added to TfwHttpReq to have ability
to cleanup request on destruction. In future `old_head` must be
replaced with `cleanup`.
`tfw_http_msg_expand_from_pool` splitted into two functions
first `tfw_http_msg_expand_from_pool` for HTTP1 messages
and second `tfw_h2_msg_expand_from_pool` for HTTP2 messages.
Behaviour changed accordingly with new RFC.

RFC 9112 Section 3.2.2:

"When a proxy receives a request with an absolute-form of
request-target, the proxy MUST ignore the received Host header field
(if any) and instead replace it with the host information of the
request-target. A proxy that forwards such a request MUST generate a
new Host field value based on the received request-target rather than
forward the received Host field value"
Also fix comments.
Parse HTTP 1.1 method and store it in `msg->h_tbl[TFW_HTTP_METHOD]`.
Method will be used during http request forwarding to rebuild the
request.
An ability to append values to headers was removed for HTTP2 requests
as well as earlier for whole HTTP1 and HTTP2 responses. It was removed
because we can't control correctness of appending accordingly to RFC.
Values could be appended to headers that doesn't have comma-separated
list syntax.
`curr_ptr` is not used because we expanding the last fragment of skb
from pool or adding new fragment from pool as well.
TFW_HTTP_B_NEED_STRIP_LEADING_LF and TFW_HTTP_B_NEED_STRIP_LEADING_CR
was used during HTTP1 request forwarding to cutoff first CR LF in skb.
AUTO_SEGS_N not changed after #1103 fix, because
most of responses has large size and with this
value (8) we can cover higher percentage of
responses
Now we use cleanup field instead of `old_head`, cleanup also used
during http1 request adjusting.
Allowed to use unknown method for headers that stored in dynamic table.
Before if tempesta received such request it was dropped.

Unknown method is method that processed as _TFW_HTTP_METH_UNKNOWN.
`tfw_http_add_hdr_clen()` replaced by new function
`tfw_http_resp_term_add_hdr_clen()`. Now it constructs
content-length header into headers table, without
skb modification as it done before. The headers
copies to response in the tfw_http_adjust_resp() along
with other headers.

`tfw_http_msg_hdr_xfrm()` - deleted, because it's un-
used. Some of functions that were used only by
`tfw_http_msg_hdr_xfrm()/tfw_http_msg_hdr_xfrm_str()`
not deleted, because we consider them as library
functions. For instance: `tfw_strcpy_comp_ext()`.

`ss_skb_get_room()` could be lib function, but it's
just a wrapper for `ss_skb_get_room_w_frag()`.

`__hdr_is_singular()` was used during adding headers
configured by administrator, but we treat adding
singular RAW headers via `resp_hdr_add` as
misconfiguration, there is no sense to check it for
each header.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Rewrite proxied Host header when URI comes in the form of absolut-URI resp_hdr_add multiple headers addition
3 participants