-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework HTTP message forwarning #1955
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need several small fixes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just couple of comments
4933987
to
97db973
Compare
76aeaa3
to
2ce9209
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pull request is an impressive amount of work, but I think there are still things to optimize, maybe we just need some discussions - please have a look onto my comments and let's discuss them.
I reviewed https://tempesta-tech.com/knowledge-base/Modify-HTTP-Messages/ regarding headers appending and it seems now we don't have ,
-appending and there is no actual logic in the code for appending. Again - let's discuss.
fw/cache.c
Outdated
const bool h2 = TFW_MSG_H2(req), tls = TFW_CONN_TLS(req->conn); | ||
const bool sh_frag = h2 ? false : (true & tls); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this the same as
const bool h2 = TFW_MSG_H2(req), tls = TFW_CONN_TLS(req->conn); | |
const bool sh_frag = h2 ? false : (true & tls); | |
const bool sh_frag = !TFW_MSG_H2(req) && TFW_CONN_TLS(req->conn); |
?
|
||
it->hdrs_len += h_app.len; | ||
return tfw_strcat(req->pool, orig_hdr, &h_app); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now append
is used for the -ENOENT
check only, i.e. it does not affect any logic, so __h2_req_hdrs()
should not accept append
.
Moreover, now it seems nobody actually use TfwHdrModsDesc->append
, so the whole logic in http.c and http_msg.c could be simplified by removing the flag processing. tfw_http_msg_hdr_xfrm()
is also called in only one place with append = 0
.
In my understanding now the wiki page is incorrect in (@RomanBelozerov could you also please confirm with tests?): for resp_hdr_add Cache-Control "no-cache"
it seems now we get
Cache-Control: no-store
Cache-Control: no-cache
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checking append before calling __h2_req_hdrs()
is reasonable and I will do it. But I can't remove TfwHdrModsDesc->append
, this flag indicates type of operation: resp_hdr_add or resp_header_set. If append == 0
we use resp_header_set otherwise resp_hdr_add. append
means: Header must be appended to message. We can choose another name for the flag if it confusing.
tfw_http_msg_hdr_xfrm_str()
has a little bit of garbage code, I'll rewrite it.
const TfwStr *c, *end; | ||
unsigned int room, skb_room, n_copy, rlen, off, acc = 0; | ||
TfwMsgIter *it = &hm->iter; | ||
TfwPool* pool = hm->pool; | ||
|
||
BUG_ON(it->skb->len > SS_SKB_MAX_DATA_LEN); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now the function is called a lot of times, so it should be well optimized. I think it makes sense to make if (unlikely(skb_headlen(it->skb)))
since even if we have a lot of such skbs, then only the first call hits the condition and all following calls go further. The inner if
can eliminate unliekly
- it's no sense to have 2 nested unlikely statements.
Also probably it makes sense to add if (likely(TFW_STR_PLAIN(s)))
to TFW_STR_FOR_EACH_CHUNK_INIT()
and unlikely(room == 0)
in the while
loop at the below.
fw/http_msg.c
Outdated
@@ -1336,32 +1287,30 @@ tfw_http_msg_expand_data(TfwMsgIter *it, struct sk_buff **skb_head, | |||
return 0; | |||
} | |||
|
|||
static int | |||
tfw_http_msg_alloc_from_pool(TfwHttpTransIter *mit, TfwPool* pool, size_t size) | |||
static char * |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function returns an address in memory, not an array of chars, so I'd propose to make it void *
fw/http.c
Outdated
if (req->cleanup) { | ||
__tfw_http_msg_cleanup(req->cleanup); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (req->cleanup) { | |
__tfw_http_msg_cleanup(req->cleanup); | |
} | |
if (req->cleanup) | |
__tfw_http_msg_cleanup(req->cleanup); |
fw/http.c
Outdated
req->cleanup = tfw_pool_alloc(hm->pool, sizeof(TfwHttpMsgCleanup)); | ||
if (unlikely(!req->cleanup)) | ||
return -ENOMEM; | ||
memset(req->cleanup, 0, sizeof(TfwHttpMsgCleanup)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have pages_sz
to check whether we have any pages in the data structures, so why do we do memset()
instead of just to zeroize pages_sz
and maybe skb_head
?
if (!cache) | ||
r = tfw_http_msg_expand_from_pool((TfwHttpMsg *)resp, &hdr); | ||
else | ||
r = tfw_http_msg_expand_data(&resp->iter, skb_head, &hdr, NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to add a TODO #634
comment. The function is relatively heavyweight operating with skbs and probably it makes sense to reconstruct all the headers from the cache also into TfwPool allocated area.
Please review #634 (comment) and the whole issue - maybe you have more thoughts after the pull request.
} | ||
|
||
if (tfw_http_hdr_sub(hid, hdr, h_mods)) | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we make the headers substitution here? The function is just traverses the list of the headers to be substituted, but doesn't do anything useful. The same for tfw_http_adjust_resp()
. If it looks too big, then let's add a comment in the code and in #634 to do it there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Headers substitution splitted into two parts: first tfw_http_hdr_sub()
where we just skip header and don't add it to message. Second tfw_h1_add_loc_hdrs()
where we adding all headers. We don't do it in tfw_http_hdr_sub()
because in this case we need add additional if condition into tfw_h1_add_loc_hdrs()
to skip headers if append == 0
- resp/req_header_set case.
Name tfw_http_hdr_sub()
looks little bit confusing, I will rename it.
even if first had space. We should allocate another skb only for tls traffic, because plaint http is zero copy. However, for tls headers can be encrypted inplace, but body not, thus we allocate new skb to avoid copy. There is some performance gain for plain http. Current master: wrk -d 10 -c 1000 -t 2 http://debian Running 10s test @ http://debian 2 threads and 1000 connections Thread Stats Avg Stdev Max +/- Stdev Latency 13.18ms 7.44ms 49.58ms 83.85% Req/Sec 22.40k 1.09k 24.85k 75.52% 444655 requests in 10.04s, 35.54GB read Requests/sec: 44283.16 Transfer/sec: 3.54GB Current commit: wrk -d 10 -c 1000 -t 2 http://debian Running 10s test @ http://debian 2 threads and 1000 connections Thread Stats Avg Stdev Max +/- Stdev Latency 10.16ms 611.25us 22.86ms 87.25% Req/Sec 25.45k 2.19k 29.13k 64.58% 507000 requests in 10.08s, 40.52GB read Requests/sec: 50316.61 Transfer/sec: 4.02GB
The main goal of this patch is get rid off skb fragmentation that cause errors during message adjusting. Because `TfwStr` stores pointer to skb that owns data we have problems when that data moves to another skb, it happens when we add data to highly fragmented skb and last fragments move to another skb, that breaks `skb` field of `TfwStr` which stores pointer to moved data. I have decided to mirgate to approach that same with used in HTTP2 message transformation. Unlink skbs that contains headers, except the last one skb, this skb bocomes new response `head`, then copy necessery headers to `head` using `TfwPool`. During copy adjust headers. Response for PURGE request: Now we just copy response headers to new skb skipping response body if it exists, otherwise we just remove headers from current skb using `tfw_http_msg_cutoff_headers()` and use it as skb_head. This approach used because `tfw_http_msg_cutoff_headers()` already is complex adding more logic to this function overcomplicate its. Also this function used for both HTTP1 and HTTP2 protocols if we add logic related to dropping the body of "PURGE responses" it will make an overhead for each HTTP2 response.
`TfwMsgIter Iter` moved from TfwHttpTransIter to TfwHttpMsg because it will be used in response/request transformation and adjusting.
Same as response patch. `TfwHttpMsgCleanup *cleanup` added to TfwHttpReq to have ability to cleanup request on destruction. In future `old_head` must be replaced with `cleanup`.
`tfw_http_msg_expand_from_pool` splitted into two functions first `tfw_http_msg_expand_from_pool` for HTTP1 messages and second `tfw_h2_msg_expand_from_pool` for HTTP2 messages.
Behaviour changed accordingly with new RFC. RFC 9112 Section 3.2.2: "When a proxy receives a request with an absolute-form of request-target, the proxy MUST ignore the received Host header field (if any) and instead replace it with the host information of the request-target. A proxy that forwards such a request MUST generate a new Host field value based on the received request-target rather than forward the received Host field value"
Also fix comments.
Parse HTTP 1.1 method and store it in `msg->h_tbl[TFW_HTTP_METHOD]`. Method will be used during http request forwarding to rebuild the request.
An ability to append values to headers was removed for HTTP2 requests as well as earlier for whole HTTP1 and HTTP2 responses. It was removed because we can't control correctness of appending accordingly to RFC. Values could be appended to headers that doesn't have comma-separated list syntax.
`curr_ptr` is not used because we expanding the last fragment of skb from pool or adding new fragment from pool as well.
TFW_HTTP_B_NEED_STRIP_LEADING_LF and TFW_HTTP_B_NEED_STRIP_LEADING_CR was used during HTTP1 request forwarding to cutoff first CR LF in skb.
AUTO_SEGS_N not changed after #1103 fix, because most of responses has large size and with this value (8) we can cover higher percentage of responses
Now we use cleanup field instead of `old_head`, cleanup also used during http1 request adjusting.
Allowed to use unknown method for headers that stored in dynamic table. Before if tempesta received such request it was dropped. Unknown method is method that processed as _TFW_HTTP_METH_UNKNOWN.
`tfw_http_add_hdr_clen()` replaced by new function `tfw_http_resp_term_add_hdr_clen()`. Now it constructs content-length header into headers table, without skb modification as it done before. The headers copies to response in the tfw_http_adjust_resp() along with other headers. `tfw_http_msg_hdr_xfrm()` - deleted, because it's un- used. Some of functions that were used only by `tfw_http_msg_hdr_xfrm()/tfw_http_msg_hdr_xfrm_str()` not deleted, because we consider them as library functions. For instance: `tfw_strcpy_comp_ext()`. `ss_skb_get_room()` could be lib function, but it's just a wrapper for `ss_skb_get_room_w_frag()`. `__hdr_is_singular()` was used during adding headers configured by administrator, but we treat adding singular RAW headers via `resp_hdr_add` as misconfiguration, there is no sense to check it for each header.
HTTP part of #1103, HTTP2 part will be implemented in separated PR.