use-proxy-protocol for SSL passthrough breaks HTTP #11365
/remove-kind bug
The bug-report template is clear in asking for related information, which you have skipped. For example, why does your issue description imply that an ingress resource using ssl-passthrough is in the same use case as a regular ingress NOT using the ssl-passthrough annotation? It would help to know the details asked for in the new bug-report template, because when you have passed through the controller and terminated TLS directly on the backend pod, why does your issue description imply HTTP/HTTPS termination on the ingress-controller?
/kind support
As far as I know, the only thing I skipped was the reproduction instructions, which I planned to write out and provide tomorrow. To answer your questions: I have several Kubernetes clusters whose APIs are behind this ingress controller, which is what needs SSL passthrough. Separately, I have typical Ingress services, where TLS is terminated on the Ingress controller. I needed to enable SSL passthrough for the K8s APIs, which broke client IP address detection for everything. To fix that, I set the PROXY protocol config options, which fixed the IP address detection but broke HTTP for all Ingress resources. More concisely: configuring the Ingress controller to allow SSL passthrough breaks HTTP for all Ingress resources, including non-passthrough ones.
Thanks for the update. That helps. There is almost no data to analyze or guess from, and the questions are in the new bug-report template. Kindly look at it and edit your issue description to answer them. One example of what is required but you skipped is context on proxy-protocol: there is none. I can see bare metal, and proxy-protocol needs to be enabled on all the hops. So you have not even told us what the LoadBalancer is on which you enabled proxy-protocol, above the controller's Service of --type LoadBalancer. Another example from what the template asks: with the limited info in the issue description, you could not be using proxy-protocol as recommended, because the controller needs the protocol enabled on the LB above it; that is the way the client info is retained. And this cannot be reproduced in a minikube/kind cluster, because MetalLB can NOT use proxy-protocol and other LBs are not well-known, even if there were one that can run on bare metal and also be configured with proxy-protocol.
The LB is irrelevant to this - the proxy protocol is being spoken only within the nginx pod, not by the load balancer. I will update with kind reproduction instructions when I can. The LB in use is MetalLB, but again, that's irrelevant as it is not what is speaking the proxy protocol.
Check the "Current State of the controller:" section, which I collapsed with <details> as the issue template requested. |
This comment is what prompted me to configure the proxy protocol: #8052 (comment)
Thanks. You can review the collapsed details info; I was trying to figure out how you get client info over the LB, but as you can see it's all just a long stream of flat, unformatted text. If you can fix the md formatting, maybe it will become legible. I can guess reasons why you say that proxy-protocol on the LB above the controller is not relevant, but it's not something the project can extensively dig into. So far the docs and the understanding are that the LB above the controller has proxy-protocol enabled, as does the controller; that way the client info is retained across hops. If the LB is an L7 LB, then I think the X-Forwarded headers contain the client info. In any case, I will wait for the reproduce procedure. But if your reproduce procedure does not simulate a real-world use case, like TLS terminating on an LB running L4 proxy-protocol, it will cause a lot of confusion.
I saw the comment you linked just now.
Note the mention of the LB above the controller at these links: https://kubernetes.github.io/ingress-nginx/user-guide/miscellaneous/#proxy-protocol
I have updated the issue with a reproduction in kind, without any LB. Apologies for the formatting of the cluster state section; I forgot to check that before submitting. It has been updated as well.
It appears there is still miscommunication here, and I am wondering how the reproduction failed. I have re-run through the reproduction steps, and found:

- With neither use-proxy-protocol nor --enable-ssl-passthrough, both HTTP and HTTPS succeed.
- With --enable-ssl-passthrough alone, both HTTP and HTTPS still succeed.
- With use-proxy-protocol alone, both HTTP and HTTPS fail.
- With use-proxy-protocol and --enable-ssl-passthrough together, HTTP fails but HTTPS succeeds.

Below is complete terminal output demonstrating this.

15:04:13 dxs@bravo ~ $ kind create cluster
Creating cluster "kind" ...
✓ Ensuring node image (kindest/node:v1.30.0) 🖼
✓ Preparing nodes 📦
✓ Writing configuration 📜
✓ Starting control-plane 🕹️
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
Set kubectl context to "kind-kind"
You can now use your cluster with:
kubectl cluster-info --context kind-kind
Not sure what to do next? 😅 Check out https://kind.sigs.k8s.io/docs/user/quick-start/
15:04:31 dxs@bravo ~ $ kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/baremetal/deploy.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/docs/examples/http-svc.yaml
namespace/ingress-nginx created
serviceaccount/ingress-nginx created
serviceaccount/ingress-nginx-admission created
role.rbac.authorization.k8s.io/ingress-nginx created
role.rbac.authorization.k8s.io/ingress-nginx-admission created
clusterrole.rbac.authorization.k8s.io/ingress-nginx created
clusterrole.rbac.authorization.k8s.io/ingress-nginx-admission created
rolebinding.rbac.authorization.k8s.io/ingress-nginx created
rolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx created
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created
configmap/ingress-nginx-controller created
service/ingress-nginx-controller created
service/ingress-nginx-controller-admission created
deployment.apps/ingress-nginx-controller created
job.batch/ingress-nginx-admission-create created
job.batch/ingress-nginx-admission-patch created
ingressclass.networking.k8s.io/nginx created
validatingwebhookconfiguration.admissionregistration.k8s.io/ingress-nginx-admission created
deployment.apps/http-svc created
service/http-svc created
15:05:16 dxs@bravo ~ $ true # pause here for controller to come up
15:06:37 dxs@bravo ~ $ kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: foo-bar
spec:
ingressClassName: nginx
rules:
- host: foo.bar
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: http-svc
port:
number: 80
EOF
ingress.networking.k8s.io/foo-bar created
15:06:48 dxs@bravo ~ $ POD_NAME=$(k get pods -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx -o NAME|grep controller)
kubectl exec -it -n ingress-nginx $POD_NAME -- curl -IH 'Host: foo.bar' localhost
kubectl exec -it -n ingress-nginx $POD_NAME -- curl -kIH 'Host: foo.bar' https://localhost
HTTP/1.1 200 OK
Date: Fri, 17 May 2024 22:07:07 GMT
Content-Type: text/plain
Connection: keep-alive
HTTP/2 200
date: Fri, 17 May 2024 22:07:07 GMT
content-type: text/plain
strict-transport-security: max-age=31536000; includeSubDomains
15:07:07 dxs@bravo ~ $ kubectl get deployment ingress-nginx-controller -n ingress-nginx -o yaml | yq -Y '.spec.template.spec.containers[0].args += ["--enable-ssl-passthrough"]' | kubectl apply -f -
deployment.apps/ingress-nginx-controller configured
15:07:28 dxs@bravo ~ $ true # pause here for controller to come back up
15:08:08 dxs@bravo ~ $ POD_NAME=$(k get pods -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx -o NAME|grep controller)
kubectl exec -it -n ingress-nginx $POD_NAME -- curl -IH 'Host: foo.bar' localhost
kubectl exec -it -n ingress-nginx $POD_NAME -- curl -kIH 'Host: foo.bar' https://localhost
HTTP/1.1 200 OK
Date: Fri, 17 May 2024 22:08:18 GMT
Content-Type: text/plain
Connection: keep-alive
HTTP/2 200
date: Fri, 17 May 2024 22:08:18 GMT
content-type: text/plain
strict-transport-security: max-age=31536000; includeSubDomains
15:08:18 dxs@bravo ~ $ k -n ingress-nginx rollout undo deployment ingress-nginx-controller
deployment.apps/ingress-nginx-controller rolled back
15:09:54 dxs@bravo ~ $ true # pause here for controller to come back up
15:10:06 dxs@bravo ~ $ kubectl apply -f - <<EOF
apiVersion: v1
data:
forwarded-for-header: proxy_protocol
use-proxy-protocol: "true"
kind: ConfigMap
metadata:
labels:
app.kubernetes.io/component: controller
app.kubernetes.io/instance: ingress-nginx
app.kubernetes.io/name: ingress-nginx
app.kubernetes.io/part-of: ingress-nginx
app.kubernetes.io/version: 1.10.1
name: ingress-nginx-controller
namespace: ingress-nginx
EOF
configmap/ingress-nginx-controller configured
15:10:28 dxs@bravo ~ $ POD_NAME=$(k get pods -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx -o NAME|grep controller)
kubectl exec -it -n ingress-nginx $POD_NAME -- curl -IH 'Host: foo.bar' localhost
kubectl exec -it -n ingress-nginx $POD_NAME -- curl -kIH 'Host: foo.bar' https://localhost
curl: (52) Empty reply from server
command terminated with exit code 52
curl: (35) Recv failure: Connection reset by peer
command terminated with exit code 35
15:10:38 dxs@bravo ~ 35 $ kubectl get deployment ingress-nginx-controller -n ingress-nginx -o yaml | yq -Y '.spec.template.spec.containers[0].args += ["--enable-ssl-passthrough"]' | kubectl apply -f -
deployment.apps/ingress-nginx-controller configured
15:10:50 dxs@bravo ~ $ POD_NAME=$(k get pods -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx -o NAME|grep controller)
kubectl exec -it -n ingress-nginx $POD_NAME -- curl -IH 'Host: foo.bar' localhost
kubectl exec -it -n ingress-nginx $POD_NAME -- curl -kIH 'Host: foo.bar' https://localhost
curl: (52) Empty reply from server
command terminated with exit code 52
HTTP/2 200
date: Fri, 17 May 2024 22:11:17 GMT
content-type: text/plain
strict-transport-security: max-age=31536000; includeSubDomains
15:11:17 dxs@bravo ~ $

This leads me to believe that you did not properly enable ssl-passthrough in the test where you received the error on HTTPS. To be clear, this minimal test does not actually involve an Ingress that uses SSL passthrough; it simply involves the controller being configured to allow it. You are correct that there is no TLS section and no default certificate; this causes the controller to use the default fake certificate, as reflected in the following snippet from
This is correct. When one enables SSL passthrough on the controller, the controller sets up a TCP proxy (in the controller itself, not in nginx) which listens on 443, intercepting all connections and deciding whether they should be routed to nginx (for SSL termination) or directly to the service (for SSL passthrough). This is the proxy that speaks the PROXY protocol to nginx, which in this case is listening on 442, not 443. The problem is that this interception happens only on 443, not on 80. So when use-proxy-protocol is enabled, HTTPS clients go to the TCP proxy, which wraps the connection in the PROXY protocol before hitting nginx on 442, but HTTP clients go directly to nginx on 80 without the PROXY protocol.
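To make the asymmetry concrete, here is a minimal sketch of how one could verify it by hand: with use-proxy-protocol: "true", nginx on port 80 expects every connection to begin with a PROXY protocol v1 header, so a plain request fails while a manually prefixed one succeeds. This assumes a netcat binary is available inside the controller image (it may not be; any pod on the cluster network would work equally well), and the addresses in the header are arbitrary placeholders:

```shell
# Plain HTTP request: nginx tries to parse "GET ..." as a PROXY header and fails.
kubectl exec -n ingress-nginx "$POD_NAME" -- \
  curl -IH 'Host: foo.bar' localhost   # expect: empty reply / connection reset

# Same request with a hand-written PROXY protocol v1 header prepended.
# 203.0.113.10 / 172.16.0.1 are placeholder client/server addresses.
kubectl exec -n ingress-nginx "$POD_NAME" -- sh -c \
  'printf "PROXY TCP4 203.0.113.10 172.16.0.1 40000 80\r\nGET / HTTP/1.1\r\nHost: foo.bar\r\nConnection: close\r\n\r\n" | nc localhost 80'
```

If the second command returns a normal HTTP response while the first fails, that confirms port 80 is parsing the PROXY protocol even though nothing in the HTTP data path actually speaks it.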
Thanks.
Generally speaking, it seems we are trying to establish that with proxy-protocol enabled, if any user enables the --enable-ssl-passthrough flag of the controller executable, then HTTP requests fail. But I am personally confused; below are the reasons. I cannot understand how using the controller pod itself as a client is a valid test that proves what you say. This is because the error message clearly states that the proxy-protocol header is broken, and all I can assume here is that the header is broken for the following reasons. An expert needs to verify this comment, though.
Other confusions for me are below.
My opinion of a valid test:
Phase 1
Phase 2
Phase 3
I am not sure when I can test like this, or even if I can. However, these are my opinions, so let's wait for other comments.
There is a big factor that the test talks about
That is irrelevant here. The use case that this issue is about does not involve an LB that speaks the PROXY protocol. As stated in my previous comment, the thing speaking the PROXY protocol is the controller itself, which listens on 443 and proxies to nginx listening on 442. nginx does not listen on 443 when ssl-passthrough is enabled; the controller's TCP proxy (which speaks the PROXY protocol) does.
Yes, but it is also important to note that the HTTPS request does not fail, indicating that the configuration to use proxy protocol is working.
Correct, because when talking HTTP there is no proxy, whereas when talking HTTPS there is (the TCP proxy on 443). Hence I want either to be able to enable the PROXY protocol on HTTPS without enabling it on HTTP, or to have another built-in TCP proxy on 80. The controller itself as a client is a valid test because, in this use case, there is nothing sitting in front of the controller other than a non-proxy-protocol LB (MetalLB), which has no bearing on this. To reiterate: in my environment, nothing speaks the PROXY protocol outside of the nginx ingress controller pod.
Most of these seem to have no bearing on the result. This should be reproducible with a NodePort service. As for client info, that is not something I am having an issue with here.
This is the exception to the above statement. The only proxy involved here is the built-in TCP proxy; I do not want to deploy another proxy in front of my ingress controller.
The test that you describe here is invalid by nature of having an LB that speaks the PROXY protocol. This issue does not involve an LB that speaks the PROXY protocol, and introducing one creates a very different use case. Reproduction instructions should be minimal, so I did not include any of the complexity of my actual environment that was not required to reproduce the exact same issue. In my actual environment, the network path looks something like this:

client -> MetalLB (L4, no PROXY protocol) -> controller's built-in TCP proxy on 443 (adds the PROXY header) -> nginx on 442

Or for HTTP:

client -> MetalLB (L4, no PROXY protocol) -> nginx on 80 (no PROXY header)

The exact same issue appears in the kind environment which I wrote reproduction instructions for, in the test case I provided, even though the path is much more minimal:

curl (in the controller pod) -> TCP proxy on 443 (adds the PROXY header) -> nginx on 442

Or for HTTP:

curl (in the controller pod) -> nginx on 80 (no PROXY header)
Similarly, this is the reason I did not add an ingress using the ssl-passthrough annotation. The factor that introduces the bug is not whether an ssl-passthrough ingress is present, but whether --enable-ssl-passthrough is passed to the controller. If it is not, nginx listens on 443 and no TCP proxy is started. If it is, nginx listens on 442 and a TCP proxy is started on 443. See here and here. Feel free to run my reproduction instructions with any additional ingress resources; they should have no effect on reproducibility as long as --enable-ssl-passthrough is set and use-proxy-protocol is true.
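For reference, a minimal sketch of what such an ssl-passthrough ingress would look like if one wanted to add it anyway. The annotation is the documented ingress-nginx one; the name, host, and backend service here are made up for illustration:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: k8s-api-passthrough        # hypothetical name
  annotations:
    # Only honored when the controller runs with --enable-ssl-passthrough.
    nginx.ingress.kubernetes.io/ssl-passthrough: "true"
spec:
  ingressClassName: nginx
  rules:
  - host: api.example.internal     # hypothetical host, matched by SNI
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: kubernetes-api   # hypothetical backend that terminates TLS itself
            port:
              number: 443
```

Per the discussion above, adding or omitting this resource should not change whether the bug reproduces.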
Thanks for all the comments.
I will wait for comments from experts. The ingress-nginx community meeting schedule is here: https://github.com/kubernetes/community/tree/master/sig-network . My request is that you kindly join a community meeting (after the 27th of May, 2024) and explain this issue, because the maintainers are in those meetings and they can advise what to do next.
I think you're misunderstanding still; the issue is that HTTP requests to non-ssl-passthrough ingresses fail if ssl-passthrough is enabled on the controller (regardless of whether an ssl-passthrough ingress exists). If it helps clarify, the way I discovered the bug is as follows:

1. I enabled --enable-ssl-passthrough so the Kubernetes API servers behind the controller could use SSL passthrough.
2. That broke client IP detection for everything behind the controller.
3. To fix that, I set the PROXY protocol config options per the comment linked above, which fixed client IP detection.
4. That in turn broke HTTP for all Ingress resources, including non-passthrough ones.
This is confusing me as well, but I think it's simply that this is a relatively uncommon use case, as it requires all the following factors:
And given there is no documentation on the combination of ssl-passthrough and use-proxy-protocol outside of a GitHub issue comment, that also seems to indicate that this is a rare edge case.
Yes, except when there is a proxy-protocol-aware LB in front of ingress-nginx, as that would mask the issue.
So the issue is that setting --enable-ssl-passthrough fundamentally changes how the controller is set up, particularly the data path. Without the flag, the service points directly at nginx for both HTTP and HTTPS. With the flag, the HTTP path remains the same, but the HTTPS path is changed to add the TCP proxy between the service and nginx. This change in the data path is the cause of the issue, hence there is no need to create an ssl-passthrough ingress. I will message you on Slack to set up a Jitsi call.
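One way to see this data-path change directly is to list the listening sockets inside the controller pod with and without the flag. This is only a sketch, assuming the controller image ships a socket-listing tool such as ss or netstat (it may not; the same check could be done from the node's network namespace):

```shell
# Without --enable-ssl-passthrough: expect nginx listening on 80 and 443.
# With the flag: expect the controller binary on 443 and nginx on 80 and 442.
kubectl exec -n ingress-nginx "$POD_NAME" -- sh -c \
  'ss -ltnp 2>/dev/null | grep -E ":(80|442|443) " || netstat -ltnp | grep -E ":(80|442|443) "'
```

The process name attached to each socket shows which component owns which port.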
Hi @virtualdxs, I saw your ping on Slack and responded. Can you check, please?
Your latest update above is informative. I will need to get precise now and start nitpicking to make progress. These are the factors you listed:
Do you acknowledge that bullet point 3 and point 5 above are inter-related and have a unique standing?
In this list of factors, is proxy-protocol being enabled on the ingress-nginx controller implied or not implied?
Cluster and controller state below.
HTTP request to a NON-ssl-passthrough ingress shows response code 200 below.
HTTPS request to a NON-ssl-passthrough ingress shows response code 200 below.
You are absolutely correct, and my sincere apologies for neglecting to mention this in the initial post. I do in fact have my environment configured with
Yes, use-proxy-protocol is enabled, because that is required for preserving the real client IP.
No, because the built-in TCP proxy is in front of it and speaks the PROXY protocol.
Some progress on my testing.
use-proxy-protocol is necessary because without it, the built-in TCP proxy masks the client IP for HTTPS connections. I already use externalTrafficPolicy: Local in my environment, and have done so since before I needed ssl-passthrough.
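For context, a minimal sketch of the relevant part of the controller Service with externalTrafficPolicy: Local (a standard Kubernetes field; the metadata and port layout here are illustrative, not copied from the environment):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  type: LoadBalancer
  # Route only to endpoints on the receiving node, avoiding the SNAT that
  # Cluster mode performs; this preserves the client source IP up to the pod.
  externalTrafficPolicy: Local
  selector:
    app.kubernetes.io/name: ingress-nginx
  ports:
  - name: http
    port: 80
    targetPort: 80
  - name: https
    port: 443
    targetPort: 443
```

Local preserves source IPs as far as the pod, but it cannot help once the controller's own TCP proxy re-originates the connection internally, which is why the PROXY protocol is still needed on the 443-to-442 hop.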
After a long discussion and triaging over DM on Slack, the summary of the issue is as follows.
The observation is that the project does not support use-proxy-protocol without proxy-protocol also being enabled on an L4 LB in front of the controller. But the user expects the real client IP on an on-premises cluster, along with ssl-passthrough enabled on the controller for some ingresses.
/triage-accepted
/remove-triage needs-information
/triage accepted
/assign
This is stale, but we won't close it automatically; just bear in mind the maintainers may be busy with other tasks and will reach your issue ASAP. If you have any question or request to prioritize this, please reach out.
/remove-lifecycle frozen
What happened:
When attempting to do anything requiring HTTP, such as the HTTP-01 ACME challenge, the request fails, and nginx logs a "broken header ... while reading PROXY protocol" error: the plain HTTP request bytes are being parsed as a PROXY protocol header.
This is because I have

use-proxy-protocol: "true"

in the configmap, which is needed for SSL passthrough to work while preserving the client IP address. The TCP proxy on 443 wraps the connection in the PROXY protocol before sending it to 442, so HTTPS works, but there is no proxy on 80.

What you expected to happen:
I configure SSL passthrough and HTTP continues to work
NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):
Kubernetes version (use kubectl version):
Client Version: v1.29.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.4+k3s2
Environment:
Kernel (e.g. uname -a): 5.14.0-362.8.1.el9_3.x86_64

Current State of the controller:
How to reproduce this issue:
Create a kind cluster.
Install the ingress controller and http backend.
Create a plain HTTP ingress for the backend.
Observe that both HTTP and HTTPS succeed.
Set use-proxy-protocol: "true" in the configmap and add --enable-ssl-passthrough to the controller arguments.
Observe that HTTPS succeeds, but HTTP errors with curl: (52) Empty reply from server (condensed commands below).
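Condensed from the full terminal transcript earlier in the thread, the whole reproduction fits in a few commands. The URLs and manifests are exactly the ones used above; the only substitution is kubectl patch in place of the heredoc configmap apply, which should be equivalent:

```shell
kind create cluster
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/baremetal/deploy.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/docs/examples/http-svc.yaml
# ... create the foo.bar ingress from the transcript above, then wait for the controller ...

# Baseline: both requests return 200.
POD_NAME=$(kubectl get pods -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx -o name | grep controller)
kubectl exec -n ingress-nginx "$POD_NAME" -- curl -IH 'Host: foo.bar' localhost
kubectl exec -n ingress-nginx "$POD_NAME" -- curl -kIH 'Host: foo.bar' https://localhost

# Enable the PROXY protocol in the configmap and ssl-passthrough on the controller.
kubectl patch configmap ingress-nginx-controller -n ingress-nginx --type merge \
  -p '{"data":{"use-proxy-protocol":"true","forwarded-for-header":"proxy_protocol"}}'
kubectl get deployment ingress-nginx-controller -n ingress-nginx -o yaml \
  | yq -Y '.spec.template.spec.containers[0].args += ["--enable-ssl-passthrough"]' \
  | kubectl apply -f -

# After the rollout: HTTPS still returns 200, HTTP now fails with curl exit code 52.
kubectl exec -n ingress-nginx "$POD_NAME" -- curl -IH 'Host: foo.bar' localhost
kubectl exec -n ingress-nginx "$POD_NAME" -- curl -kIH 'Host: foo.bar' https://localhost
```

Note that the POD_NAME lookup must be repeated after the rollout, since the controller pod is replaced.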
Anything else we need to know:
As far as I can see, the options for how to fix this are:
1. Allow enabling the PROXY protocol for HTTPS (the internal hop from the TCP proxy on 443 to nginx on 442) without enabling it for plain HTTP on 80.
2. Add a second built-in TCP proxy on 80 when --enable-ssl-passthrough is set, so that HTTP connections are also wrapped in the PROXY protocol before reaching nginx.
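As a purely hypothetical illustration of option 1 (ingress-nginx does not currently expose any such key; the second key below is invented only to show the shape of the idea):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
  # Existing, real key: applies to ALL listeners, which is the problem.
  use-proxy-protocol: "false"
  # Invented, hypothetical key: would apply only to the internal 442 listener
  # fed by the ssl-passthrough TCP proxy, leaving plain HTTP on 80 untouched.
  use-proxy-protocol-https: "true"
```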