Cilium health check fails towards control plane node #15329
Comments
Do you actually need these health checks to pass? Don't the HTTP checks supersede the ICMP checks?
Based on our testing, Cilium reports a node as unreachable if either of the two health checks fails. In our case, because the ICMP health check fails, the node is considered unreachable regardless of whether the HTTP check succeeds. The code that tests node connectivity can be found here (cilium v1.12.9). Based on this code, it is clear that both the ICMP and HTTP health checks must pass for a node to be considered reachable.
As we are building monitoring that includes the corresponding Cilium metric, our baseline for a healthy cluster currently means that every worker node reports 3 unreachable nodes.
Would it be acceptable to open ICMP traffic from the worker SG to the control-plane SG? I can handle a PR that opens the ICMP flow from node to master only when Cilium is enabled, if you'd like.
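To illustrate the behaviour described above, here is a minimal Go sketch of a combined check in which a node counts as reachable only when both probes succeed. The type and function names (`ProbeResult`, `Reachable`, `CountUnreachable`) and the node names are hypothetical and are not taken from Cilium's source; the three control-plane nodes are assumed from the "3 unreachable nodes" baseline mentioned in the comment.

```go
package main

import "fmt"

// ProbeResult holds the outcome of the two probes sent to one node.
// Hypothetical type for illustration; it does not mirror Cilium's own types.
type ProbeResult struct {
	ICMPOK bool // the ICMP echo probe succeeded
	HTTPOK bool // the HTTP probe succeeded
}

// Reachable reports a node as reachable only when both probes pass.
func Reachable(r ProbeResult) bool {
	return r.ICMPOK && r.HTTPOK
}

// CountUnreachable returns how many nodes fail the combined check.
func CountUnreachable(results map[string]ProbeResult) int {
	n := 0
	for _, r := range results {
		if !Reachable(r) {
			n++
		}
	}
	return n
}

func main() {
	// Assumed view from one worker node: ICMP towards the three
	// control-plane nodes is dropped, while HTTP still succeeds.
	results := map[string]ProbeResult{
		"control-plane-1": {ICMPOK: false, HTTPOK: true},
		"control-plane-2": {ICMPOK: false, HTTPOK: true},
		"control-plane-3": {ICMPOK: false, HTTPOK: true},
		"worker-1":        {ICMPOK: true, HTTPOK: true},
		"worker-2":        {ICMPOK: true, HTTPOK: true},
	}
	fmt.Println("unreachable nodes:", CountUnreachable(results)) // prints "unreachable nodes: 3"
}
```

Under this model a successful HTTP probe cannot compensate for a failed ICMP probe, which matches the symptom reported here.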
As mentioned in the PR, we'll open ICMP to allow these checks to pass.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned". In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/kind bug
1. What kops version are you running?
Client version: 1.26.2 (git-v1.26.2)
2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running or provide the Kubernetes version specified as a kops flag.
v1.25.9
3. What cloud provider are you using?
AWS
4. What commands did you run? What is the simplest way to reproduce this issue?
Followed the kOps Getting Started guide with Cilium enabled.
5. What happened after the commands executed?
The cilium-agent running on worker nodes reports that the health check towards control-plane nodes fails.
However, on the cilium-agent running on control-plane nodes, Cilium health checks succeed no matter which node is the target. The cause is that the AWS security group on the control-plane nodes does not allow inbound ICMP requests from worker nodes.
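The asymmetry can be pictured with a small Go model of security group ingress rules. This is only a sketch under stated assumptions: `IngressRule`, `Allows`, the group labels `workers` and `control-plane`, and the rule sets themselves are hypothetical simplifications (ports are ignored), not the rules kOps actually generates.

```go
package main

import "fmt"

// IngressRule is a simplified, hypothetical model of one ingress rule:
// it admits one protocol from one source security group.
type IngressRule struct {
	Protocol string // "tcp", "udp" or "icmp"
	SourceSG string // label of the security group the traffic comes from
}

// Allows reports whether any rule admits the protocol from the source group.
func Allows(rules []IngressRule, protocol, sourceSG string) bool {
	for _, r := range rules {
		if r.Protocol == protocol && r.SourceSG == sourceSG {
			return true
		}
	}
	return false
}

func main() {
	// Assumed ingress rules on the control-plane group: workers may
	// connect over TCP, but no rule admits ICMP from them.
	controlPlaneIngress := []IngressRule{
		{Protocol: "tcp", SourceSG: "workers"},
		{Protocol: "tcp", SourceSG: "control-plane"},
		{Protocol: "icmp", SourceSG: "control-plane"},
	}
	// Assumed ingress rules on the worker group: ICMP is admitted
	// from both groups.
	workerIngress := []IngressRule{
		{Protocol: "tcp", SourceSG: "control-plane"},
		{Protocol: "icmp", SourceSG: "control-plane"},
		{Protocol: "icmp", SourceSG: "workers"},
	}

	fmt.Println("worker -> control plane ICMP:", Allows(controlPlaneIngress, "icmp", "workers"))
	fmt.Println("control plane -> worker ICMP:", Allows(workerIngress, "icmp", "control-plane"))
}
```

In this model, adding `{Protocol: "icmp", SourceSG: "workers"}` to the control-plane rules would make the first check pass, which corresponds to the change requested in this issue.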
6. What did you expect to happen?
kOps should have configured the control-plane security group to allow inbound ICMP requests from worker nodes.
7. Please provide your cluster manifest.
N/A
8. Please run the commands with most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here.
N/A
9. Anything else we need to know?
N/A