1. What kops version are you running? The command kops version will display
this information.
1.28.4
2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.
1.28.10
3. Relevant cluster manifest portion:
metricsServer:
  enabled: true
  insecure: false
9. Anything else do we need to know?
If the metrics server addon is enabled and there are fewer than two non-master nodes, cluster validation fails indefinitely:
I0521 18:26:05.985979 76254 instancegroups.go:563] Cluster did not pass validation, will retry in "30s": system-cluster-critical pod "metrics-server-5c45c474f5-t8ppf" is pending.
The reason is that the deployment manifest specifies two replicas, and its topology spread constraints are defined in such a way that these two replicas must run on different nodes.
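For illustration, a constraint of roughly the following shape would produce this behavior. This is a sketch, not a copy of the manifest shipped with kops: the label `k8s-app: metrics-server` and the exact field values are assumptions, and the container spec is omitted.

```yaml
# Illustrative fragment of a Deployment spec (assumed values, not the kops manifest).
spec:
  replicas: 2
  template:
    metadata:
      labels:
        k8s-app: metrics-server          # assumed label
    spec:
      priorityClassName: system-cluster-critical
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          # DoNotSchedule makes the constraint hard: a pod that would
          # push the skew above maxSkew is left Pending.
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              k8s-app: metrics-server
```

With one worker and one tainted master, the master still counts as a topology domain holding zero matching pods, so placing the second replica on the worker would give a skew of 2, which exceeds `maxSkew: 1`.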
If we have only one non-master node, one of the pods stays in the Pending state forever, because the only other node is the master, which is tainted, and metrics-server has no matching toleration.
Since a cluster with just one worker node and the metrics server addon enabled is a valid use case, manifests that prevent the cluster from validating successfully in this scenario (e.g., on kops rolling-update cluster) should be considered a bug.
An ideal solution would be to make both the number of replicas and the maxSkew parameter configurable in the cluster spec. A less ideal one would be to make only one of them configurable, or to hardcode relaxed topology spread constraints permanently and call it a day.
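As a sketch of what "relaxed" could mean here, the hostname constraint could be turned into a soft preference. This is one possible shape under the same assumptions as the issue text (the label is hypothetical, and this is not a proposed patch to the actual kops manifest):

```yaml
# Illustrative fragment: soft spreading instead of a hard requirement.
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    # ScheduleAnyway turns the constraint into a scheduling preference,
    # so both replicas can land on the single worker node if necessary.
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        k8s-app: metrics-server          # assumed label
```

The trade-off is that on larger clusters the replicas are no longer guaranteed to end up on different nodes, only preferred to.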
Another approach is to stop treating metrics-server pods as system-cluster-critical, because they aren't all that critical really.
/kind bug