I manage a bare metal cluster and suffer from downtime when nodes fail
Worker node failed: the node status becomes NotReady, but traffic is still routed to the pods that were running on that node. This often leads to downtime if critical components were there. I then have to delete the node from the cluster so the pods get rescheduled (see the sketch below). Nothing special for your project.
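For reference, the manual workaround I run today looks roughly like this (the node name is a placeholder):

```sh
# Node shows NotReady, but traffic is still routed to its pods
kubectl get nodes

# Deleting the node removes its pod objects, so the controllers
# reschedule the workloads onto healthy nodes
kubectl delete node worker-3
```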
Control plane node failed: this has happened only once. I have 3 control plane nodes. After one node failed I immediately deleted it from the cluster, but nothing worked because kube-dns stopped resolving internal names. As I understand it, this happened because the remaining etcd daemons could not agree on who becomes the leader. Only `systemctl restart docker` on one of the two working control plane nodes resolved the issue. So, could kube-fencing be used on control-plane nodes as well?
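For context, the recovery looked roughly like this; the etcd endpoint and certificate paths below are from a kubeadm-style layout and are only placeholders for my setup:

```sh
# Check etcd membership and health after the failed control plane node was removed
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  member list

# The only thing that restored internal DNS was restarting docker
# on one of the two surviving control plane nodes
systemctl restart docker
```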