I suspect that the nodes get stuck during an automatic kernel upgrade. When I reboot a stuck node, I can observe that the nodes continue to upgrade the kernel version automatically.
treylade changed the title from "[Bug]: Nodes randomly booting into Maintenance-Mode" to "[Bug]: Nodes randomly booting into maintenance-mode and stuck in NotReady state" on Oct 11, 2024
Description
Hello,
My kube.tf config generally works perfectly fine, and I am very pleased with the solution.
Recently I observed that individual nodes randomly go into the state "NotReady,SchedulingDisabled". It is usually just a single node at a time.
Once this happens, I no longer have SSH access to that node because its network is unreachable. A manual reboot solves the issue, but this cannot be the permanent solution: I have to rely on my nodes staying in the "Ready" state, and the issue is not healed automatically.
I reached out to Hetzner support to find out whether this is an issue on their side. They answered that the node had booted into "Maintenance mode", which usually happens due to kernel- or filesystem-related issues.
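To make the symptom easier to catch, here is a minimal sketch of how the affected nodes could be detected from outside the cluster. It is not part of kube-hetzner; it assumes `kubectl` is installed and configured for the cluster, and the function names `not_ready_nodes` and `fetch_node_list` are made up for illustration. It only reports nodes whose `Ready` condition is not `True` and does not reboot anything.

```python
import json
import subprocess


def not_ready_nodes(node_list: dict) -> list[str]:
    """Return the names of nodes whose Ready condition is not "True".

    `node_list` is the parsed output of `kubectl get nodes -o json`.
    """
    names = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        ready = next((c for c in conditions if c.get("type") == "Ready"), None)
        # A missing Ready condition, or a status of "False" or "Unknown",
        # is treated as NotReady.
        if ready is None or ready.get("status") != "True":
            names.append(node["metadata"]["name"])
    return names


def fetch_node_list() -> dict:
    """Query the cluster through kubectl (assumed to be on PATH and configured)."""
    result = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    for name in not_ready_nodes(fetch_node_list()):
        print(f"NotReady: {name}")
```

Run periodically (for example from cron on a machine outside the cluster), this would at least flag a stuck node early, although it does not address why the node boots into maintenance mode.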
Kubernetes version: v1.30.5+k3s1
Kernel version: 6.11.0-1-default
I am using Longhorn as storage provider.
Do you have any experience with this issue?
Best regards,
Lars
Kube.tf file
Screenshots
No response
Platform
Mac