cluster.yaml can become out of date for killed nodes #175
Comments
You should be able to use a hostname instead of an IP in the address of a Node. Would that also provide a solution for your case?
I don't believe so. When Juju runs in a cloud (AWS, GCP, etc.) we can get an unexpected termination; although rare, it can and does happen. What the issue is asking for is to have one location that tracks where the nodes in a cluster are located, without building another health-check layer on top of dqlite. If a node goes away without any intervention, how do we know the topology of the cluster as it stands? In Juju, if a node goes away it requires manual intervention to re-establish HA. Although we expect someone to have observability alerting to notify them when a node goes away, this isn't always the case. Juju uses controller nodes as a load balancer, which means we could have nodes attempting to communicate with a controller node that doesn't exist for some time. At the bare minimum, we would want to see a
What is the rationale for having the node addresses in the Raft log? Nodes have an ID, so addresses are not used to identify them... We're currently seeing more scenarios where the cluster is brittle, because rescheduled nodes/changing IPs are hosing clusters. The kindest usage scenario would be one in which we can modify cluster.yaml and bounce nodes.
Recording addresses in the raft log is done so that log replication can be used to teach followers about changes to the cluster membership, both during normal operation and when they're joining for the first time or after a long time offline. cluster.yaml just exists to reduce the number of cases where you have to manually tell a node about some current cluster member on startup -- if things aren't changing too rapidly and you come back online after a crash, hopefully at least one of the servers in cluster.yaml is still active. After startup we don't read from cluster.yaml, only refresh it periodically.
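For illustration, a minimal sketch of that startup path, assuming the go-dqlite client package; the file path and the program structure are placeholders, not anything prescribed by the library:

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/canonical/go-dqlite/client"
)

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// cluster.yaml is just a cached list of members; load it as a node store.
	// (Hypothetical path.)
	store, err := client.NewYamlNodeStore("/var/lib/myapp/cluster.yaml")
	if err != nil {
		log.Fatal(err)
	}

	// Try the cached addresses until one answers and points us at the leader;
	// the cache only needs to contain one still-active member for this to work.
	leader, err := client.FindLeader(ctx, store)
	if err != nil {
		log.Fatal(err)
	}
	defer leader.Close()

	// The authoritative membership comes from the leader (i.e. the raft log),
	// not from the file on disk.
	nodes, err := leader.Cluster(ctx)
	if err != nil {
		log.Fatal(err)
	}
	for _, node := range nodes {
		fmt.Printf("id=%d address=%s role=%v\n", node.ID, node.Address, node.Role)
	}
}
```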
Note that changing the IP of a node is currently not supported, unless you reconfigure the cluster manually.
Indeed; this is the reason I ask. We have cases in Juju where we do this, but that's due to topology changes that we're effecting. When it isn't in our control, such as when a node is rescheduled, our options are limited.
If you are talking about k8s, you should be able to assign nodes a stable identity (hostname) with things like StatefulSet. At that point the IP can change at will, since what will be recorded in Raft is the node identity (the hostname).
The cluster.yaml can become out of date if a node in the cluster is removed in a non-programmatic way or without user interaction. A typical scenario could be an OOM'd node, or a restart that gives us a different IP address. In that case, the cluster.yaml will still show the old node even after it has gone away, and even after a substantial amount of time has passed.

Having spoken with @MathieuBordere, a possible solution would be to include a last seen timestamp in the cluster.yaml and have the leader run a goroutine that spots when the last seen timestamp is older than we can work with, then calls client.Remove().

Alternatively, this could be done directly in the app abstraction in the run loop, removing the nodes after a configurable timeout.
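A rough sketch of what such a removal loop could look like, assuming the go-dqlite client package; the reachability probe, the timeout, and the store path are assumptions for illustration, not anything dqlite provides today:

```go
package main

import (
	"context"
	"log"
	"net"
	"time"

	"github.com/canonical/go-dqlite/client"
)

// reapDeadNodes periodically asks the leader for the current membership,
// remembers when each node was last reachable, and removes nodes that have
// been unreachable for longer than the configured timeout.
func reapDeadNodes(ctx context.Context, store client.NodeStore, timeout time.Duration) {
	lastSeen := map[uint64]time.Time{}
	ticker := time.NewTicker(timeout / 4)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
		}

		leader, err := client.FindLeader(ctx, store)
		if err != nil {
			continue // no leader reachable right now, try again later
		}

		nodes, err := leader.Cluster(ctx)
		if err != nil {
			leader.Close()
			continue
		}

		now := time.Now()
		for _, node := range nodes {
			if reachable(node.Address) {
				lastSeen[node.ID] = now
				continue
			}
			seen, ok := lastSeen[node.ID]
			if !ok {
				lastSeen[node.ID] = now // first sighting while unreachable, start the clock
				continue
			}
			if now.Sub(seen) > timeout {
				log.Printf("removing node %d (%s): unreachable since %s", node.ID, node.Address, seen)
				if err := leader.Remove(ctx, node.ID); err != nil {
					log.Printf("remove failed: %v", err)
				} else {
					delete(lastSeen, node.ID)
				}
			}
		}
		leader.Close()
	}
}

// reachable is a stand-in "last seen" check: just try to open a TCP connection.
func reachable(address string) bool {
	conn, err := net.DialTimeout("tcp", address, 2*time.Second)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

func main() {
	// Hypothetical wiring: reuse the same cluster.yaml the app writes.
	store, err := client.NewYamlNodeStore("/var/lib/myapp/cluster.yaml")
	if err != nil {
		log.Fatal(err)
	}
	reapDeadNodes(context.Background(), store, 5*time.Minute)
}
```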