Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

metropolis: implement removing nodes #262

Open
fionera opened this issue Aug 31, 2023 · 8 comments
Open

metropolis: implement removing nodes #262

fionera opened this issue Aug 31, 2023 · 8 comments
Assignees
Labels
c/cluster Cluster management and identity (decoupled from k8s) enhancement New feature or request
Milestone

Comments

@fionera
Copy link
Contributor

fionera commented Aug 31, 2023

Currently there is no way to remove a node from a cluster

@fionera fionera added enhancement New feature or request c/cluster Cluster management and identity (decoupled from k8s) labels Aug 31, 2023
@fionera fionera changed the title core: add ability to remove a node metropolis: implement removing nodes Aug 31, 2023
@q3k
Copy link
Contributor

q3k commented Sep 5, 2023

There's two things here:

  1. A proper, safe decomissioning flow, including draining, Kubernetes state sync, etc.
  2. A way to just remove dead nodes and generally 'clean up'.

We mostly need the latter, so that's what we should probably implement first. Then the decomissioning flow will include the latter, either automatically or as part of the process to be done by operators.

@q3k q3k added the good first issue Good for newcomers label Sep 5, 2023
@q3k
Copy link
Contributor

q3k commented Sep 5, 2023

(good first issue for the 'clean up' part)

@q3k q3k self-assigned this Oct 10, 2023
@q3k q3k added this to the OS v1.0 milestone Jan 30, 2024
@fionera
Copy link
Contributor Author

fionera commented Jun 21, 2024

Isnt this already implemented?

@q3k
Copy link
Contributor

q3k commented Jul 2, 2024

Partially - we can remove everything but control plane nodes.

@fionera
Copy link
Contributor Author

fionera commented Jul 4, 2024

We need this to be done until mid-september

@q3k q3k removed the good first issue Good for newcomers label Aug 21, 2024
@q3k
Copy link
Contributor

q3k commented Aug 21, 2024

With https://review.monogon.dev/c/monogon/+/3343 merged we will be able to safely and fully remove nodes from consensus.

A full decommissioning flow is blocked by SPIFFE support and general auth refactor, as that mostly revolves around distributing some revocation lists to make sure a node can't reconnect to the cluster. However, we can still remove non-decommissioned nodes by setting DeleteNodeRequest.SafetyBypassNotDecommissioned.

@q3k q3k assigned jscissr and unassigned q3k Sep 10, 2024
@jscissr
Copy link
Contributor

jscissr commented Sep 10, 2024

After removing etcd membership from a node, etcd panics which takes down the entire node. The panic is in IsLocalMemberLearner, stack trace:

go.etcd.io/etcd/server/v3/etcdserver/api/membership.(*RaftCluster).IsLocalMemberLearner(0xc000879380)
	external/gazelle~~go_deps~io_etcd_go_etcd_server_v3/etcdserver/api/membership/cluster.go:859 +0x26d
go.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).IsLearner(0xc001020c18?)
	external/gazelle~~go_deps~io_etcd_go_etcd_server_v3/etcdserver/server.go:2813 +0x1a
go.etcd.io/etcd/server/v3/etcdserver/api/v3rpc.(*maintenanceServer).Status(0xc001020bb0, {0xc001d38638?, 0xc001d38638?}, 0xc001d38690?)
	external/gazelle~~go_deps~io_etcd_go_etcd_server_v3/etcdserver/api/v3rpc/maintenance.go:228 +0x137
go.etcd.io/etcd/server/v3/etcdserver/api/v3rpc.(*authMaintenanceServer).Status(0x4a5c340?, {0x3122cf8?, 0xc0026f2f00?}, 0x2d0ebd8?)
	external/gazelle~~go_deps~io_etcd_go_etcd_server_v3/etcdserver/api/v3rpc/maintenance.go:306 +0x25

I'm not sure how to fix the panic.

  • We could just return false in IsLocalMemberLearner instead of panicking. This is technically correct in that a node that is no longer a member is also not a learner anymore, but clients might not expect this.
  • We could change the function to return (isLearner bool, ok bool) or similar to force clients to handle the case where the local node is not a member. But this is a breaking change in a public interface and thus also not ideal.
  • Some existing callers of IsLearner first check IsMemberExist to avoid the panic, but that does not completely solve the problem because it's possible that a member is removed between the two calls.

Another way to improve the situation is to move etcd into a separate process such that a panic in etcd does not restart the entire machine. This is #349.

@jscissr
Copy link
Contributor

jscissr commented Sep 26, 2024

Removing consensus nodes is now possible with these changes merged:

https://review.monogon.dev/c/monogon/+/3437
https://review.monogon.dev/c/monogon/+/3438
https://review.monogon.dev/c/monogon/+/3454

There are still some rough edges that could be improved:

Bootstrap node does not stop consensus

The bootstrap node will not stop consensus when it has never rebooted and the role is removed, because the bootstrap data takes priority over the role.

// The only problem is when we remove a ConsensusMember from a node which still
// has BootstrapData lingering from first bootup. However, we currently do not
// support removing consensus roles.

Move leadership

When removing the consensus role from a node which is currently either etcd or curator leader, there is some downtime until a new leader is elected. It would be nicer to first move leadership to another node in this case. The challenge with moving the leadership is that maintenance.MoveLeader can only be called on the etcd leader node, and etcd leadership is currently independent from curator leadership. We could solve that by making the curator leadership follow etcd leadership, which might have performance benefits as well.
See discussion in https://review.monogon.dev/c/monogon/+/3437

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c/cluster Cluster management and identity (decoupled from k8s) enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants