✨ Fix leader election issue with workspace controller and other KCP Controllers #3111
Conversation
Adding the `do-not-merge/release-note-label-needed` label because no release-note block was detected; please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Hi @sankar17. Thanks for your PR. I'm waiting for a kcp-dev member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test`. Once the patch is verified, the new status will be reflected by the `ok-to-test` label.
```go
if err := s.registerController(&controllerWrapper{
	Name: workspace.ControllerName,
	Runner: func(ctx context.Context) {
		workspaceController, err := workspace.NewController(
```
This issue happens with every controller; we need to do it for the rest of the controllers as well.
After this PR popped up I was wondering why it would only affect specific controllers, this seemed like a fairly fundamental flaw in how I implemented leader election.
@sankar17 is this something that you want to refactor at a bigger scale to address the architectural problem or would you like me to take a look?
@embik This has to be fixed for all the controllers. Initially we got stuck when workspace creation stopped working in our longevity / overnight tests with 200 workspace creations and deletions. The kcp metrics showed requests queued to all 3/3 pods, even though leader election is enabled.
Currently I am working on changing this logic for all the controllers and need to do more tests.
If you think this is going to be fixed at the architectural level, please take a look at it.
CC: @yastij @palnabarun
Okay, taking a quick look at where queues are being started I think the approach in this PR is overall fine. @sankar17 feel free to carry on, we should change it for all controllers. I can also take over if you don't have time.
@embik I made changes to all the controllers and am currently testing the fix. If all is good, I will update this PR in a few days.
/ok-to-test
/retest
/test pull-kcp-test-e2e
/retest
@sankar17 @ramramu3433 these failures across the e2e test board don't seem like flakes to me. Does
/retest
I will test and update.
@sankar17 Please consider not running retests when tests are failing consistently, at least not without pushing any code changes. Those retests burn CI cycles for no real reason; we already know the tests don't pass.
Sure, I will make sure it works locally before retesting.
Thanks!
```go
// APIBinding indexers
indexers.AddIfNotPresentOrDie(s.KcpSharedInformerFactory.Apis().V1alpha1().APIBindings().Informer().GetIndexer(), cache.Indexers{
	indexers.APIBindingsByAPIExport: indexers.IndexAPIBindingByAPIExport,
})

// APIExport indexers
indexers.AddIfNotPresentOrDie(s.KcpSharedInformerFactory.Apis().V1alpha1().APIExports().Informer().GetIndexer(), cache.Indexers{
	indexers.ByLogicalClusterPathAndName: indexers.IndexByLogicalClusterPathAndName,
	indexAPIExportsByAPIResourceSchema:   indexAPIExportsByAPIResourceSchemasFunc,
})
indexers.AddIfNotPresentOrDie(s.CacheKcpSharedInformerFactory.Apis().V1alpha1().APIExports().Informer().GetIndexer(), cache.Indexers{
	indexers.ByLogicalClusterPathAndName: indexers.IndexByLogicalClusterPathAndName,
	indexAPIExportsByAPIResourceSchema:   indexAPIExportsByAPIResourceSchemasFunc,
})
```
What is the reason for moving the indexer setup out of the controller creation to here (this happens in other parts of the code, so this is more of a placeholder reference for all those changes)?
The core reason we moved registration of indexers outside of `controller.NewFooController` is that, with this change, all of the `controller.NewFooController` calls are inside the `Runner` function, which is called every time a KCP pod becomes the leader. `indexers.AddIfNotPresentOrDie` is supposed to run only once in the lifecycle of a KCP pod, because otherwise it will panic.
The mental model is:
- `installFooController`, outside of `Runner`, has the steps that can or need to run only once in the lifecycle of the KCP pod.
- `Runner` has all the steps that can be run multiple times in the lifetime of a pod.
Note: the same KCP pod can lose leader election and get elected again in due course. The changes in this PR enable that scenario.
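That install-once / run-per-term split can be sketched as follows. This is a hypothetical, stdlib-only model (the names `controllerWrapper`, `installFooController`, and the indexer counter are illustrative stand-ins for the real kcp code), showing that the Runner may execute once per leadership term while indexer registration happens exactly once per process:

```go
package main

import (
	"context"
	"fmt"
)

type controllerWrapper struct {
	Name   string
	Runner func(ctx context.Context)
}

// installFooController performs one-time setup and returns the wrapper.
// Registering indexers here is safe because this function runs exactly
// once in the lifetime of the process. The counter stands in for a call
// like indexers.AddIfNotPresentOrDie(...).
func installFooController(registered *int) *controllerWrapper {
	*registered++
	return &controllerWrapper{
		Name: "foo",
		Runner: func(ctx context.Context) {
			// Everything here must be safe to run repeatedly: the
			// controller (and its workqueue) is constructed fresh
			// for every leadership term.
			fmt.Println("foo controller started for this leadership term")
			<-ctx.Done()
		},
	}
}

func main() {
	registered := 0
	c := installFooController(&registered) // once per process

	// Simulate gaining and losing leadership twice: the Runner runs
	// twice, but the indexer registration happened only once.
	for term := 0; term < 2; term++ {
		ctx, cancel := context.WithCancel(context.Background())
		cancel() // immediately "lose" leadership for the sketch
		c.Runner(ctx)
	}
	fmt.Println("indexer registrations:", registered)
}
```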
@embik this is a little complicated to wrap one's head around in one go. Maybe we can explain it on the community call later.
@palnabarun I first understood `indexers.AddIfNotPresentOrDie` to be idempotent since it filters out already-present indexers (that's why I asked), but my feeling is the problem isn't that part of the logic; it's that calling `indexer.AddIndexers` even with an empty list of indexers (because `indexers.AddIfNotPresentOrDie` filtered out all existing ones) will panic when the store is already started -- do I get this right? Or is the problem that it's called with an empty list of indexers (because they all already exist)?
I won't be on the next community call unfortunately, but I don't mind anyone else reviewing this and discussing it there either if that works better.
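The distinction raised in this question can be made concrete with a small model. This is a hedged, stdlib-only sketch, not the real client-go or kcp code: it assumes the underlying store rejects any `AddIndexers`-style call after start, and shows that if the helper skips the underlying call entirely when the filtered set is empty, repeated registration is harmless; if it called through unconditionally, it would fail after start:

```go
package main

import "fmt"

// store models a cache store that rejects new indexers once started
// (an illustrative assumption about the behavior discussed above).
type store struct {
	started  bool
	indexers map[string]func(obj any) []string
}

func (s *store) addIndexers(ix map[string]func(obj any) []string) error {
	if s.started {
		return fmt.Errorf("cannot add indexers to a started store")
	}
	for name, fn := range ix {
		s.indexers[name] = fn
	}
	return nil
}

// addIfNotPresentOrDie filters out names that already exist and panics
// on error. Because it returns early when nothing is missing, calling it
// again after start with already-registered indexers is a safe no-op.
func addIfNotPresentOrDie(s *store, ix map[string]func(obj any) []string) {
	missing := map[string]func(obj any) []string{}
	for name, fn := range ix {
		if _, ok := s.indexers[name]; !ok {
			missing[name] = fn
		}
	}
	if len(missing) == 0 {
		return // skip the underlying call entirely
	}
	if err := s.addIndexers(missing); err != nil {
		panic(err)
	}
}

func main() {
	s := &store{indexers: map[string]func(obj any) []string{}}
	byName := func(obj any) []string { return []string{fmt.Sprint(obj)} }

	addIfNotPresentOrDie(s, map[string]func(obj any) []string{"byName": byName})
	addIfNotPresentOrDie(s, map[string]func(obj any) []string{"byName": byName}) // no-op

	s.started = true
	addIfNotPresentOrDie(s, map[string]func(obj any) []string{"byName": byName}) // still a no-op
	fmt.Println("indexers registered:", len(s.indexers))
}
```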
Overall this PR looks good to me but it needs:
- Squashed commits.
- DCO sign-off on all commits.
- A release note in the PR description.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
/test pull-kcp-verify
/retest-required
```go
	indexAPIExportEndpointSliceByAPIExport  = "indexAPIExportEndpointSliceByAPIExport"
	indexAPIExportEndpointSlicesByPartition = "indexAPIExportEndpointSlicesByPartition"
)
```
Can we qualify these more? `indexWhatByWhat`?
```go
	return []string{}, nil
}
```
I would move all of these into an index.go; this file is getting big.
Sure, we will do that.
```go
	kubeClusterClient,
	s.KcpSharedInformerFactory.Core().V1alpha1().LogicalClusters(),
	s.KubeSharedInformerFactory.Rbac().V1().ClusterRoleBindings(),
)
```
🔴 What are the semantics here now in the Runner? Do we ensure the informer factory is started after all the runners? Looking at the code, this can't be the case: `controller.Start` blocks. This means we have a race where individual informers are not started yet if the factory is faster than the runners.
In a few controllers, the wait-for-informer-cache-sync logic is implemented using a wait method before start. Can we just add this for all the controllers whose informers need to be synced before they start?
CC: @yastij @palnabarun
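The wait-before-start pattern suggested here looks roughly like the following. This is a stdlib-only sketch under stated assumptions: the real code would use `cache.WaitForCacheSync` from client-go with informer `HasSynced` functions; here a polling loop and an `atomic.Bool` stand in for them:

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
	"time"
)

// waitForCacheSync blocks until every sync check reports true, or the
// context is cancelled. It models the controller pattern of gating
// worker startup on informer cache sync.
func waitForCacheSync(ctx context.Context, synced ...func() bool) bool {
	ticker := time.NewTicker(10 * time.Millisecond)
	defer ticker.Stop()
	for {
		all := true
		for _, s := range synced {
			if !s() {
				all = false
				break
			}
		}
		if all {
			return true
		}
		select {
		case <-ctx.Done():
			return false
		case <-ticker.C:
		}
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()

	// Stands in for an informer's HasSynced check.
	var cacheSynced atomic.Bool
	go func() {
		time.Sleep(30 * time.Millisecond) // informer fills its cache
		cacheSynced.Store(true)
	}()

	if !waitForCacheSync(ctx, cacheSynced.Load) {
		fmt.Println("caches never synced; not starting workers")
		return
	}
	fmt.Println("caches synced; starting workers")
}
```

Gating every controller's workers on this kind of check would remove the start-order race regardless of whether the factory or the runners win.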
@sankar17: The following tests failed; say `/retest` to rerun all failed tests or `/retest-required` to rerun all mandatory failed tests.
PR needs rebase.
This is implemented with an alternative approach in #3132.
This PR addresses the following:
Change the workspace controller's start logic inside runners to fix the leader election issue.
The way we register controllers and define the runner is problematic: the runner only calls start, but when leader election is lost, start finishes (it was waiting on `<-ctx.Done()`), which lets the deferred `queue.ShutDown()` run. Once you shut down a queue, there is no way to restart it.
Background:
At times workspace creation got stuck in the scheduling phase and never recovered. Despite leader election being enabled, requests/events were queued to the leader and the other pods as well, which made the queue depth grow.
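The failure mode and the fix can be modeled in a few lines. This is a hedged sketch, not the real workqueue: it only captures the one property that matters here, namely that a shut-down queue never accepts work again, so a queue shared across leadership terms dies after the first deferred shutdown, while a queue constructed inside the Runner starts each term fresh:

```go
package main

import "fmt"

// queue models the relevant property of a controller workqueue: once
// shut down, it never accepts items again (illustrative, not the real
// client-go implementation).
type queue struct {
	items    []string
	shutDown bool
}

func (q *queue) Add(item string) {
	if q.shutDown {
		return // items added after shutdown are silently dropped
	}
	q.items = append(q.items, item)
}

func (q *queue) ShutDown() { q.shutDown = true }
func (q *queue) Len() int  { return len(q.items) }

// runTerm models one leadership term: start returns when leadership is
// lost, and the deferred ShutDown runs, permanently killing the queue.
func runTerm(q *queue) {
	defer q.ShutDown()
	q.Add("event")
}

func main() {
	// Broken pattern: one queue shared across leadership terms.
	shared := &queue{}
	runTerm(shared)
	before := shared.Len()
	shared.Add("event-after-regaining-leadership") // dropped
	fmt.Println("shared queue grew after re-election:", shared.Len() > before)

	// Fixed pattern: the Runner constructs a new controller and queue
	// each time leadership is gained.
	fresh := &queue{}
	runTerm(fresh)
	fresh = &queue{} // fresh queue for the new term
	fresh.Add("event-after-regaining-leadership")
	fmt.Println("fresh queue accepted work after re-election:", fresh.Len() == 1)
}
```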