[bug]: k8s.KubernetesException: too old resource version: xxxx (xxxx) on re-connect #793
Labels: bug
Comments
Yes, it did retry, but the stored version is too far behind, and you won't catch up on the missed events:

I've also encountered this issue in my environment.

@buehler we have also encountered this.

+1

@buehler Encountering the same - it seems to build up over time - resolved for us by restarting the pods, which is obviously not ideal.

We are also seeing this issue in all our environments.
Describe the bug
When the operator has been running for a while and the watch of a resource has failed, it correctly tries to re-connect. However, when it tries to re-start the watch of the entity, it can return the error k8s.KubernetesException: too old resource version: xxxx (xxxx). This happens because the operator has been in a re-connecting state long enough that the resource version stored in memory in the operator is too far behind the current resource version of the resource on the API server.
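For illustration, here is a minimal sketch (not the actual KubeOps source) of the failure mode: a watch loop that always resumes from the last resourceVersion it saw. `WatchFromAsync` is a hypothetical placeholder for the real Kubernetes watch call; the key point is that once the stored version has been compacted away on the API server, every reconnect attempt gets HTTP 410 Gone, which the .NET client surfaces as the `k8s.KubernetesException` shown above.

```csharp
using System;
using System.Threading.Tasks;
using k8s; // KubernetesException comes from the KubernetesClient package

public class ResourceWatcher
{
    // Last resourceVersion observed from watch events, held only in memory.
    private string? _lastResourceVersion;

    // Hypothetical placeholder for the actual Kubernetes watch call.
    private Task WatchFromAsync(string? resourceVersion)
        => throw new NotImplementedException();

    public async Task RunAsync()
    {
        while (true)
        {
            try
            {
                // Every reconnect resumes from the stored version. If the watch
                // has been down long enough that this version was compacted on
                // the API server, the request fails with 410 Gone, i.e.
                // "too old resource version: xxxx (xxxx)".
                await WatchFromAsync(_lastResourceVersion);
            }
            catch (KubernetesException)
            {
                // Nothing resets _lastResourceVersion here, so the next attempt
                // sends the same stale version and fails again, indefinitely.
            }

            await Task.Delay(TimeSpan.FromSeconds(5));
        }
    }
}
```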
To reproduce
It is difficult to reproduce because it relies on a transient error to initiate it; however, here is the profile of the issue.
There has to be an initial failure in the watch of a resource. In my case it was "The response ended prematurely while waiting for the next frame from the server. (ResponseEnded)".
Then on each successive re-connection attempt it fails with "too old resource version: 68358497 (68368462)"
Here are the logs, filtered to only what is relevant:
It continues this way forever, until the pod is restarted.
Expected behavior
On such an error, the watcher should start again (regardless of whether the stored resource version is too old) so that the operator doesn't stop reconciling CRs.
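As a rough illustration of that expected behaviour (a sketch only, not necessarily what #792 implements), the catch in the sketch above could drop the stale version so the next attempt starts over instead of looping on the same error:

```csharp
catch (KubernetesException e) when (e.Message.Contains("too old resource version"))
{
    // The stored version no longer exists in the API server's watch history.
    // Forget it so the next attempt starts from a fresh list/watch, which also
    // resynchronizes state to cover any events missed while disconnected.
    _lastResourceVersion = null;
}
```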
Screenshots
No response
Additional Context
We are using the KubeOps.Operator 9.1.2 NuGet package.
I have created a PR with a suggested fix: #792