Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARC Finalizers Cause a deadlock when uninstalling AutoscalingRunnerSet in ArgoCD #3440

Open
4 tasks done
rteeling-evernorth opened this issue Apr 15, 2024 · 9 comments · May be fixed by #3447
Open
4 tasks done

ARC Finalizers Cause a deadlock when uninstalling AutoscalingRunnerSet in ArgoCD #3440

rteeling-evernorth opened this issue Apr 15, 2024 · 9 comments · May be fixed by #3447
Labels
bug Something isn't working gha-runner-scale-set Related to the gha-runner-scale-set mode needs triage Requires review from the maintainers

Comments

@rteeling-evernorth
Copy link

rteeling-evernorth commented Apr 15, 2024

Checks

Controller Version

0.9.0

Deployment Method

ArgoCD

Checks

  • This isn't a question or user support case (For Q&A and community support, go to Discussions).
  • I've read the Changelog before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes

To Reproduce

1. Install the ARC AutoscalingRunnerSet as an ArgoCD app
2. Uninstall ArgoCD app

Describe the bug

The argocd app cannot be deleted because Argo tries to delete resources that are normally deleted by the ARC controller and normal deletion is blocked by the actions.github.com/cleanup-protection finalizer
including:

  • AutoscalingRunnerSet
  • kube mode service account
  • kube mode role
  • rs manager role
  • kube mode role binding
  • rs manager role binding

Describe the expected behavior

The chart should cleanly uninstall when i delete the argocd app

Additional Context

I suspect this may be resolvable by helm/argo annotations for the affected resources, I will test with a fork of the helm chart

Controller Logs

My employer's open source contribution policy forbids me from creating public Gists. I will provide redacted logs upon request via zip/tarball.

Runner Pod Logs

My employer's open source contribution policy forbids me from creating public Gists. I will provide redacted logs upon request via zip/tarball.
@rteeling-evernorth rteeling-evernorth added bug Something isn't working gha-runner-scale-set Related to the gha-runner-scale-set mode needs triage Requires review from the maintainers labels Apr 15, 2024
@rteeling-evernorth rteeling-evernorth changed the title ARC Finalizers Cause a deadlock when unisntalling AutoscalingRunnerSet ARC Finalizers Cause a deadlock when uninstalling AutoscalingRunnerSet in ArgoCD Apr 15, 2024
@rteeling-evernorth
Copy link
Author

I was successfully able to fix the issue using argocd annotations argocd.argoproj.io/sync-options: Delete=false and argocd.argoproj.io/sync-wave: "1", I'll open a PR

@rteeling-evernorth rteeling-evernorth linked a pull request Apr 16, 2024 that will close this issue
@rteeling-evernorth
Copy link
Author

@nikola-jokic could I trouble you for your two cents on this?

@ahatzz11
Copy link

ahatzz11 commented Apr 23, 2024

@rteeling-evernorth Thanks for making an issue for this - I've run into this issue on nearly every upgrade and end up having to turn off syncing, manually delete the finalizers, and then turn syncing back on and it ends up sorting itself out. I also have noticed that even if a new controller comes up (say 0.9.1 when upgrading from 0.9.0) it doesn't seem like that new controller properly handles the finalizers on the scale-set resources that are on the old version.

I do use sync-waves on the top level resources to ensure the gha-runner-scale-set-controller syncs first and the gha-runner-scale-set-figure-linux-standard syncs second:

metadata:
  name: gha-runner-scale-set-controller
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "1" # before scale-set

---

metadata:
  name: gha-runner-scale-set-figure-linux-standard
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "2" # after controller

Looking at your PR you set argocd.argoproj.io/sync-wave: "1" - would that conflict in my case with my controller having the same sync wave? Would it makes sense to go negative on the sync-wave for the autoscalingrunnerset to always ensure that runs first?

@rteeling-evernorth
Copy link
Author

rteeling-evernorth commented Apr 24, 2024

@ahatzz11 Had the same issue with having to manually resolve the finalizer deadlock. I've installed the controller in an entirely separate ArgoCD app, since the cardinality of the scale set controller to scale sets is 1:N. I assume GitHub is distributing the controller and runner set charts separately because of this. You should separate them into separate ArgoCD apps for the sake of ease of maintenance if nothing else.

Argo Sync waves are only relevant within the context of an Argo App, so as long as the two charts are managed by separate apps, you won't see any conflict. Once this is done you just uninstall any app for the scale set, then uninstall the controller and you won't have any deadlock issues.

Hope that helps!

@ahatzz11
Copy link

@rteeling-evernorth Cool thanks for the explanation - I do have each chart managed as separate apps so we should be good to go there!

@rteeling-evernorth
Copy link
Author

rteeling-evernorth commented Apr 30, 2024

@nikola-jokic

What are your thoughts on this issue and corresponding PR?

@ahatzz11
Copy link

ahatzz11 commented May 6, 2024

@nikola-jokic @Link- Is it possible to get this issue/PR looked at for the next release? It would be awesome to have upgrades massively simplified for argocd users.

@Link-
Copy link
Collaborator

Link- commented May 6, 2024

Thanks for adding the appropriate labels. It's on our radar now.

@rteeling-evernorth
Copy link
Author

@ahatzz11 did my change work for you? I'd prefer to not cause an "It works on my machine" sort of problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working gha-runner-scale-set Related to the gha-runner-scale-set mode needs triage Requires review from the maintainers
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants