[Enhancement]: Kube(CPU|Memory)QuotaOvercommit alerts use max number of nodes instead of current ones #984

Open
4 tasks done
afarbos opened this issue Nov 4, 2024 · 0 comments
afarbos commented Nov 4, 2024

What's the general idea for the enhancement?

Currently, both alerts (KubeCPUQuotaOvercommit and KubeMemoryQuotaOvercommit) compute the total available capacity from the current number of nodes in the cluster.
However, these alerts can become very noisy when the current number of nodes is below the maximum number of nodes the cluster can scale to.
Today, many Kubernetes clusters are deployed with the cluster autoscaler or come with it built in (AKS, GKE...).
We should have a way to make these alerts use the maximum number of nodes instead of the current one.

For example, this metric could be helpful: https://github.com/kubernetes/autoscaler/blob/213a8595ea2bddf433dd56e50c31ca868ef1da80/cluster-autoscaler/metrics/metrics.go#L157-L163
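
To illustrate why the current node count makes these alerts noisy, here is a rough sketch of the arithmetic in Python. It is not the alert expression itself: the function name, the 1.5 threshold, the assumption that every node has the same allocatable CPU, and all the numbers are hypothetical and only meant to show the difference between the two denominators.

```python
def overcommit_ratio(quota_total: float, allocatable_per_node: float, node_count: int) -> float:
    """Ratio of the summed namespace quota to the total allocatable capacity.

    Assumes homogeneous nodes (hypothetical simplification): total capacity
    is allocatable_per_node * node_count.
    """
    if node_count <= 0 or allocatable_per_node <= 0:
        raise ValueError("node_count and allocatable_per_node must be positive")
    return quota_total / (allocatable_per_node * node_count)


# Assumed overcommit threshold, for illustration only.
THRESHOLD = 1.5

# Hypothetical cluster: 120 CPU cores of quota across all namespaces,
# 8 allocatable cores per node, scaled down to 6 nodes out of a maximum of 20.
quota_cpu = 120.0
cpu_per_node = 8.0
current_nodes = 6
max_nodes = 20

current_ratio = overcommit_ratio(quota_cpu, cpu_per_node, current_nodes)
max_ratio = overcommit_ratio(quota_cpu, cpu_per_node, max_nodes)

print(f"current nodes: ratio={current_ratio:.2f} fires={current_ratio > THRESHOLD}")
print(f"max nodes:     ratio={max_ratio:.2f} fires={max_ratio > THRESHOLD}")
```

With these made-up numbers the alert would fire when measured against the 6 current nodes, but not against the 20 nodes the autoscaler is allowed to provision, which is the situation described above.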

Please provide any helpful snippets.

No response

What parts of the codebase does the enhancement target?

Alerts

Anything else relevant to the enhancement that would help with the triage process?

No response

I agree to the following terms:

  • I agree to follow this project's Code of Conduct.
  • I have filled out all the required information above to the best of my ability.
  • I have searched the issues of this repository and believe that this is not a duplicate.
  • I have confirmed this proposal applies to the default branch of the repository, as of the latest commit at the time of submission.
skl self-assigned this Nov 5, 2024
skl added the keepalive label (Use to prevent automatic closing) Nov 5, 2024