CA isn't scaling up due to unfamiliar extended resources requests #6804
Comments
Hi @MaciekPytel. I found a similar issue in #3852, but I got the impression from it that CA ignores extended resources that are unfamiliar to it and moves on with scaling up. Has that behavior changed recently?
@x13n any thoughts on this?
CA observes the other nodes in a node group to get an idea of what new nodes will look like. If they contain extended resources that are later required by a pod, a scale-up should get triggered. However, if there are no nodes, or if there's a mix of nodes with and without the extended resource, CA will have no idea about that extended resource and won't trigger a scale-up, because it will think a new node won't help the pod.
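To make the behavior described above concrete, here is a minimal toy model in Python. It is not Cluster Autoscaler's actual code or API: the function names `build_template` and `would_scale_up`, the resource name `example.com/widget`, and the rule "a resource counts only if every observed node has it" are all assumptions chosen to mirror the explanation in this comment.

```python
from typing import Dict, List, Optional

# Resource name -> allocatable quantity (hypothetical units).
Resources = Dict[str, int]


def build_template(nodes: List[Resources]) -> Optional[Resources]:
    """Guess what a new node in the group would look like.

    Returns None when there are no nodes to observe.
    """
    if not nodes:
        return None
    # Assumption of this toy model: keep only resources that every
    # observed node reports, so a mixed group drops the extended resource.
    common = set(nodes[0])
    for node in nodes[1:]:
        common &= set(node)
    return {name: min(node[name] for node in nodes) for name in common}


def would_scale_up(pod_requests: Resources, nodes: List[Resources]) -> bool:
    """Return True if a new node, as predicted, could fit the pod."""
    template = build_template(nodes)
    if template is None:
        return False  # nothing observed, so nothing is known about new nodes
    return all(
        template.get(name, 0) >= amount
        for name, amount in pod_requests.items()
    )


# Hypothetical pod requesting CPU plus one extended resource.
pod = {"cpu": 1000, "example.com/widget": 1}

uniform_group = [{"cpu": 4000, "example.com/widget": 2}]
mixed_group = [{"cpu": 4000, "example.com/widget": 2}, {"cpu": 4000}]
empty_group: List[Resources] = []
```

In this sketch, `would_scale_up(pod, uniform_group)` is true, while the mixed and empty groups both yield false, which matches the "no scale-up" outcome reported in this issue.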
Is there a way to tell CA to ignore the resources it is unaware of?
I don't think so. Why do you ask? What would be the use case for ignoring some resources?
In my situation, there's a controller in place that adds extended resources to new nodes. These extended resources are later requested by some pods, just like the native resources CPU and memory. Let's take an example: imagine I have two nodes in my cluster that are almost maxed out, leaving no free resources for new pods. Now, I want to create a workload Pod that requests extended resources. The CA realizes it can't fit these Pods onto existing nodes due to the resource shortage (in this case, the extended resources like |
If CA ignored extended resources on both pods and nodes, it could scale a node group that won't have the extended resource later on. Or it could also create nodes for pods requesting, say, |
I think DRA would solve this eventually, if the controller that adds the custom resources is migrated to do it via DRA instead. I know there are also some discussions about integrating existing custom resources with DRA "automatically", so it might not even need changes in the controller. The plan is to get Cluster Autoscaler to work with Structured Parameters DRA in Kubernetes 1.31 FWIW. If this is needed earlier, or in older minor versions, a new |
Which component are you using?:
cluster-autoscaler
What version of the component are you using?:
CA: 9.37.0
K8s: 1.29
What k8s version are you using (kubectl version)?:
Client Version: v1.28.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.3-eks-adc7111
What environment is this in?:
Amazon EKS
What did you expect to happen?:
Hi. We have a scenario where extended resources are added to nodes only after the nodes start. However, I see that CA is not creating a node for pods that have extended resources specified in their spec. My understanding was that CA would ignore resources it is unaware of and move on with scaling up, which isn't the case.
What happened instead?:
CA didn't scale the Nodes.
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?: