How to provision/deprovision EC2 nodes dynamically for Airflow Workers? #735
-
I am using Airflow in my AWS EKS cluster with the KubernetesExecutor. Some of my DAGs run an ML training task once a week in an Airflow worker; by default, each worker runs as a Kubernetes Pod. The training requires quite a lot of vCPUs and memory (e.g., 24 vCPUs, 64 GiB), but doesn't take much time (it finishes in about an hour). So I want the KubernetesExecutor to request an EC2 node that meets these requirements (e.g., m5.8xlarge) when the DAG is triggered, and to de-provision (terminate) the node after the task finishes. I don't want an m5.8xlarge instance staying up in my cluster all the time just for one hour of training per week. Is this possible? It would be perfect if I could choose and configure a different Operator for each DAG, since not all DAGs run ML training tasks, and if I could freely provision and de-provision the nodes on which the workers (Kubernetes Pods) temporarily run. Where in values.yaml should I change this?
-
@sunhongmin225 I actually answered a very similar question in #722 (comment); I recommend taking a look at that answer. There are a few ways you can achieve this with the current chart, but in the future I am going to make a new autoscaling feature available, which should make this much easier.
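For reference, one pattern that works today (a sketch, not chart-specific syntax): create a dedicated EKS node group of m5.8xlarge instances with a minimum size of 0, let Cluster Autoscaler (or Karpenter) scale it from zero, and give only the ML training task a pod override that targets that node group. With the KubernetesExecutor this can be attached per task via `executor_config` / `pod_override`, so other DAGs are unaffected. The label and taint names below (`workload: ml-training`) are purely illustrative assumptions; they must match whatever you configure on the node group.

```yaml
# Hypothetical pod override for the ML training task only.
# "workload: ml-training" is an assumed label/taint -- it must match
# the label and taint set on your scale-from-zero EKS node group.
spec:
  nodeSelector:
    workload: ml-training        # node group of m5.8xlarge, min size 0
  tolerations:
    - key: "workload"
      operator: "Equal"
      value: "ml-training"
      effect: "NoSchedule"       # keeps ordinary workers off these nodes
  containers:
    - name: base
      resources:
        requests:
          cpu: "24"              # forces a node big enough for training
          memory: "64Gi"
```

Because the node group's minimum size is 0, the autoscaler only provisions an instance when this pod is pending, and removes it once the task finishes and the node goes idle; DAGs without this override keep using the default worker template, so their values.yaml settings don't need to change.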