Add GPU support to run only DAG's #722
Hello everybody. I am new to Airflow and I have many machines with labels such as CPU-only, GPU-only, and so on. I would like to deploy Airflow on the CPU machines, but when the DAGs run I need them to run on a GPU machine. Is there a solution for this? I can add an NVIDIA label in the resources spec, but I don't know whether I need to add it to the scheduler or the worker. In my case I only need a GPU when running a DAG; the rest of the components I would like to deploy on the CPU machines. Thank you
Replies: 1 comment 1 reply
@antikilahdjs I actually have a solution for this coming up: the task-aware auto-scaler feature, which will automatically scale the Celery workers up and down (with some logic to avoid scaling down workers that are actively running tasks, unless you label the task as "safe to interrupt").

We will support multiple "queues" of Celery workers. For example, you might have a "default" queue with CPUs only and a "gpu" queue with GPUs. The auto-scaler will then scale up the "gpu" queue only when tasks are waiting in that queue, and scale it down when it is no longer needed.

Before the new auto-scaler is finished, you can achieve GPU support in a less elegant way by either:
This is a lot of information. If you are doing this for a company, I do offer consulting services if you're interested!
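
To make the scaling rule described above more concrete, here is a minimal sketch in plain Python. It is not the auto-scaler's actual implementation, which is not shown in this thread. `Task`, `Worker`, `plan_scaling`, and the `max_workers` cap are all hypothetical names chosen for illustration, and the "one extra worker per waiting task" rule is a simplifying assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """A running task (hypothetical model, not an Airflow class)."""
    name: str
    safe_to_interrupt: bool = False


@dataclass
class Worker:
    """One Celery worker in a queue (hypothetical model)."""
    name: str
    running: list[Task] = field(default_factory=list)

    def can_be_removed(self) -> bool:
        # An idle worker passes vacuously; a busy worker is removable
        # only if every task it runs is labelled safe to interrupt.
        return all(t.safe_to_interrupt for t in self.running)


def plan_scaling(workers: list[Worker], waiting_tasks: int, max_workers: int) -> int:
    """Return the target worker count for a single queue."""
    # Workers running at least one task that must not be interrupted.
    protected = sum(1 for w in workers if not w.can_be_removed())
    if waiting_tasks > 0:
        # Assumption: add one worker per waiting task, up to the cap,
        # but never plan below the number of protected workers.
        target = min(max_workers, len(workers) + waiting_tasks)
        return max(target, protected)
    # Nothing is waiting: keep only the workers that cannot be removed.
    return protected


gpu_workers = [
    Worker("gpu-0", [Task("train_model")]),
    Worker("gpu-1", [Task("warm_cache", safe_to_interrupt=True)]),
    Worker("gpu-2"),
]
print(plan_scaling(gpu_workers, waiting_tasks=0, max_workers=4))
print(plan_scaling([], waiting_tasks=3, max_workers=2))
```

In this sketch an empty "gpu" queue with no running tasks plans for zero workers, so GPU machines would only be requested while GPU tasks are waiting or running. On the DAG side, Airflow operators accept a `queue` argument under the Celery executor, which is how a task would be sent to a queue named "gpu".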