
Running executor pods with different nodeselectors #2329

Open
ujjawal-khare-27 opened this issue Nov 21, 2024 · 8 comments

Comments

@ujjawal-khare-27

What question do you want to ask?

  • [x] ✋ I have searched the open/closed issues and my issue is not listed.

I have a requirement where I want to give users the flexibility to choose the number of spot and on-demand executors. Is there any way I can achieve this?

Additional context

No response


@jacobsalway
Member

jacobsalway commented Nov 21, 2024

Hey, do you mean mixing executors between spot and on-demand nodes? For example, 40% on spot and 60% on on-demand?

@ujjawal-khare-27
Author

Yes @jacobsalway.

@jacobsalway
Member

jacobsalway commented Nov 23, 2024

The properties for executors in Spark on Kubernetes apply to all executors, so the answer to your question about different node selectors for different executors is that you can't. However, I think this could be done at the node provisioning and/or scheduling level. Here are some approaches that come to mind:

  • If on AWS, you could use a node group with the desired mix of spot and on-demand capacity.
  • If using Karpenter, you could try this guide to launch a mix of spot and on-demand nodes and force the scheduler to distribute the executor pods using topology spread constraints. However, neither Spark nor the operator supports topology spread constraints right now, so we'd need to add new functionality to support this.
  • Don't use the operator at all and run Spark in standalone mode with separate StatefulSets targeting different node selectors.
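As a sketch of the Karpenter approach, and assuming Karpenter's `karpenter.sh/capacity-type` node label, a NodePool could allow both capacity types while a topology spread constraint on the executor pod template spreads pods across them. The names, label values, and the pod-template injection point are all assumptions; the operator would need new functionality to pass the constraint through:

```yaml
# Hypothetical sketch: a Karpenter NodePool that may provision either
# spot or on-demand capacity. Resource names here are assumptions.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spark-executors
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
---
# Hypothetical executor pod-template fragment: spread executor pods
# evenly across the two capacity types (maxSkew: 1 gives roughly 50/50,
# not an arbitrary ratio like 40/60).
spec:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: karpenter.sh/capacity-type
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          spark-role: executor
```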

@ujjawal-khare-27
Author

Thanks for replying @jacobsalway. We can do that at the scheduler level, but our use case is more like creating Spark as a service, in which users can specify these properties.

I'm willing to submit a PR for this; in my opinion it will help others as well. Let me know your thoughts.

@jacobsalway
Member

Could you go into more detail on how this feature would look? Is it something akin to EMR instance fleets?

@ujjawal-khare-27
Author

I was thinking more in the direction of making the executor field an array type rather than a single executor. That would help extend other functionality as well.
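A purely hypothetical sketch of what such a spec change might look like. The current CRD has a single `executor` object; an `executorGroups` array like the one below does not exist today, and the field and label names are assumptions:

```yaml
# Hypothetical, NOT a real SparkApplication field: a sketch of an
# executor-groups array with per-group instance counts and node selectors.
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: mixed-capacity-app
spec:
  executorGroups:              # hypothetical field; today only `executor` exists
    - instances: 4
      nodeSelector:
        karpenter.sh/capacity-type: spot
    - instances: 6
      nodeSelector:
        karpenter.sh/capacity-type: on-demand
```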

@jacobsalway
Member

jacobsalway commented Nov 23, 2024

Obviously we welcome all PRs and will happily review them, but I think you might find some difficulty in trying to implement this. Spark on Kubernetes doesn't support any concept of executor groups/fleets, so even if the SparkApplication spec supported this, I'm not sure how you'd construct the spark-submit arguments. I think this would require significant changes to the Kubernetes backend in Spark core.
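For context, Spark's Kubernetes backend exposes executor node selection as a single set of confs of the form `spark.kubernetes.executor.node.selector.[labelKey]` (available in recent Spark versions), with one value per label key. A sketch like the following (master URL, label key, and jar path are placeholder assumptions) therefore pins every executor to one capacity type rather than a mix:

```shell
# All executors share the same node selector; there is no per-group conf.
# The label key/value and cluster details below are assumptions.
spark-submit \
  --master k8s://https://kubernetes.example.com:6443 \
  --deploy-mode cluster \
  --conf spark.executor.instances=10 \
  --conf spark.kubernetes.executor.node.selector.karpenter.sh/capacity-type=spot \
  local:///opt/spark/examples/jars/spark-examples.jar
```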

@ujjawal-khare-27
Author

Will check and get back to you on this, @jacobsalway.
