Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve documentation (and ASCII art) about streaming execution, and thread pools #13423

Merged
merged 9 commits into from
Nov 18, 2024

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Nov 14, 2024

Which issue does this PR close?

Rationale for this change

In order to understand the need for a second runtime, I needed to write up the background material so it made sense.

Also, this is starting to come up with dft:

What changes are included in this PR?

This PR adds background documentation on how DataFusion runs plans and why a separate Runtime may be needed to keep the network busy

Note I am also working on an example to show how to actually use a second runtime which I will link to these docs when it is ready

Also, I found myself on a ✈️ without WIFI so I also made a bunch of ASCII art while I was at it)

Are these changes tested?

By CI

Are there any user-facing changes?

Nice documentation hopefully!

@alamb alamb added the documentation Improvements or additions to documentation label Nov 14, 2024
@github-actions github-actions bot added core Core DataFusion crate and removed documentation Improvements or additions to documentation labels Nov 14, 2024
@alamb alamb changed the title Improve documentation (and ASCII art) about streaming and threadpools Improve documentation (and ASCII art) about streaming execution, and thread pools Nov 14, 2024
Copy link
Member

@jonahgao jonahgao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, the ASCII art is really nice. I left some suggestions for minor typos.

@2010YOUY01
Copy link
Contributor

The explanation for the problem is super clear

One thing I don't understand is: For Tokio's convention, all tasks created by spawn are expected to be IO-bounded, and those CPU-bound task should be created using spawn_blocking(). I think the implementation also uses two separate thread pool, which is very similar to the approach in this doc.
Why can't we all use spawn_blocking() for all CPU-bounded task, and instead we have to use two runtimes explicitly 🤔
If we follow this two runtime approach, would existing spawn_blocking() in the code (e.g.

builder.spawn_blocking(move || read_spill(sender, path.path()));
) cause unexpected behavior (3 thread pools co-exist)

@tustvold
Copy link
Contributor

tustvold commented Nov 16, 2024

I took a quick look at this and content looks good, couldn't check diagrams as on phone.

Why can't we all use spawn_blocking() for all CPU-bounded task, and instead we have to use two runtimes explicitly

We could, however, to preserve the thread per core architecture we would need to cap the threads of the blocking pool to the core count. Then as blocking tasks can't yield waiting for input, every CPU bound morsel would have to be spawned separately. This would give us a "morsel-driven" scheduler, however, tokio has a relatively high per task overhead and so even discounting the sheer amount of boilerplate this would require, the performance would be regrettable. If going down this path you might as well switch to an actual morsel driven scheduler (although this is at this stage likely intractable).

Ultimately spawn_blocking is designed for blocking IO, it is not designed for CPU bound tasks.

Copy link
Member

@jonahgao jonahgao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM👍, thanks @alamb

@alamb
Copy link
Contributor Author

alamb commented Nov 17, 2024

Why can't we all use spawn_blocking() for all CPU-bounded task, and instead we have to use two runtimes explicitly 🤔

Thank you @2010YOUY01 for the question and @tustvold for the answer -- this question also has come up in the past so I added a summary in ad16179

Comment on lines +473 to +474
//! applications as explained in the [Using Rustlang’s Async Tokio
//! Runtime for CPU-Bound Tasks] blog.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes -- I think it does (line 584)

Comment on lines +478 to +479
//! During execution, DataFusion creates this many distinct `async` [`Stream`]s and
//! this many distinct [Tokio] [`task`]s, which drive the `Stream`s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was aware that target_partitions and number of Streams matched the number of CPU cores but I was not aware that we cap the number of Tokio tasks as well. Can you expand on that point / are we getting all the benefits of the tokio runtime (work stealing, etc) by keeping that number (relatively) low? for context the local task queue size per thread is 256 tasks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is somewhat nuanced -- what happens is that certain operations (like RepartionExec spawn tasks to read from the inputs. As written I now see this sentence is somewhat misleading. I will try and clarify in a follow on PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried to clarify in #13474

@alamb
Copy link
Contributor Author

alamb commented Nov 18, 2024

Thank you everyone for the comments -- I am going to merge this one in and make some adjustments as a follow on PR to avoid keeping it open for too long. Long live comments!

@alamb alamb merged commit 498bcb9 into apache:main Nov 18, 2024
25 checks passed
@alamb alamb deleted the alamb/doc_threadpools branch November 18, 2024 20:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants