Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ADAG] Support multi-args and kwargs in Accelerate DAG #45545

Merged
merged 7 commits into from
May 29, 2024

Conversation

ruisearch42
Copy link
Contributor

@ruisearch42 ruisearch42 commented May 24, 2024

Why are these changes needed?

Currently accelerated DAG only supports a single argument per task. We would like to add general support for multi-args and kwargs.

This PR adds multi-arg and kwarg support by serializing all positional args and kwargs and passing it through the channel. When the channel is read at runtime, the individual args are extracted first before passing to the consuming tasks.

Related issue number

Closes #42793

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@ruisearch42 ruisearch42 changed the title [ADAG] Support args and kwargs in Accelerate DAG [ADAG] Support multi-args and kwargs in Accelerate DAG May 24, 2024
Copy link
Contributor

@stephanie-wang stephanie-wang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! The logic looks good to me; I left some suggestions for cleanup.

python/ray/dag/compiled_dag_node.py Outdated Show resolved Hide resolved
python/ray/dag/compiled_dag_node.py Outdated Show resolved Hide resolved
python/ray/dag/compiled_dag_node.py Outdated Show resolved Hide resolved
python/ray/dag/tests/experimental/test_accelerated_dag.py Outdated Show resolved Hide resolved
@stephanie-wang stephanie-wang added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label May 24, 2024
@stephanie-wang stephanie-wang self-assigned this May 24, 2024
Signed-off-by: Rui Qiao <[email protected]>
Signed-off-by: Rui Qiao <[email protected]>
@ruisearch42 ruisearch42 added the go add ONLY when ready to merge, run all tests label May 26, 2024
ruisearch42 and others added 2 commits May 27, 2024 07:22
@ruisearch42 ruisearch42 marked this pull request as ready for review May 27, 2024 21:31
@ruisearch42 ruisearch42 removed the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label May 27, 2024
python/ray/dag/compiled_dag_node.py Outdated Show resolved Hide resolved
python/ray/dag/compiled_dag_node.py Show resolved Hide resolved
Signed-off-by: Rui Qiao <[email protected]>
@stephanie-wang stephanie-wang merged commit 8752932 into ray-project:master May 29, 2024
6 checks passed
ryanaoleary pushed a commit to ryanaoleary/ray that referenced this pull request Jun 6, 2024
…5545)

This PR adds multi-arg and kwarg support by serializing all positional
args and kwargs and passing it through the channel. When the channel is
read at runtime, the individual args are extracted first before passing
to the consuming tasks.

Closes ray-project#42793
---------

Signed-off-by: Rui Qiao <[email protected]>
Signed-off-by: Rui Qiao <[email protected]>
Signed-off-by: Ryan O'Leary <[email protected]>
ryanaoleary pushed a commit to ryanaoleary/ray that referenced this pull request Jun 6, 2024
…5545)

This PR adds multi-arg and kwarg support by serializing all positional
args and kwargs and passing it through the channel. When the channel is
read at runtime, the individual args are extracted first before passing
to the consuming tasks.

Closes ray-project#42793
---------

Signed-off-by: Rui Qiao <[email protected]>
Signed-off-by: Rui Qiao <[email protected]>
Signed-off-by: Ryan O'Leary <[email protected]>
ryanaoleary pushed a commit to ryanaoleary/ray that referenced this pull request Jun 7, 2024
…5545)


This PR adds multi-arg and kwarg support by serializing all positional
args and kwargs and passing it through the channel. When the channel is
read at runtime, the individual args are extracted first before passing
to the consuming tasks.

Closes ray-project#42793
---------

Signed-off-by: Rui Qiao <[email protected]>
Signed-off-by: Rui Qiao <[email protected]>
richardsliu pushed a commit to richardsliu/ray that referenced this pull request Jun 12, 2024
…5545)

This PR adds multi-arg and kwarg support by serializing all positional
args and kwargs and passing it through the channel. When the channel is
read at runtime, the individual args are extracted first before passing
to the consuming tasks.

Closes ray-project#42793
---------

Signed-off-by: Rui Qiao <[email protected]>
Signed-off-by: Rui Qiao <[email protected]>
Signed-off-by: Richard Liu <[email protected]>
stephanie-wang added a commit that referenced this pull request Jun 27, 2024
A performance regression was noticed after #45545 was merged, as the
node input was changed from a single bytes object to a tuple of lists
(to support multiple arguments).

```
Before:
[unstable] compiled single-actor DAG calls per second 35571.54 +- 416.15
After Regression:
[unstable] compiled single-actor DAG calls per second 11037.4 +- 59.79
```

The default serialization engine in Ray does not need to serialize a
bytes object, but it does have to serialize other Python types,
including tuples of lists. Thus, the serialization engine needed to
perform more work with tuples of lists.

The default serialization engine took ~30 µs to serialize the tuple of
lists, which seemed expensive.

Passing a single bytes object is a common case, so this PR creates a
fast path for this case while passing everything else (multiple
arguments, nested Python objects, etc.) to the Ray serializer.

```
Disable Multi-Args (New Baseline):
[unstable] compiled single-actor DAG calls per second 28858.82 +- 270.54
Multi-Args with Fix:
[unstable] compiled single-actor DAG calls per second 27812.58 +- 536.67
```

The PR that introduced the regression was merged nearly a month ago, so
the "Before" case above is a bit stale. Thus, I disabled multi-args and
use that as the new baseline to which I compare the performance results
of this current PR.

---------

Signed-off-by: Jack Humphries <[email protected]>
Signed-off-by: Stephanie Wang <[email protected]>
Co-authored-by: Stephanie Wang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
go add ONLY when ready to merge, run all tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[core][dags] Support multiple args and kwargs in accelerated DAGs
2 participants