Add mechanism to split project tests into parallel jobs. #1696
Split Thrust into TestCPU and TestGPU. Split CUB into TestGPU, HostLaunch, DeviceLaunch, and GraphCapture. Also adds an exclusion matrix and various other workflow features to support job splitting.
🟩 CI Results: Pass: 100%/341 | Total Time: 3d 20h | Avg Time: 16m 20s | Hits: 85%/465152
Project | |
---|---|
+/- | CCCL Infrastructure |
 | libcu++ |
+/- | CUB |
+/- | Thrust |
 | CUDA Experimental |
Modifications in project or dependencies?
Project | |
---|---|
+/- | CCCL Infrastructure |
+/- | libcu++ |
+/- | CUB |
+/- | Thrust |
+/- | CUDA Experimental |
🏃 Runner counts (total jobs: 341)
# | Runner |
---|---|
240 | linux-amd64-cpu16 |
56 | linux-amd64-gpu-v100-latest-1 |
24 | linux-arm64-cpu16 |
21 | windows-amd64-cpu16 |
- 'test_nolid'
- 'test_lid0'
- 'test_lid1'
- 'test_lid2'
suggestion: We should give these more meaningful names. Even when I already know roughly what these mean, I still have no idea what they mean :)
This will be improved by #1758 by including the friendly name with the rest of the job definition.
That last push restores the [...]. It also fixes up some path issues that broke the use case of calling [...].
🟩 CI Results: Pass: 100%/341 | Total Time: 2d 19h | Avg Time: 11m 53s | Hits: 98%/465152
Project | |
---|---|
+/- | CCCL Infrastructure |
 | libcu++ |
+/- | CUB |
+/- | Thrust |
 | CUDA Experimental |
Modifications in project or dependencies?
Project | |
---|---|
+/- | CCCL Infrastructure |
+/- | libcu++ |
+/- | CUB |
+/- | Thrust |
+/- | CUDA Experimental |
🏃 Runner counts (total jobs: 341)
# | Runner |
---|---|
240 | linux-amd64-cpu16 |
56 | linux-amd64-gpu-v100-latest-1 |
24 | linux-arm64-cpu16 |
21 | windows-amd64-cpu16 |
Description
Adds a mechanism for a project's tests to be split into multiple jobs. The jobs will execute in parallel and may have different requirements.
Thrust's coverage has been extended to include CPU backend tests. These share a build step with the GPU tests and do not require a GPU runner.
CUB's tests have been split into 4 jobs:
- `gpu`: no explicit `lid_[0-2]` in test name
- `lid0`: HostLaunch tests
- `lid1`: DeviceLaunch tests
- `lid2`: StreamCapture tests

These run in parallel.
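The name-based split described above can be sketched roughly as follows. This is only an illustration of the bucketing rule, not the workflow code from this PR: the function names and the sample test names are hypothetical, and the sketch assumes the `lid_[0-2]` marker appears literally somewhere in each test name.

```python
import re
from collections import defaultdict

# Assumption: a test opts into a launch-id job by carrying "lid_0", "lid_1",
# or "lid_2" in its name; anything else falls through to the "gpu" job.
LID_PATTERN = re.compile(r"lid_([0-2])")


def job_for_test(test_name):
    """Return the job bucket ("gpu", "lid0", "lid1", "lid2") for a test name."""
    match = LID_PATTERN.search(test_name)
    if match is None:
        return "gpu"
    return "lid" + match.group(1)


def split_tests(test_names):
    """Group test names by job bucket, keeping the input order in each bucket."""
    jobs = defaultdict(list)
    for name in test_names:
        jobs[job_for_test(name)].append(name)
    return dict(jobs)


if __name__ == "__main__":
    # Hypothetical test names, made up for illustration only.
    sample = [
        "cub.test.block_scan",
        "cub.test.device_reduce.lid_0",
        "cub.test.device_reduce.lid_1",
        "cub.test.device_reduce.lid_2",
    ]
    for job, tests in split_tests(sample).items():
        print(job, tests)
```

Each bucket can then be handed to its own job, so the four groups run concurrently instead of in one long serial test step.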
Refs #1619.
Before / after for CUB test times:
There is a slight increase of 5-10 minutes overall due to the overhead of launching the runner and fetching the build artifacts. However, the jobs can now launch in parallel. That 1h20m job was often the long tail that held up the results, but now the jobs stay under 30 minutes.
Further possible improvements: