GH actions long delay between finishing build job and starting success job #1376
Comments
You are correct in how you describe the behavior. We are probably also throttled a bit since we have such an intense job run. A runner is allocated for every build in the matrix. Maybe explicitly selecting a different runner class for the success job would get it allocated more quickly.
Yeah, presumably these runners are all counted against the Jazzband org. Can we try this without having to bug @jezdez?
The reason we added the success job to the build process was so we wouldn't need @jezdez to intercede to change the success criteria of our builds, since we don't have settings access. We should be able to select the machine class by changing runs-on for the success job. Maybe we can get away without specifying it? I'm not sure what the default is...
I think this is something we could maybe open with GitHub support?
I assume we're waiting on the backlog of Jazzband jobs and it's being slowed down by the concurrent job limit, https://docs.github.com/en/actions/learn-github-actions/usage-limits-billing-and-administration
Another option may be to go ahead and reduce our matrix by dropping Django 4.0 and Django 4.1, since they're no longer supported upstream. That should reduce our matrix by 10 jobs. Success still won't be enqueued until they're complete...
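To make the "10 jobs" arithmetic concrete, here is a minimal sketch of how dropping two values from one matrix axis shrinks a full cross-product matrix. The axis names and values are placeholders chosen only so the numbers line up with the comment above. The project's real workflow matrix is not shown in this thread and may use include/exclude rules that this sketch ignores.

```python
from math import prod


def matrix_size(axes):
    """Count jobs in a full cross-product matrix (no include/exclude rules)."""
    return prod(len(values) for values in axes.values())


# Hypothetical axes, NOT the project's real workflow: four Django versions
# crossed with five placeholder environments.
axes = {
    "django": ["4.0", "4.1", "4.2", "5.0"],
    "env": ["env-a", "env-b", "env-c", "env-d", "env-e"],
}

# Drop the two Django versions that are no longer supported upstream.
trimmed = {**axes, "django": [v for v in axes["django"] if v not in {"4.0", "4.1"}]}

removed = matrix_size(axes) - matrix_size(trimmed)
print(f"{matrix_size(axes)} jobs -> {matrix_size(trimmed)} jobs ({removed} fewer)")
```

Fewer matrix jobs means fewer runners requested at once, but as noted above, the success job still cannot be queued until every remaining build job has finished.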
Describe the bug
In watching multiple PRs after I've approved them, it appears to take a long time for the `success` job to start after the last step of the `build` job has finished. See #1219, where the separate success job was added to make it easier to update the matrix and only ever depend on build to finish for tests to succeed.
To Reproduce
Cause a PR to run tests.
Expected behavior
I didn't expect anything but was hoping that the wait for the success step wouldn't happen.
Version
current master branch
Additional context
@dopry I'm guessing that GH is allocating a runner(s) for each job, so after the build job finishes, we wait for another runner to become available for the success job. This takes a while. See below with timestamps selected. So I am guessing that running a second job that depends on the first has to wait for a new runner to become available. Sometimes correlation is indicative of causation.
Mon, 18 Dec 2023 17:59:18 GMT: last matrix step of build job finished
Mon, 18 Dec 2023 18:31:45 GMT: success job starts
While watching the PR, the success job status is waiting on a runner. Here's some raw log showing the 30 minute wait for a runner:
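As a cross-check on the two timestamps quoted above (this is not the raw log itself), a short sketch can compute the gap between the end of the build job and the start of the success job. The variable names are illustrative. The only inputs are the two timestamps from this report.

```python
from email.utils import parsedate_to_datetime

# Timestamps copied from the report above (RFC 2822 style, GMT).
build_finished = parsedate_to_datetime("Mon, 18 Dec 2023 17:59:18 GMT")
success_started = parsedate_to_datetime("Mon, 18 Dec 2023 18:31:45 GMT")

# Time the success job spent waiting before it started.
wait = success_started - build_finished
minutes, seconds = divmod(int(wait.total_seconds()), 60)
print(f"success job started {minutes} min {seconds} s after the build job finished")
```

The result is a little over half an hour, which matches the roughly 30 minute wait described above.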