fix: retry rust build on ubuntu #375

Merged: mikehardy merged 1 commit into main from retry_rust_build on May 14, 2024

mikehardy (Member) commented May 13, 2024

it appears to converge over multiple runs to a working build

this is based on a report of convergent build behavior by Brayan, we'll see how it goes

Fixes:

mikehardy (Member, Author) commented May 13, 2024

it's the javascript / node build process that seems to be the problem

I think yarn 2+ became the default in the ecosystem and on the runners quite recently. I wonder if that is the cause?

I put 5 retries in the build system here, and it passed on the 4th attempt 🤷
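
For readers following along, the retry idea described above can be sketched roughly as follows. This is a minimal illustration in Python, not the change made in this PR: the helper name `run_with_retries` and its `max_attempts` parameter are hypothetical, and the real build may implement retries differently (for example in a shell script or in the workflow config).

```python
import subprocess
import sys
from typing import Sequence


def run_with_retries(cmd: Sequence[str], max_attempts: int = 5) -> int:
    """Run cmd until it exits with 0 or max_attempts is exhausted.

    Returns the 1-based number of the attempt that succeeded.
    Raises RuntimeError if every attempt fails.
    (Hypothetical helper, for illustration only.)
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        print(f"build attempt {attempt}/{max_attempts}: {' '.join(cmd)}", flush=True)
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return attempt
        print(
            f"attempt {attempt} failed with exit code {result.returncode}",
            file=sys.stderr,
        )
    raise RuntimeError(f"command failed after {max_attempts} attempts")
```

A retry like this only helps when the build makes progress between attempts, which matches the "converges over multiple runs" behavior reported here. It hides the underlying failure rather than fixing it.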

@mikehardy mikehardy requested a review from david-allison May 13, 2024 21:53
mikehardy (Member, Author) commented

  • would be nice to see the build log showing the exact thing that failed
  • only fails on ubuntu though, and we release from macOS so...low risk?

Is this "good enough" for now?

mikehardy (Member, Author) commented May 13, 2024

(re-running it to see if it is stable. With a sample size of n=2, it passed on the 4th build attempt again, so the behavior looks stable / identical each time, at least at n=2)

@mikehardy mikehardy mentioned this pull request May 13, 2024
david-allison (Member) commented May 13, 2024

I believe I provided diffs of logs before/after on the issue

I don't believe this was a runner update. They appeared to be the same version when I did a brief diff in Beyond Compare

david-allison (Member) left a comment

Kinda nasty, as we haven't diagnosed the root cause. Approving with a request to add a comment explaining why the retry is there

mikehardy (Member, Author) commented May 14, 2024

yeah, ubuntu n+1 is rolling out now, but you're right: I can verify there was no runner update

I got an emulated Ubuntu VM running locally ("UTM" is a reasonably friendly QEMU wrapper for macOS, FWIW) and I can reproduce it there as well (it took 4 failures and built on the 5th attempt)

but honestly, the build system is so opaque about what exactly it is running with regard to commands and which command exactly failed that it's irritating to work with.

I wish there was a "CI" / "dumb terminal" mode where it was verbose and not trying to be all ASCII-art progress-bar pretty so it just dumped what it was doing

So what I'll do is this, to lower risk, and provide bread crumbs for future intrepid explorers:

  • make the retry count > 1 only on ubuntu. Why? We do release builds on macOS and I don't want hacky things on our release-generating OS
  • add a comment mentioning that the failure / need for retries is reproducible, but has no root cause analysis yet and needs one
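
The plan in the two bullets above could look roughly like this. It is a sketch under stated assumptions, not the code merged here: `build_attempts` and both constants are hypothetical names, and treating the value "Linux" as "the Ubuntu runner" is an assumption. `RUNNER_OS` is the variable GitHub Actions sets on its runners; outside of Actions the sketch falls back to `platform.system()`.

```python
import os
import platform
from typing import Optional

# Hypothetical constants mirroring the numbers discussed in this thread.
UBUNTU_BUILD_ATTEMPTS = 5
DEFAULT_BUILD_ATTEMPTS = 1


def build_attempts(runner_os: Optional[str] = None) -> int:
    """Return how many times the build should be attempted on this OS."""
    if runner_os is None:
        # RUNNER_OS is set by GitHub Actions; fall back to the local platform name.
        runner_os = os.environ.get("RUNNER_OS") or platform.system()
    # The build failure on Ubuntu is reproducible (CI and a local VM) and
    # converges to success after several attempts, but it has no root cause
    # analysis yet and needs one. Release builds happen on macOS, so only
    # the Linux runner gets more than one attempt.
    if runner_os == "Linux":  # assumption: "Linux" here means the Ubuntu runner
        return UBUNTU_BUILD_ATTEMPTS
    return DEFAULT_BUILD_ATTEMPTS
```

Keeping the default at 1 means macOS and Windows still fail fast, so a genuine regression on the release-generating OS is not masked by retries.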

mikehardy (Member, Author) commented

That worked: windows and macOS got 1 attempt configured, ubuntu got 5

macos https://github.com/ankidroid/Anki-Android-Backend/actions/runs/9072966388/job/24929313046?pr=375#step:13:4
windows https://github.com/ankidroid/Anki-Android-Backend/actions/runs/9072966388/job/24929312929?pr=375#step:13:4
ubuntu https://github.com/ankidroid/Anki-Android-Backend/actions/runs/9072966388/job/24929312726?pr=375#step:13:4

and ubuntu succeeded on the 3rd attempt this time 🤷

This unblocks PRs and is the minimum hack I can think of

@mikehardy mikehardy merged commit abfad82 into main May 14, 2024
5 of 6 checks passed
@mikehardy mikehardy deleted the retry_rust_build branch May 14, 2024 02:51
dae (Contributor) commented May 14, 2024

> but honestly, the build system is so opaque

It's telling you which collection of build steps is failing already. Stdout/err is echoed on failure, so the fact it's not appearing here implies the command being run is not printing anything. I've told David how he can dig further: #373 (comment)
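
The "echo on failure" behavior described above can be illustrated with a small sketch. This is not the build system's actual code: `run_echo_on_failure` is a hypothetical helper showing the general pattern, in which output is captured and only printed when the command fails, so a failing command that prints nothing leaves nothing to show.

```python
import subprocess
import sys
from typing import Sequence


def run_echo_on_failure(cmd: Sequence[str]) -> subprocess.CompletedProcess:
    """Run cmd quietly; replay its captured stdout/stderr only if it fails.

    (Hypothetical helper, for illustration only.)
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        print(f"FAILED (exit {result.returncode}): {' '.join(cmd)}", file=sys.stderr)
        sys.stderr.write(result.stdout)
        sys.stderr.write(result.stderr)
        if not (result.stdout or result.stderr):
            # Nothing was captured, so there is nothing further to echo.
            print("(command produced no output)", file=sys.stderr)
    return result
```

With this pattern, a silent failure points at the command itself being quiet, and digging further means re-running that command with its own verbose options.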

Development

Successfully merging this pull request may close these issues.

CI Failure (ubuntu)