Weird nonlinear convergence in parallel #284
Comments
I'll assign @aabrown100-git since he was interested in looking into this (thank you!!). I'm happy to have a conversation here and figure out what may be causing this!
My working theory was some kind of undefined behavior due to a tangent contribution being uninitialized. However, I looked through the … I'm not sure it's a parallel bug, because I saw poor convergence with one proc in the 3rd time step.
Description
While upgrading macOS (#279), we discovered that the Newtonian fluid test failed on three procs but passed on one and four. These are the test outputs:
One and four procs reach the nonlinear tolerance of 1e-11 within the 4 Newton iterations specified when setting up the test (as documented in "Create a new test"). For three procs, it looks like the first two Newton iterations, 1-1 and 1-2, are identical to one and four procs. However, the three-procs case exhibits much worse nonlinear convergence for steps 1-3 and 1-4. The linear convergence is identical for the first Newton iteration, where the initial solution is identical.
Reproduction
It looks like in many test executions, the result with Ri/R1 ~ 1e-6 was still good enough to pass until this macos-latest pipeline came along. For example, this pipeline (and potentially others) has the convergence problem but still passed the integration test.
Expected behavior
Convergence behavior should be identical on one, three, and four procs, except for the minor differences visible between one and four.
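One way to make "identical except for minor differences" checkable is to compare the relative-residual histories (Ri/R1 per Newton iteration) in orders of magnitude. The sketch below is only an illustration: the function names, the one-order-of-magnitude threshold, and the residual values are all assumptions made up for this example, not numbers from the pipeline logs.

```python
import math

def max_log10_gap(reference, candidate):
    """Largest per-iteration gap, in orders of magnitude, between two
    relative-residual histories (Ri/R1 for each Newton iteration)."""
    if len(reference) != len(candidate):
        raise ValueError("histories must cover the same Newton iterations")
    return max(abs(math.log10(c) - math.log10(r))
               for r, c in zip(reference, candidate))

def inconsistent_runs(histories, reference_procs=1, max_gap=1.0):
    """Proc counts whose history deviates from the reference run by more
    than `max_gap` orders of magnitude in any Newton iteration."""
    ref = histories[reference_procs]
    return sorted(procs for procs, hist in histories.items()
                  if max_log10_gap(ref, hist) > max_gap)

# Illustrative values only, NOT taken from the pipeline logs.
histories = {
    1: [1.0, 1e-3, 1e-7, 1e-12],
    3: [1.0, 1e-3, 1e-5, 1e-6],
    4: [1.0, 1e-3, 2e-7, 3e-12],
}

print(inconsistent_runs(histories))
```

With these made-up histories, only the three-proc run exceeds the threshold; the small differences between one and four procs stay well inside it.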
Additional context
I suspect this could be a parallel bug in the linearization (since the result still passes the integration test with bad convergence). It shouldn't be the linear solver since the first Newton iteration (where the solution is still identical) has identical linear performance.
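To illustrate why a linearization error fits these symptoms, here is a toy scalar Newton solve, not the solver's actual code. The residual, the starting point, and the factor 1.5 applied to the tangent are arbitrary assumptions standing in for a wrong tangent contribution. An inconsistent tangent still drives the residual to zero, so the converged solution (and hence a result-based test) is unaffected, but the convergence drops from quadratic to linear.

```python
def newton(residual, tangent, u0, rel_tol=1e-11, max_iter=100):
    """Newton iteration; returns (solution, iterations needed to reach
    |r_i| / |r_0| <= rel_tol)."""
    u = u0
    r0 = abs(residual(u))
    for iteration in range(1, max_iter + 1):
        u -= residual(u) / tangent(u)
        if abs(residual(u)) <= rel_tol * r0:
            return u, iteration
    raise RuntimeError("Newton did not converge")

def residual(u):
    # Toy nonlinear residual (assumption for illustration).
    return u**3 + 2.0 * u - 2.0

def exact_tangent(u):
    # Consistent linearization of the toy residual.
    return 3.0 * u**2 + 2.0

def wrong_tangent(u):
    # Deliberately inconsistent tangent: scaled by an arbitrary factor.
    return 1.5 * exact_tangent(u)

u_exact, n_exact = newton(residual, exact_tangent, u0=0.0)
u_wrong, n_wrong = newton(residual, wrong_tangent, u0=0.0)

print(f"exact tangent: {n_exact} iterations, u = {u_exact:.10f}")
print(f"wrong tangent: {n_wrong} iterations, u = {u_wrong:.10f}")
```

Both runs end at the same root, but the run with the wrong tangent needs several times as many iterations, which would look like a stalled residual if the iteration count were capped at 4.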
A first debugging step could be looking through recent pipelines and seeing if this problem appears consistently. Then, depending on what past pipelines show, it would make sense to isolate it by physics type and other properties.
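The triage described above could be sketched as follows, assuming the final Ri/R1 of each run has already been extracted from the pipeline logs by hand or by some other script. The record fields, the category names, and all values are hypothetical placeholders, not real pipeline results.

```python
from collections import Counter
from typing import NamedTuple

class Run(NamedTuple):
    pipeline: str       # hypothetical pipeline label
    physics: str        # hypothetical category name, e.g. "fluid"
    procs: int          # number of procs used for the run
    final_ratio: float  # Ri/R1 at the last Newton iteration

def count_stalled(runs, tol=1e-11):
    """Count runs whose final Ri/R1 missed `tol`, grouped by (physics, procs)."""
    return Counter((r.physics, r.procs) for r in runs if r.final_ratio > tol)

# Placeholder records for illustration; not real pipeline results.
runs = [
    Run("A", "fluid", 1, 5e-12),
    Run("A", "fluid", 3, 1e-6),
    Run("B", "fluid", 3, 2e-6),
    Run("B", "fluid", 4, 4e-12),
    Run("B", "struct", 3, 1e-12),
]

stalled = count_stalled(runs)
for (physics, procs), count in sorted(stalled.items()):
    print(f"{physics} on {procs} procs: {count} stalled run(s)")
```

Grouping by (physics, procs) makes it easy to see whether the problem is tied to one physics type, one proc count, or both.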