Skip update part in pdtrord for current WINDOW if NWIN = 0 #72
Is this a proper solution? Isn't the real issue that the parallelization triggers NWIN=0 at all?
Thanks for asking. The answer has two parts. A. This is a proper solution for the test cases. Some other reasons for this fix:
      IF( FLOPS.NE.0 ) THEN
         IF( ( FLOPS*100 ) / ( 2*NWIN*NWIN ) .GE. MMULT ) THEN
            [...]
         ELSE
            [...]
         END IF
      END IF
But this solution, in my opinion, would also be a problem if for some reason
B. This is not a proper solution for the possible communication issue we have. I will give a short report of what I obtained.
We are still investigating this problem.
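For readers following the thread, the guard under discussion can be sketched outside the Fortran source. The Python below is only an illustration of the control flow in the fragment quoted above, with the NWIN = 0 check from the PR title placed in front of it. The function name `choose_update_path` and the returned labels are hypothetical, and the contents of the two elided `[...]` branches are not reproduced.

```python
def choose_update_path(flops: int, nwin: int, mmult: int) -> str:
    """Mirror the quoted IF structure for non-negative integer inputs.

    Hypothetical helper: the labels returned here are placeholders for
    the two elided branches of the Fortran fragment, not real routines.
    """
    # Guard from the PR title: with an empty window there is nothing to
    # update, and 2*NWIN*NWIN would be zero in the division below.
    if nwin == 0:
        return "skip"
    # Same test as IF( FLOPS.NE.0 ) in the quoted fragment.
    if flops == 0:
        return "skip"
    # Fortran integer division truncates; for non-negative operands this
    # matches Python's floor division.
    percent = (flops * 100) // (2 * nwin * nwin)
    return "above_threshold" if percent >= mmult else "below_threshold"


if __name__ == "__main__":
    # An empty window is skipped instead of dividing by zero.
    print(choose_update_path(flops=10, nwin=0, mmult=50))
```

Placing the `nwin == 0` check first means the division is never reached for an empty window, regardless of the value `flops` happens to hold at that point.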
I would like to add some more information about this bug. Before I saw this post, I used multiple compiler and MPI combinations to compile and test scalapack-2.2.0: gcc-8.2.0+openmpi-4.0.0, gcc-10.1.0+openmpi-4.0.5, gcc-11.2.0+openmpi-4.1.2 and intel_oneapi-2021. The gcc-8.2.0 and intel-2021 builds passed all tests, while builds compiled with gcc-10 or gcc-11 failed Test #69 (xshseqr) and Test #70 (xdhseqr). These two tests failed only for mpirun with np = 2-8, and they failed no matter whether gcc-10, gcc-11 or gcc-8 was used at run time. If you compile with gcc-8 and test with gcc-10, it works fine. The solution given by [weslleyspereira] is effective, but remember that you need to edit both "pstrord.f" and "pdtrord.f" in order to pass those two tests, respectively. It was somewhat surprising to me that the failed tests result from a program bug rather than from compiler or MPI settings, and I spent a lot of time checking and comparing the settings and object files on my system. So I am writing this post in the hope that people in the same situation can find this solution sooner and save some time :)
In case this PR is still reasonable, I just wanted to add that Meiyue Shao, one of the main contributors to this code, suggested we look at https://dl.acm.org/doi/10.1145/2699471 to check whether the current implementation is correct. I don't have time to work on that now, but please feel welcome to pick up from where I stopped.
Try simple solution to solve #69.