Set mpirun to None when a single MPI task is being used (similar to handling of mpi-serial) #4619

Open
ekluzek opened this issue Apr 26, 2024 · 6 comments
Labels
Low Priority Responsibility: CESM Responsibility to manage and accomplish this issue is through CESM Stale tp: CIMElib ty: Discussion

Comments

@ekluzek
Contributor

ekluzek commented Apr 26, 2024

mpi-serial used to be required for building serially. With modern MPI libraries it no longer is. One benefit of mpi-serial is that it removes the need for an mpirun command and its options, which differ from machine to machine, add complexity, make it harder to port a serial case to a new machine, and often mean you can't run a simple case on the command line.

There is special handling in CIME for mpi-serial that allows mpirun to be None. I think similar logic could be used to set mpirun to None when only a single MPI task is used, in _find_best_mpirun_match in env_mach_specific.py.
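As a rough illustration of the proposed logic (not CIME's actual code), the decision could look like the sketch below. The function name `choose_mpirun` and its parameters are hypothetical and are not part of `env_mach_specific.py`; the real `_find_best_mpirun_match` works on the machine XML configuration and is not reproduced here.

```python
from typing import Optional


def choose_mpirun(mpilib: str, ntasks: int, configured_mpirun: Optional[str]) -> Optional[str]:
    """Hypothetical helper: decide whether a launcher is needed at all.

    Returns None (meaning "run the executable directly") for mpi-serial,
    which mirrors the existing special case described in the issue, and
    also for a single MPI task, which is the behavior proposed here.
    """
    if mpilib == "mpi-serial":
        # Existing behavior described in the issue: no launcher for mpi-serial.
        return None
    if ntasks == 1:
        # Proposed behavior: a single task does not need a launcher either.
        return None
    # Otherwise keep whatever launcher the machine configuration selected.
    return configured_mpirun
```

Whether a real MPI library can always start a single rank without its launcher depends on the library and the machine, so this is an assumption of the sketch rather than a guarantee.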

The other way to handle this would be to add a setting for NTASKS==1 to config_machine.xml for the specific machines we want to do this for.

@ekluzek ekluzek added Responsibility: CESM Responsibility to manage and accomplish this issue is through CESM ty: Discussion tp: CIMElib labels Apr 26, 2024
@ekluzek ekluzek changed the title Set mpirun be None when a single MPI task is being used (similar to handling of mpi-serial) Set mpirun to None when a single MPI task is being used (similar to handling of mpi-serial) Apr 26, 2024
@ekluzek
Contributor Author

ekluzek commented Apr 29, 2024

Doing this allows users who are running serially (for example with CTSM single point tower sites such as NEON) to run short cases on personal machines. Not having to figure out the complexity of mpirun options also makes it easier for users on these machines to get serial cases working.

./case.submit --no-batch

Without the change to mpirun, the above won't work on machines that don't allow mpirun to be used outside of batch submission.

NOTE: On Derecho, do NOT run the above on login nodes (it will log you off if you do too much of it). You can, however, use qcmd with the above.
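To show why a None launcher matters for `./case.submit --no-batch`, here is a minimal sketch of how a run command might be assembled. The function `build_run_command` is hypothetical, the executable path is a placeholder, and the `-np` flag is assumed only as a common launcher convention; CIME builds its actual command from the machine configuration.

```python
from typing import List, Optional


def build_run_command(executable: str, mpirun: Optional[str], ntasks: int) -> List[str]:
    """Hypothetical helper: assemble the argument list used to start a case."""
    if mpirun is None:
        # No launcher: the executable runs directly, so it also works on
        # machines where the launcher is only usable inside a batch job.
        return [executable]
    # With a launcher, the task count is passed through a flag
    # ("-np" is an assumption here; real launchers vary).
    return [mpirun, "-np", str(ntasks), executable]


# Single task with no launcher versus a multi-task launch.
print(build_run_command("./model.exe", None, 1))
print(build_run_command("./model.exe", "mpirun", 4))
```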

@jedwards4b
Contributor

@ekluzek I don't think we should encourage running on the login nodes. Use qcmd or run on casper.

@ekluzek
Contributor Author

ekluzek commented Apr 30, 2024

Good point @jedwards4b. I've edited my comment above to emphasize use on non-HPC machines and to give a specific warning for Derecho. Let me know if you have any further suggestions...

@ekluzek
Contributor Author

ekluzek commented Apr 30, 2024

Note, the ability to run cases like this with

./case.submit --no-batch

is important for one user who has an unusual use-case. He runs a bunch of cases all at once (multi-instance doesn't work for him because the cases have differences that multi-instance doesn't allow), so it's under batch, but he needs the above to get his setup to work.

Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the Stale label Jul 30, 2024
Contributor

github-actions bot commented Aug 4, 2024

This issue was closed because it has been stalled for 5 days with no activity.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Aug 4, 2024
@ekluzek ekluzek reopened this Aug 4, 2024