Errors when running MPI programs #12520
Comments
We can rule out libfabric by passing additional MCA parameters, which prevent libfabric from being used.
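For example, selecting the standard ob1 PML and the TCP/self BTLs keeps the OFI (libfabric) code paths out of the run (the component names below are standard Open MPI components, but whether they are built into this particular Fedora package is an assumption):

```shell
# Force the ob1 PML and the TCP + self BTLs so libfabric is not used
mpirun --mca pml ob1 --mca btl tcp,self -np 16 --hostfile ~/hosts ./mpi02
```

If the program runs cleanly with these parameters, the problem is likely in the libfabric path rather than in Open MPI's TCP transport.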
Thanks for checking. Just to clarify, do you intend to use libfabric at all? I wonder how libfabric is configured on your system; we can move the discussion to the libfabric community if you'd like.
OK. Best regards,
The libfabric community would need more information to investigate the issue. As a starting point, you can turn on the relevant verbose options in mpirun.
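As a sketch, the framework-level verbose MCA parameters below are standard Open MPI options, and `FI_LOG_LEVEL` is libfabric's own log-level environment variable (applying them to this exact command line is an assumption):

```shell
# Verbose output from the MTL/BTL frameworks plus libfabric's own debug log
FI_LOG_LEVEL=debug mpirun --mca mtl_base_verbose 100 --mca btl_base_verbose 100 \
    -np 16 --hostfile ~/hosts ./mpi02
```

The resulting log shows which libfabric providers are probed and why each one is accepted or rejected.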
Thank you for taking the time to submit an issue!
Background information
What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)
The version of Open MPI is 5.0.2
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
It was installed on Fedora 40 hosts with the command
dnf install openmpi openmpi-devel
I don't know if it is relevant, but in Fedora 40 the openmpi library is linked against libfabric
If you are building/installing from a git clone, please copy-n-paste the output from
git submodule status
Please describe the system on which you are running
Fedora 40
Shared vCPU on hetzner.com
All the nodes have lo and eth0 interfaces
Details of the problem
I cannot run an MPI program on a 3-node cluster with IP addresses 195.201.223.246, 162.55.213.49, and 88.198.157.233
When I run
shell$ mpirun -np 16 --hostfile ~/hosts ./mpi02
I get errors of the form
The contents of the hosts file are
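The actual hostfile contents were not captured in this transcript. A typical Open MPI hostfile for the three nodes above might look like the following (the `slots` counts are purely an assumption; they only need to sum to at least the 16 ranks requested with `-np 16`):

```
195.201.223.246 slots=6
162.55.213.49 slots=5
88.198.157.233 slots=5
```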
Best regards,
Rafel Amer