"Cannot allocate pinned memory" error on a supercomputer #316
Comments
CUDA v12.3 doesn't support the Kepler architecture. CUDA v12.3 also requires a more recent driver version:
Thank you for the heads up! I've now changed the version of CUDA to 11.4. After recompiling both AMGX and my program, however, the same problem persists:
And in the slurm out file:
I wonder if it's somehow related to what another user reported in this issue: #313

First thing: does your environment support locked memory? You can try running an example that allocates the same amount of pinned memory to see whether it's an environment issue, something like this: https://godbolt.org/z/7ab86qc34
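As a complement to the pinned-allocation test linked above, the locked-memory limit of the process can be inspected from inside the same SLURM job. The following is a minimal sketch using Python's standard `resource` module; the helper names `memlock_limits` and `format_limit` are made up for illustration. Whether CUDA's pinned allocations are actually bounded by this limit on a given system is an assumption here, not something this thread establishes, so treat the result as a hint only.

```python
import resource


def memlock_limits():
    """Return the (soft, hard) locked-memory limits of this process, in bytes."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)


def format_limit(value):
    """Render a limit in MiB, or 'unlimited' when no limit is set."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 2**20:.1f} MiB"


if __name__ == "__main__":
    soft, hard = memlock_limits()
    # A small soft limit inside the job (but not on the login node) would
    # suggest that the batch environment restricts locked memory.
    print(f"RLIMIT_MEMLOCK soft: {format_limit(soft)}")
    print(f"RLIMIT_MEMLOCK hard: {format_limit(hard)}")
```

Running it once on the login node and once inside the batch job (for example through `srun`) shows whether the two environments differ.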
Interestingly, the example ran without any issues, so it doesn't seem like an environment problem. When trying the AMGX examples, though, I ran into the same problem I had in my application:
Can you check one last small thing? There is still this in the output:
Can you check that CUDA 11.4 is actually used at runtime? Sorry for the misleading output; that message actually reports which version is being used at runtime (https://github.com/NVIDIA/AMGX/blob/main/src/core.cu#L738-L751).
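When comparing versions at runtime, it helps to know that the CUDA API reports them as a single integer encoded as `1000 * major + 10 * minor` (for example, what `cudaRuntimeGetVersion` and `cudaDriverGetVersion` return). The sketch below decodes such integers and checks whether the runtime is newer than what the driver supports. The function names are hypothetical, the sample values are illustrative rather than taken from the logs in this issue, and the check deliberately ignores CUDA's compatibility packages.

```python
def decode_cuda_version(encoded):
    """Split a CUDA version integer (1000 * major + 10 * minor) into (major, minor)."""
    major, rest = divmod(encoded, 1000)
    return major, rest // 10


def runtime_supported(runtime_encoded, driver_encoded):
    """True if the runtime version is not newer than the driver's supported version.

    Simplifying assumption: no forward-compatibility package is installed.
    """
    return decode_cuda_version(runtime_encoded) <= decode_cuda_version(driver_encoded)


if __name__ == "__main__":
    # Illustrative values: a 12.3 runtime against a driver that supports up to 11.4.
    runtime, driver = 12030, 11040
    print("runtime:", "%d.%d" % decode_cuda_version(runtime))
    print("driver supports up to:", "%d.%d" % decode_cuda_version(driver))
    print("compatible:", runtime_supported(runtime, driver))
```

If the runtime version printed by the application is still newer than the driver's, the process is likely picking up a different CUDA installation than the one it was rebuilt against, for instance through the library search path.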
I am encountering a "Cannot allocate pinned memory" error while running a program that uses AMGX solvers on a supercomputer that uses the SLURM Workload Manager. The program fails to allocate the necessary pinned memory for efficient GPU memory transfers.
Here's the full output file:
Here's the SLURM output:
And those were the SLURM options utilized in this test:
System info
Operating System: Linux Red Hat 7.9
CUDA Version: 12.3
GCC Version: 9.3
MPI Version: 3.4
AMGX Version: 2.5.0
GPU Model: NVIDIA Tesla K40t
NVIDIA Driver Version: 470.82.01
Any guidance or suggestions on resolving this issue would be greatly appreciated. Thank you!