Issues with Cuda 9.0 #13

uvilla · 2020-01-22T00:52:18Z

Hi there,

I am having a very strange issue running the CUDA variant of LULESH (release 2.0.2).

I'm compiling with nvcc (Cuda compilation tools, release 9.0, V9.0.176) and setting either the flag -arch=sm_35 or, to avoid compilation warnings, the flag -arch=sm_70.
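
For reference, the sm_XY suffix in these flags encodes a compute capability X.Y: sm_35 targets 3.5, and the Tesla V100 reports 7.0, which is why sm_70 is the matching value. A minimal sketch of that mapping follows. The helper name arch_flag is hypothetical and not part of LULESH; in a real program the two numbers would come from the major and minor fields that cudaGetDeviceProperties fills in.

```cpp
#include <string>

// Hypothetical helper (not part of LULESH): builds the nvcc -arch flag
// for a device with compute capability major.minor.
std::string arch_flag(int major, int minor) {
    return "-arch=sm_" + std::to_string(major) + std::to_string(minor);
}
```

Calling arch_flag(7, 0) gives the flag for a V100, and arch_flag(3, 5) gives the other flag tried above.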

When running the code on a Tesla V100-SXM2-32GB, the program crashes as follows:

$ ./lulesh -s 10
Host compute1-exec-206.ris.wustl.edu using GPU 0: Tesla V100-SXM2-32GB
terminate called after throwing an instance of 'thrust::system::system_error'
  what():  parallel_for failed: invalid argument
[compute1-exec-206:00204] *** Process received signal ***
[compute1-exec-206:00204] Signal: Aborted (6)
[compute1-exec-206:00204] Signal code:  (-6)
[compute1-exec-206:00204] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7f7e38bda390]
[compute1-exec-206:00204] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38)[0x7f7e37d8f428]
[compute1-exec-206:00204] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x16a)[0x7f7e37d9102a]
[compute1-exec-206:00204] [ 3] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x16d)[0x7f7e386d284d]
[compute1-exec-206:00204] [ 4] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x8d6b6)[0x7f7e386d06b6]
[compute1-exec-206:00204] [ 5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x8d701)[0x7f7e386d0701]
[compute1-exec-206:00204] [ 6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x8d919)[0x7f7e386d0919]
[compute1-exec-206:00204] [ 7] ./lulesh[0x41f252]
[compute1-exec-206:00204] [ 8] ./lulesh[0x417330]
[compute1-exec-206:00204] [ 9] ./lulesh[0x41ade5]
[compute1-exec-206:00204] [10] ./lulesh[0x405cff]
[compute1-exec-206:00204] [11] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f7e37d7a830]
[compute1-exec-206:00204] [12] ./lulesh[0x409bf9]
[compute1-exec-206:00204] *** End of error message ***
Aborted (core dumped)
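
The abort in this log comes from an uncaught exception reaching std::terminate. One way to narrow down which call fails is to wrap suspect steps in a handler that reports the message and lets the program continue or exit cleanly. The sketch below is plain C++ so that it builds without CUDA. It relies on thrust::system::system_error deriving from std::runtime_error and uses a std::runtime_error as a stand-in. The helper name run_guarded is hypothetical and does not appear in LULESH.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of LULESH): runs `step` and returns the
// exception message, or an empty string if the step did not throw.
// Assumption: the Thrust error seen in the log derives from
// std::runtime_error, so a handler of this shape would catch it too.
std::string run_guarded(const std::function<void()>& step) {
    try {
        step();
        return "";
    } catch (const std::runtime_error& e) {
        return e.what();
    }
}
```

Wrapping each Thrust call site in turn and printing the returned message would show which step raises "parallel_for failed: invalid argument" without producing a core dump.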

Has anyone else observed or reported something similar?
What version of CUDA do you usually use to compile LULESH?

Thank you in advance,

Umberto


miharulidze · 2020-05-13T23:27:21Z

@uvilla I may be a little bit late here, but I observed exactly the same problem on exactly the same hardware (V100-SXM2).

It looks like this code has not been maintained for a very long time.

The good news is that disabling MPI support in the Makefile helps.

ikarlin · 2020-05-14T00:30:28Z

Yes, the CUDA version is not maintained. It was an Nvidia port, and Nvidia has not been updating it. The mainline code is maintained.
