NVIDIA Driver #1

Open
wyedoubleyou opened this issue May 14, 2023 · 8 comments

Comments

@wyedoubleyou

Hi, I would like to know which versions of Ubuntu and the NVIDIA driver are compatible with this project. I had problems installing NVIDIA driver 470 on Ubuntu 18.04 and running this project's code on Ubuntu 20.04. Looking forward to your reply.

@izakharkin
Contributor

Hello @yfung0, could you please provide the details of the problem you have while running the project?

@wyedoubleyou
Author

Hi @izakharkin, thank you so much for the reply.

I am using an MSI GF63-Thin-11UD laptop with a GeForce RTX 3050 Ti GPU. I am running Ubuntu 20.04 with NVIDIA driver 470, CUDA Toolkit 11.4, and cuDNN 8.2.2.

I have also completed the Setup, Build docker, and Download data sections.

When I execute the command python fit_outfit_code.py --config_name=outfit_code/psp in the docker container, it shows the error below:

(pbc) user@yw-GF63-Thin-11UD:/mounted/home/yw/Desktop/work/point_based_clothing$ python fit_outfit_code.py --config_name=outfit_code/psp

Traceback (most recent call last):
  File "fit_outfit_code.py", line 32, in <module>
    from models.draping_network_wrapper import Wrapper
  File "/mounted/home/yw/Desktop/work/point_based_clothing/src/models/draping_network_wrapper.py", line 10, in <module>
    from .draping_network import DrapingNetwork
  File "/mounted/home/yw/Desktop/work/point_based_clothing/src/models/draping_network.py", line 8, in <module>
    from cloud_transformers.layers.multihead_ct_adain import MultiHeadUnionAdaIn, forward_style
  File "/mounted/home/yw/Desktop/work/point_based_clothing/cloud_transformers/layers/multihead_ct_adain.py", line 4, in <module>
    from layers.cloud_transform import Splat, Slice, DifferentiablePositions
  File "/mounted/home/yw/Desktop/work/point_based_clothing/cloud_transformers/layers/cloud_transform.py", line 7, in <module>
    from torch_scatter import scatter_max
  File "/home/user/miniconda/envs/pbc/lib/python3.7/site-packages/torch_scatter/__init__.py", line 16, in <module>
    torch.ops.load_library(spec.origin)
  File "/home/user/miniconda/envs/pbc/lib/python3.7/site-packages/torch/_ops.py", line 573, in load_library
    ctypes.CDLL(path)
  File "/home/user/miniconda/envs/pbc/lib/python3.7/ctypes/__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcudart.so.10.2: cannot open shared object file: No such file or directory

Looking forward to your help.

@izakharkin
Contributor

Thank you for providing the details. Could you please also share the output of these commands inside the docker:

  1. nvidia-smi
  2. cat /usr/local/cuda/version.txt
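As a complement to the two commands above, the failing step in the traceback (ctypes.CDLL on a CUDA runtime library) can be reproduced in isolation. This is a minimal sketch, not part of the project: the helper name `can_load` and the list of candidate library names are assumptions chosen for illustration, and only libcudart.so.10.2 comes from the reported error.

```python
import ctypes


def can_load(soname):
    """Return True if the dynamic loader can open the shared library `soname`."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Same exception type as in the reported traceback.
        return False
    return True


if __name__ == "__main__":
    # Candidate names are illustrative; libcudart.so.10.2 is the one named in the error.
    for name in ("libcudart.so.10.1", "libcudart.so.10.2", "libcudart.so.11.0"):
        print(name, "loadable" if can_load(name) else "not found by the loader")
```

Running this inside the container with the pbc environment active should show which CUDA runtime versions the loader can actually see.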

@wyedoubleyou
Author

wyedoubleyou commented May 17, 2023

Here is the output:

(pbc) user@yw-GF63-Thin-11UD:/mounted/home/yw/Desktop/work/point_based_clothing$ nvidia-smi

Wed May 17 15:07:42 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.182.03   Driver Version: 470.182.03   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   40C    P0     6W /  N/A |      9MiB /  3913MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1186      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A      1973      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+

(pbc) user@yw-GF63-Thin-11UD:/mounted/home/yw/Desktop/work/point_based_clothing$ cat /usr/local/cuda/version.txt
CUDA Version 10.1.243
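The three version strings in this thread can be compared directly. The sketch below is an illustration only: the helper `major_minor` is a hypothetical name, and the strings are copied from the outputs and the error message above. It suggests, without confirming the root cause, that the toolkit inside the container (10.1) does not match the runtime that the installed torch_scatter binary asks for (10.2).

```python
import re


def major_minor(text):
    """Extract the first 'major.minor' version pair found in `text`."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return int(match.group(1)), int(match.group(2))


# Strings copied from the thread; nothing here queries the real system.
toolkit = major_minor("CUDA Version 10.1.243")   # from /usr/local/cuda/version.txt
required = major_minor("libcudart.so.10.2")      # from the OSError message
driver_max = major_minor("CUDA Version: 11.4")   # from the nvidia-smi header

print("toolkit in container:", toolkit)
print("runtime required by torch_scatter:", required)
print("toolkit matches required runtime:", toolkit == required)
print("driver supports required runtime:", required <= driver_max)
```

The "CUDA Version" shown by nvidia-smi is the highest CUDA version the driver supports, so the driver itself is probably not the limiting factor here.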

@izakharkin
Contributor

OK, thank you. I will check it myself.

@arjundheek

Is the issue solved?

@arjundheek

Is this code really working?

@sahilshukla3003

Follow the steps below:

  1. Locate the library: sudo find / -name "libcudart.so.11.0"
  2. Copy the path of the directory that contains libcudart.so.11.0 in your environment.
  3. Add that directory to LD_LIBRARY_PATH, for example: export LD_LIBRARY_PATH=/home/zzt-101/anaconda3/envs/cotton_env/lib:$LD_LIBRARY_PATH
  4. Use your own path in place of /home/zzt-101/anaconda3/envs/cotton_env/lib.
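The steps above can be sketched in Python as a search followed by printing the export command. This is a rough illustration under stated assumptions: `find_library_dirs` and `export_line` are hypothetical helper names, the search root defaults to the active Python environment rather than /, and the library name is a parameter. Note that the error in this thread names libcudart.so.10.2, while the steps above search for libcudart.so.11.0.

```python
import os
import sys


def find_library_dirs(root, soname):
    """Return the sorted directories under `root` that contain a file named `soname`."""
    hits = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        if soname in filenames:
            hits.add(dirpath)
    return sorted(hits)


def export_line(directory):
    """Build the shell command that prepends `directory` to LD_LIBRARY_PATH."""
    return f"export LD_LIBRARY_PATH={directory}:$LD_LIBRARY_PATH"


if __name__ == "__main__":
    # Assumed defaults: search the current Python environment for the
    # library named in the reported error.
    for directory in find_library_dirs(sys.prefix, "libcudart.so.10.2"):
        print(export_line(directory))
```

The printed export line has to be run in the shell before starting Python, because the dynamic loader reads LD_LIBRARY_PATH when the process starts. If nothing is printed, the library is not installed in that environment and changing the path alone will not help.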
