tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used. #66750

Open
Fchaubard opened this issue Apr 30, 2024 · 2 comments
Assignees
Labels
- stale (This label marks the issue/pr stale - to be closed automatically if no activity)
- stat:awaiting response (Status - Awaiting response from author)
- subtype: ubuntu/linux (Ubuntu/Linux Build/Installation Issues)
- TF 2.13 (For issues related to Tensorflow 2.13)
- type:build/install (Build and install issues)

Comments

@Fchaubard
Issue type

Build/Install

Have you reproduced the bug with TensorFlow Nightly?

No

Source

source

TensorFlow version

2.13.1

Custom code

No

OS platform and distribution

Ubuntu 20.04.6

Mobile device

No response

Python version

3.8.10

Bazel version

No response

GCC/compiler version

9.4.0

CUDA/cuDNN version

CUDA 12.4

GPU model and memory

GeForce RTX 4090 (24 GB)

Current behavior?

TensorFlow does not detect the CUDA drivers, even though they are installed and on PATH.

python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
2024-04-30 16:29:17.328463: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-04-30 16:29:17.362495: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-04-30 16:29:17.362915: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-04-30 16:29:18.094331: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-04-30 16:29:18.749023: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1960] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
[]
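The "Cannot dlopen some GPU libraries" warning above indicates that TensorFlow could not load one or more CUDA shared libraries at runtime. A minimal sketch for checking which ones the dynamic loader can find is shown below. The soname list is an assumption based on the CUDA 11.8 and cuDNN 8.6 versions recommended in the maintainer's reply in this thread; it is not taken from TensorFlow itself, and `probe_libraries` is a hypothetical helper name.

```python
import ctypes

# Assumed sonames for a CUDA 11.8 / cuDNN 8.6 install (not read from TensorFlow).
# libcuda.so.1 ships with the NVIDIA driver; the rest come from the toolkit and cuDNN.
ASSUMED_LIBS = [
    "libcuda.so.1",
    "libcudart.so.11.0",
    "libcublas.so.11",
    "libcublasLt.so.11",
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusolver.so.11",
    "libcusparse.so.11",
    "libcudnn.so.8",
]


def probe_libraries(names):
    """Return {name: bool}, True when the dynamic loader can open the library."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            results[name] = True
        except OSError:
            # Raised when the library is not on the loader search path.
            results[name] = False
    return results


if __name__ == "__main__":
    for name, ok in probe_libraries(ASSUMED_LIBS).items():
        print(("found   " if ok else "MISSING ") + name)
```

If `libcuda.so.1` is found but the version-11 libraries are missing, the driver is likely fine and the toolkit libraries are the ones TensorFlow cannot load.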

nvidia-smi returns the following, which suggests the drivers are installed and working:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.78                 Driver Version: 550.78         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4090        Off |   00000000:03:00.0 Off |                  Off |
|  0%   36C    P8             12W /  450W |      29MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      1089      G   /usr/lib/xorg/Xorg                              9MiB |
|    0   N/A  N/A      1142      G   /usr/bin/gnome-shell                            8MiB |
+-----------------------------------------------------------------------------------------+
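Note that the "CUDA Version" field in the nvidia-smi banner reports the newest CUDA runtime the installed driver can support, not a toolkit that is installed on the machine. The sketch below parses the banner line and checks whether the driver would be new enough for a given runtime version. The function names are hypothetical, and the 11.8 requirement is taken from the maintainer's reply in this thread.

```python
import re


def parse_smi_header(line):
    """Extract the driver version and driver-supported CUDA version from the banner line."""
    m = re.search(r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*(\d+)\.(\d+)", line)
    if m is None:
        return None
    return {"driver": m.group(1), "cuda": (int(m.group(2)), int(m.group(3)))}


def driver_supports(header, required):
    """True when the driver's maximum supported CUDA version is at least `required`."""
    return header["cuda"] >= required


if __name__ == "__main__":
    banner = "| NVIDIA-SMI 550.78 Driver Version: 550.78 CUDA Version: 12.4 |"
    header = parse_smi_header(banner)
    # (11, 8) is the CUDA version suggested for TensorFlow 2.13 in this thread.
    print(header, driver_supports(header, (11, 8)))
```

Under these assumptions, a driver that reports 12.4 is new enough for an 11.8 runtime, which points at missing toolkit libraries rather than the driver itself.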

Even PyTorch recognizes the GPU without issue:

Python 3.8.10 (default, Nov 22 2023, 10:22:35)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import torch
>>> torch.cuda.device_count()
1
>>> torch.cuda.get_device_name(0)
'NVIDIA GeForce RTX 4090'


But TensorFlow does not. Please help!

Standalone code to reproduce the issue

You will need to recreate the environment and then:
pip install tensorflow

Relevant log output

No response

@google-ml-butler google-ml-butler bot added the type:build/install Build and install issues label Apr 30, 2024
@tilakrayal tilakrayal added TF 2.13 For issues related to Tensorflow 2.13 subtype: ubuntu/linux Ubuntu/Linux Build/Installation Issues labels May 2, 2024
@tilakrayal
Contributor

@Fchaubard,
I suspect you are trying to run TensorFlow with only CUDA 12.4 installed, which is not a compatible combination. Could you please uninstall the installed libraries, install Clang 16.0.0, Bazel 5.3.0, cuDNN 8.6, and CUDA 11.8 for TensorFlow v2.13, and then go through the checks below?

- Make sure CUDA and cuDNN are installed correctly.
- Make sure $CUDA_PATH is set correctly.
- Make sure $LD_LIBRARY_PATH is set correctly.
- Make sure $TF_CUDA_PATHS is set correctly.

https://www.tensorflow.org/install/source#gpu
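As a rough way to run the environment-variable checks listed above, the sketch below reports, for each variable, which of its path entries exist as directories. `check_cuda_env` is a hypothetical helper; splitting on both `:` and `,` is an assumption made so that colon-separated and comma-separated values are both handled.

```python
import os
import re


def check_cuda_env(env=None, names=("CUDA_PATH", "LD_LIBRARY_PATH", "TF_CUDA_PATHS")):
    """Return {variable: {path: exists}}; an unset variable maps to an empty dict."""
    env = os.environ if env is None else env
    report = {}
    for name in names:
        raw = env.get(name, "")
        # Assumption: entries are separated by ':' or ','.
        entries = [p for p in re.split(r"[:,]", raw) if p]
        report[name] = {p: os.path.isdir(p) for p in entries}
    return report


if __name__ == "__main__":
    for name, paths in check_cuda_env().items():
        if not paths:
            print(name + ": not set")
        for path, exists in paths.items():
            print(name + ": " + path + (" (ok)" if exists else " (missing)"))
```

An entry marked as missing, or a variable that is not set at all, would be the first thing to fix before retrying the TensorFlow import.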

Thank you!

@tilakrayal tilakrayal added the stat:awaiting response Status - Awaiting response from author label May 2, 2024
This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you.

@github-actions github-actions bot added the stale This label marks the issue/pr stale - to be closed automatically if no activity label May 10, 2024