INTERNAL: bitcode module not found at ./opencl.bc when running with "TF_XLA_FLAGS=--tf_xla_auto_jit=2" #1591
Comments
Please let me know if there's anything else I may be able to contribute in order to resolve this issue. |
Ok after doing some programming which refreshed my knowledge of how executables look for missing files on Linux in general, I discovered a pretty hacky work-around for this issue at hand:
So the tensorflow rocm build is simply NOT looking in the correct directory for the bitcode files, which are under the Btw for others running into the same problem as I am, YMMV on which exact bitcode files should be linked to the current working directory based on what GPU you have (I have a gfx1030-based RX 6800). |
I am getting the exact same error message; however, it happens even without any environment variables set. I am unable to run tensorflow-rocm on a 6900 XT under ROCm 5.4.0. It used to work just fine previously. There are similar reports here: ROCm/ROCm#1796 Any idea on how to get it to work again? Are you symlinking the .bc files? Or what exactly are you proposing as a hacky solution? Right now, tensorflow is unusable on RDNA2 cards as far as I can tell. |
@Mushoz yes, I am simply symlinking the appropriate files into the current working directory as shown in my previous comment; YMMV as to which exact files to symlink, because it appears to be architecture dependent (I just kept symlinking whichever file each error said it was looking for until all errors went away). Sorry to hear that you're running into even worse issues, and hopefully my solution helps to fix them 🥺 |
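For anyone who would rather script the symlinking described above than repeat it by hand, here is a minimal sketch in Python. It is not the original workaround command, only an illustration of the same idea. The helper name `link_bitcode_files` and its parameters are hypothetical, and the directory holding the bitcode files is deliberately left as an argument because it depends on the local ROCm installation.

```python
from pathlib import Path


def link_bitcode_files(bitcode_dir, names=None, dest_dir="."):
    """Symlink .bc files from bitcode_dir into dest_dir.

    bitcode_dir: directory of your ROCm install that holds the .bc files
                 (assumption: you locate this yourself; it is not hard-coded).
    names:       file names to link, e.g. the ones named in the error
                 messages; None links every *.bc file found.
    Returns the names of the links that were newly created.
    """
    src = Path(bitcode_dir)
    dest = Path(dest_dir)
    if names is None:
        candidates = sorted(src.glob("*.bc"))
    else:
        candidates = [src / name for name in names]

    created = []
    for target in candidates:
        if not target.is_file():
            raise FileNotFoundError(f"bitcode file not found: {target}")
        link = dest / target.name
        # Leave anything that already exists in dest_dir untouched.
        if link.exists() or link.is_symlink():
            continue
        link.symlink_to(target.resolve())
        created.append(link.name)
    return created
```

Calling `link_bitcode_files(bitcode_dir, names=["opencl.bc"])` from the directory where the script is launched would cover the file named in the issue title; further names can be added as later errors report them.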
Cheers, that worked wonderfully! I really wonder why this isn't reported by more people. A simple model with just one dense layer with some randomly generated features and targets refuses to run on my 6900XT, so even in the most simple of cases it's completely broken without symlinking. I did not have to do that previously, so that's a big regression. This is all without any switches, just a purely stock tensorflow-rocm installation and execution. |
hey, so i was having this same issue with a Radeon Pro VII (gfx906) on Ubuntu 22.04 using ROCm 5.4.1 and it turns out that if i set the |
@tedliosu I can also confirm this issue with my 6800XT and your solution working for me as well. Seems like there should be an environment variable that should resolve what is essentially a path problem. Updating the FYI, I am observing this problem with ROCM 5.4.1. |
I used to have this issue taken care of by setting |
@jasondrusso Unfortunately, as maintenance for the original code-base used to reproduce this issue has long been abandoned (see this comment for more info), I am no longer able to test whether or not setting
So if you don't mind, could you please provide a minimal working example of the code that you were working with that led you to the same error that I originally arrived at as well? Otherwise I unfortunately can't help confirm whether or not this issue is purely a user configuration issue 🙁 Thanks in advance 😃 |
Since I broke the system containing my RX 6800 while attempting to upgrade its system memory, and no longer have the time or energy to maintain my own desktop system, I just sold my RX 6800 (my only AMD GPU). Therefore, since I will no longer be able to repro any potential fix for this issue, I am closing it for the time being. I will be more than willing to reopen it if anyone else runs into the same issue. |
sorry pressed wrong button closing now |
System information
You can collect some of this information using our environment capture script
You can also obtain the TensorFlow version with:
TF 1.x: python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
TF 2.x: python -c "import tensorflow as tf; print(tf.version.GIT_VERSION, tf.version.VERSION)"
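The two one-liners above differ only in where the version attributes live (top level of the module versus the `tf.version` namespace). A small sketch that tries the `tf.version` layout first and falls back to the top-level one could look like this; `tf_version_info` is a hypothetical helper name, not part of TensorFlow.

```python
def tf_version_info(tf_module):
    """Return (git_version, version) for an imported tensorflow module.

    Tries the tf.version.* attributes first, then falls back to the
    top-level tf.GIT_VERSION / tf.VERSION attributes.
    """
    version_ns = getattr(tf_module, "version", None)
    if version_ns is not None and hasattr(version_ns, "VERSION"):
        return version_ns.GIT_VERSION, version_ns.VERSION
    return tf_module.GIT_VERSION, tf_module.VERSION
```

After `import tensorflow as tf`, calling `print(*tf_version_info(tf))` should print the same two values as whichever one-liner applies to the installed release.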
Describe the current behavior
git clone https://github.com/tensorflow/benchmarks.git
cd ./benchmarks/scripts/tf_cnn_benchmarks/
TF_XLA_FLAGS=--tf_xla_auto_jit=2 TF_ROCM_FUSION_ENABLE=1 python3 tf_cnn_benchmarks.py --num_gpus=1 --batch_size=128 --model=resnet50
results in the following error:
Describe the expected behavior
tf_cnn_benchmarks.py; the errors did not appear with ROCm version 4.5.2 and tensorflow-rocm version 2.7.0 (I've tried using tensorflow-rocm version 2.7.0 and version 2.7.1 with ROCm version 5.0.1, but tensorflow complained that it couldn't find "libamdhip64.so.4").
Standalone code to reproduce the issue
Please refer to the steps above for reproducing the issue to git clone the code from GitHub.