
[AMD] Fix compilation issue with ROCm #137

Open · wants to merge 1 commit into main from bhargav/fix_rocm_compile

Conversation

@bhargav commented Sep 17, 2023

Problem:
Unable to install the package on a Linux machine with an AMD 6800XT GPU using ROCm.

docker run -it --device=/dev/kfd --device=/dev/dri --group-add video docker.io/rocm/dev-ubuntu-22.04:5.6-complete bash

CT_HIPBLAS=1 pip install ctransformers --no-binary ctransformers

# Compiling after setting CC and CXX env variables also failed with a similar error.
CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ CT_HIPBLAS=1 pip install ctransformers --no-binary ctransformers

Error logs: https://gist.github.com/bhargav/7f8c2984ba32ff99ce8e93433d9059a6

Solution:
The failures are caused by the code referencing CUDA library imports instead of their HIP equivalents when compiled for AMD.

Verified that the project can build with the fixes.

apt-get update && apt-get install -y git

git clone https://github.com/bhargav/ctransformers.git
cd ctransformers

git checkout bhargav/fix_rocm_compile

CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ CT_HIPBLAS=1 pip install .

Build log: https://gist.github.com/bhargav/65bbbd039bda6f39504448656e88ab6b

The package installs successfully, and I was able to run model inference on the GPU.
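
For anyone reproducing that check, a minimal GPU smoke test looks roughly like the snippet below (the model path and gpu_layers value are placeholders; any local GGUF model supported by ctransformers should do):

from ctransformers import AutoModelForCausalLM

# Placeholder path; point this at any local GGUF/GGML model file.
llm = AutoModelForCausalLM.from_pretrained(
    "/models/llama-7b.Q3_K_M.gguf",
    model_type="llama",
    local_files_only=True,
    gpu_layers=50,  # layers to offload to the GPU; tune for your VRAM
)

print(llm("AI is going to"))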

@mega-ice

I can confirm: it compiles without an issue now (ROCm nightly and an old Vega 64 :)

@ahmedashraf093

Hello,

Installation completed without errors, but prompting the model with

llm = AutoModelForCausalLM.from_pretrained("/models/llama-7b.Q3_K_M.gguf", model_type="llama" , local_files_only=True, gpu_layers=100)

print(llm("AI is going to"))

exits with the error:

CUDA error 98 at ~/ctransformers/models/ggml/ggml-cuda.cu:6045: invalid device function

@mega-ice commented Oct 14, 2023

When using ROCm, "CUDA error 98 ... invalid device function" usually (as far as I know) indicates a version/implementation mismatch in the HIP stack. Most likely it's solvable with:

export HSA_OVERRIDE_GFX_VERSION=11.0.0   # RDNA3 GPUs
export HSA_OVERRIDE_GFX_VERSION=10.3.0   # older GPUs (e.g. RDNA2/gfx1030)
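
If you launch from Python rather than a shell, the override needs to be in the environment before the ctransformers native library loads. A minimal sketch, assuming the RDNA2 value from above:

import os

# The HIP/HSA runtime reads this variable when it initializes, so set it
# before the ctransformers shared library is loaded by the import below.
os.environ["HSA_OVERRIDE_GFX_VERSION"] = "10.3.0"  # use 11.0.0 for RDNA3

from ctransformers import AutoModelForCausalLM  # import after the override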

@afazekas

Hi,
In the referenced container image, docker.io/rocm/dev-ubuntu-22.04:5.6-complete, it works on a gfx1030 system:

apt show rocm-libs -a

Package: rocm-libs
Version: 5.6.0.50600-67~22.04
Priority: optional
Section: devel

But in "rocm/pytorch:latest-release":

apt show rocm-libs -a
Package: rocm-libs
Version: 5.7.0.50700-63~20.04
Priority: optional

the same setup raises the "invalid device function" error.

NOTE:
If you accidentally import the cloned directory instead of the installed package, you get:
OSError: libcudart.so.12: cannot open shared object file: No such file or directory

NOTE:
I disabled the CPU's integrated gfx1036 GPU, so only the gfx1030 is present in the system.

I assume the "invalid device function" error depends on the environment's library version(s).
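
A quick way to catch the cloned-directory pitfall from the first NOTE is to check where Python resolves the package before loading it. A small sketch (it assumes only the package name; find_spec locates the module without executing it, so it works even when the import itself would raise the OSError):

import importlib.util

# If this prints a path inside your git checkout rather than
# site-packages, Python is picking up the source tree instead of
# the installed build, which is what triggers the libcudart.so.12 error.
spec = importlib.util.find_spec("ctransformers")
print(spec.origin if spec else "ctransformers not found")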

/lgtm

@0xGingi commented Dec 14, 2023

CUDA error 98 at /home/gingi/github/ctransformers/models/ggml/ggml-cuda.cu:6045: invalid device function

I have exported HSA_OVERRIDE_GFX_VERSION=11.0.0, and I am running it as HSA_OVERRIDE_GFX_VERSION=11.0.0 python index.py, but I still get the error.

@huotianyu

The command below worked well!
CMAKE_ARGS="-DLLAMA_HIPBLAS=on -DLLAMA_CLBLAST=on -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++ -DCMAKE_PREFIX_PATH=/opt/rocm" FORCE_CMAKE=1 pip install ctransformers --no-binary ctransformers
