
[AMD] Fix compilation issue with ROCm #137

Open · wants to merge 1 commit into main from bhargav/fix_rocm_compile

Conversation

@bhargav commented Sep 17, 2023

Problem:
Unable to install the package on a Linux machine with an AMD 6800XT GPU using ROCm.

docker run -it --device=/dev/kfd --device=/dev/dri --group-add video docker.io/rocm/dev-ubuntu-22.04:5.6-complete bash

CT_HIPBLAS=1 pip install ctransformers --no-binary ctransformers

# Compiling after setting CC and CXX env variables also failed with a similar error.
CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ CT_HIPBLAS=1 pip install ctransformers --no-binary ctransformers

Error logs: https://gist.github.com/bhargav/7f8c2984ba32ff99ce8e93433d9059a6

Solution:
The failures are caused by the code referencing CUDA library imports instead of their HIP equivalents when compiled for AMD.

Verified that the project can build with the fixes.

apt-get update && apt-get install -y git

git clone https://github.com/bhargav/ctransformers.git
cd ctransformers

git checkout bhargav/fix_rocm_compile

CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ CT_HIPBLAS=1 pip install .

Build log: https://gist.github.com/bhargav/65bbbd039bda6f39504448656e88ab6b

The package installs successfully, and I was able to run model inference on the GPU.
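
For anyone reproducing that check, a minimal GPU smoke test looks roughly like the snippet below (the model path and gpu_layers value are placeholders; any local GGUF model supported by ctransformers should do):

from ctransformers import AutoModelForCausalLM

# Placeholder path; point this at any local GGUF/GGML model file.
llm = AutoModelForCausalLM.from_pretrained(
    "/models/llama-7b.Q3_K_M.gguf",
    model_type="llama",
    local_files_only=True,
    gpu_layers=50,  # layers to offload to the GPU; tune for your VRAM
)

print(llm("AI is going to"))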

@mega-ice

I can confirm: it compiles without an issue now (ROCm nightly and an old Vega 64 :)

@ahmedashraf093

Hello,

Installation completed without errors, but prompting the model with

llm = AutoModelForCausalLM.from_pretrained("/models/llama-7b.Q3_K_M.gguf", model_type="llama" , local_files_only=True, gpu_layers=100)

print(llm("AI is going to"))

exits with the error:

CUDA error 98 at ~/ctransformers/models/ggml/ggml-cuda.cu:6045: invalid device function

@mega-ice commented Oct 14, 2023

When using ROCm, "CUDA error 98 ... invalid device function" usually (as far as I know) indicates a version/implementation mismatch in the HIP stack. Most likely it's solvable with:

export HSA_OVERRIDE_GFX_VERSION=11.0.0   # RDNA3 GPUs
export HSA_OVERRIDE_GFX_VERSION=10.3.0   # older GPUs (e.g. RDNA2/gfx1030)
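
If you launch from Python rather than a shell, the override needs to be in the environment before the ctransformers native library loads. A minimal sketch, assuming the RDNA2 value from above:

import os

# The HIP/HSA runtime reads this variable when it initializes, so set it
# before the ctransformers shared library is loaded by the import below.
os.environ["HSA_OVERRIDE_GFX_VERSION"] = "10.3.0"  # use 11.0.0 for RDNA3

from ctransformers import AutoModelForCausalLM  # import after the override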

@afazekas

Hi,
In the referenced container image, docker.io/rocm/dev-ubuntu-22.04:5.6-complete, it works on a gfx1030 system:

apt show rocm-libs -a

Package: rocm-libs
Version: 5.6.0.50600-67~22.04
Priority: optional
Section: devel

But in "rocm/pytorch:latest-release":

apt show rocm-libs -a
Package: rocm-libs
Version: 5.7.0.50700-63~20.04
Priority: optional

the same setup raises the "invalid device function" error.

NOTE:
If you accidentally import the cloned directory instead of the installed package, you get:
OSError: libcudart.so.12: cannot open shared object file: No such file or directory

NOTE:
I disabled the CPU's integrated gfx1036 GPU, so only the gfx1030 is present in the system.

I assume the "invalid device function" error depends on the environment's library version(s).
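
A quick way to catch the cloned-directory pitfall from the first NOTE is to check where Python resolves the package before loading it. A small sketch (it assumes only the package name; find_spec locates the module without executing it, so it works even when the import itself would raise the OSError):

import importlib.util

# If this prints a path inside your git checkout rather than
# site-packages, Python is picking up the source tree instead of
# the installed build, which is what triggers the libcudart.so.12 error.
spec = importlib.util.find_spec("ctransformers")
print(spec.origin if spec else "ctransformers not found")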

/lgtm

@0xGingi commented Dec 14, 2023

CUDA error 98 at /home/gingi/github/ctransformers/models/ggml/ggml-cuda.cu:6045: invalid device function

I have exported HSA_OVERRIDE_GFX_VERSION=11.0.0, and I am running it as HSA_OVERRIDE_GFX_VERSION=11.0.0 python index.py, but I still get the error.

@huotianyu

The command below worked well!
CMAKE_ARGS="-DLLAMA_HIPBLAS=on -DLLAMA_CLBLAST=on -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++ -DCMAKE_PREFIX_PATH=/opt/rocm" FORCE_CMAKE=1 pip install ctransformers --no-binary ctransformers
