AMD GPUs (ROCm) support #636

alissonlauffer · 2023-10-25T17:03:39Z

It'd be a great addition if we could also run the project GPU-accelerated on AMD GPUs. Thanks!

Please reply with a 👍 if you want this feature.

cromefire · 2023-11-26T03:34:47Z

Experimental support for ROCm-enabled GPUs should now be available in #895. You'll need to build the container yourself, though.

wwayne · 2024-01-02T06:52:22Z

If you are looking for how to enable ROCm, please take a look here:
https://slack.tabbyml.com/elpZRnVmD6j

nilsocket · 2024-01-02T10:44:17Z

> If you are looking for how to enable ROCm, plz take a look in here https://slack.tabbyml.com/elpZRnVmD6j

I'm just trying to install.
Thanks a lot.

cromefire · 2024-01-02T10:46:18Z

There's also a proper Linux container you can use in the other branch, but it wasn't merged...
https://github.com/cromefire/tabby/blob/rocm-support/rocm.Dockerfile

It'll build everything fresh and produce an optimized Docker image with the latest stuff (instead of using some old manylinux stuff) and only the ROCm parts you actually need.

nilsocket · 2024-01-02T11:05:34Z

Oh, thanks a lot. If you don't mind, can you mention the branch name?


cromefire · 2024-01-02T11:08:09Z

> > There's also a proper Linux container you can use in the other branch, but it wasn't merged...
>
> Oh, Thanks alot.
>
> If you don't mind can you mention the branch name.

rocm-support, see the link above.

nilsocket · 2024-01-02T11:10:05Z

@cromefire, I checked the message from Gmail, so I missed the link.

Once again.
Thanks.

nilsocket · 2024-01-02T13:42:16Z

@cromefire Do any specific flags need to be added?

Unable to start it:

thread 'main' panicked at /root/workspace/crates/tabby-common/src/registry.rs:52:21:
Failed to fetch model organization <TabbyML>: error sending request for url (https://raw.githubusercontent.com/TabbyML/registry-tabby/main/models.json): error trying to connect: error:16000069:STORE routines:ossl_store_get0_loader_int:unregistered scheme:../crypto/store/store_register.c:237:scheme=file, error:80000002:system library:file_open:reason(2):../providers/implementations/storemgmt/file_store.c:267:calling stat(/usr/lib/ssl/certs), error:16000069:STORE routines:ossl_store_get0_loader_int:unregistered scheme:../crypto/store/store_register.c:237:scheme=file, error:80000002:system library:file_open:reason(2):../providers/implementations/storemgmt/file_store.c:267:calling stat(/usr/lib/ssl/certs), error:16000069:STORE routines:ossl_store_get0_loader_int:unregistered scheme:../crypto/store/store_register.c:237:scheme=file, error:80000002:system library:file_open:reason(2):../providers/implementations/storemgmt/file_store.c:267:calling stat(/usr/lib/ssl/certs), error:0A000086:SSL routines:tls_post_process_server_certificate:certificate verify failed:../ssl/statem/statem_clnt.c:1883: (unable to get local issuer certificate)

I ran it with these options:

docker run -it --device /dev/kfd --device /dev/dri/card1 --security-opt seccomp=unconfined --group-add video -p 8080:8080 -v $HOME/.tabby:/data tabby:copilot serve --model TabbyML/DeepseekCoder-6.7B --device rocm

cromefire · 2024-01-02T14:37:33Z

That means the image is missing ca-certificates. I thought I'd fixed that, but apparently not. For the time being you can just map your system CA certs into the container; I'll have a look later, maybe the fix isn't committed yet.
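A minimal sketch of that workaround, assuming the host keeps its CA certificates in /etc/ssl/certs and the image looks for them there as well (the panic above stats /usr/lib/ssl/certs, which on Debian-based images is normally a symlink to /etc/ssl/certs). The helper name tabby_run_cmd is hypothetical, and it only prints the command so it can be inspected before running.

```shell
#!/bin/sh
# Hypothetical helper: print a docker run command that bind-mounts the host's
# CA certificates read-only into the container.
# Assumption: host and image both use /etc/ssl/certs unless another host
# directory is passed as the first argument.
tabby_run_cmd() {
  certs_dir="${1:-/etc/ssl/certs}"
  printf '%s\n' "docker run -it --device /dev/kfd --device /dev/dri -p 8080:8080 -v \$HOME/.tabby:/data -v ${certs_dir}:/etc/ssl/certs:ro tabby:copilot serve --model TabbyML/DeepseekCoder-6.7B --device rocm"
}

# Dry run: show the command instead of executing it.
tabby_run_cmd
```

On some hosts the entries in /etc/ssl/certs are symlinks into /usr/share/ca-certificates, so that directory may need to be mounted too.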

nilsocket · 2024-01-02T14:40:29Z

Sure, thanks for that info.


cromefire · 2024-01-02T17:50:28Z

Should be fine now; I had it fixed, but not committed.

> --security-opt seccomp=unconfined --group-add video

Also, I think you shouldn't even need those.

nilsocket · 2024-01-04T09:37:56Z

Hi @cromefire,
After pulling your latest commit, it is working fine now.

Thanks for your help.

alsh · 2024-01-13T23:12:39Z

Hello @cromefire,
I've built your ROCm image. Now, when I run tabby, it doesn't seem to use GPU offloading:

curl -X 'GET' \
  'http://localhost:8080/v1/health' \
  -H 'accept: application/json'

{"model":"TabbyML/StarCoder-1B","device":"rocm","arch":"x86_64","cpu_info":"AMD Ryzen 7 3800X 8-Core Processor","cpu_count":16,"accelerators":[],"cuda_devices":[],"version":{"build_date":"2024-01-13","build_timestamp":"2024-01-13T22:38:55.449357661Z","git_sha":"fd0891bd6571e74495c85657b584d7e236d59bd3","git_describe":"fd0891b-dirty"}}

"accelerators" comes back as an empty list, even though rocminfo recognizes the GPU inside the Docker container. What could be wrong?

cromefire · 2024-01-13T23:13:47Z

Did you run it with --device rocm?

alsh · 2024-01-14T17:46:56Z

@cromefire Yes, it was run with --device rocm.
I used the exact command from the README in your fork:

docker run -it \
  --device /dev/dri --device /dev/kfd \
  -p 8080:8080 -v $HOME/.tabby:/data \
  tabbyml/tabby-rocm \
  serve --model TabbyML/StarCoder-1B --device rocm

The only difference was the image name, which I set to be somewhat different when building the image.

cromefire · 2024-01-14T17:48:21Z

Can you share the output of rocminfo? There's a regex that has to match it for the GPU to show up (acceleration will also work if it doesn't show up, but the output might give a clue).

alsh · 2024-01-14T17:57:02Z

> Can you share the output from ROCm info? There's a regex there that has to match (to show up it'll also work if it doesn't show up but it might give a clue)

Here: https://gist.github.com/alsh/97c9ad94274abdf2b41a91857f84781e

cromefire · 2024-01-14T18:04:22Z

Well, your GPU isn't officially supported by ROCm, there's the problem. You can override it to look like a gfx1030 via the override variable (that makes it look like a 6900 XT; the info on that is in the FAQ). I'll also have a look at whether I can maybe just add it as a target...

alsh · 2024-01-14T21:44:22Z

> Well your GPU isn't officially supported by ROCm there's the problem, you can overwrite it to look like a gfx1030 via the override variable (makes it look like a 6900XT; the info on that is in the FAQ). I'll also have a look whether I maybe just can add it as a target...

I think I found an issue. This line, `let cmd_res = Command::new("rocminfo").output()?;`, tries to launch rocminfo, but it is not available on the PATH. The ROCm packages as prepared by AMD don't put themselves on the PATH, and rocm.Dockerfile doesn't add ROCm to the PATH either.
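A quick way to check for and work around this inside a running container might look like the sketch below. It assumes the ROCm tools are installed under /opt/rocm/bin (the path that shows up later in this thread for offload-arch); the helper name ensure_on_path is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: make sure a directory is on PATH exactly once.
ensure_on_path() {
  case ":$PATH:" in
    *":$1:"*) ;;              # already present, nothing to do
    *) PATH="$1:$PATH" ;;     # prepend so it takes precedence
  esac
  export PATH
}

# Assumption: ROCm binaries live in /opt/rocm/bin.
ensure_on_path /opt/rocm/bin

if command -v rocminfo >/dev/null 2>&1; then
  echo "rocminfo found at $(command -v rocminfo)"
else
  echo "rocminfo still not found; check where ROCm is installed"
fi
```

In a Dockerfile the equivalent would presumably be an ENV line that prepends the same directory to PATH.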

cromefire · 2024-01-14T22:34:53Z

Should be easy enough to fix, but that won't fix your problem, as that line just provides the accelerators metadata, not the actual acceleration.

alsh · 2024-01-14T23:10:36Z

> Should be easy enough to fix, but that won't fix your problem, as that line just provides the accelerators metadata, not the actual acceleration.

Yes, I started searching for other problems because setting HSA_OVERRIDE_GFX_VERSION did not fix my problem either.

So, another issue: a missing build dependency, rocm-device-libs.

Without rocm-device-libs, the llama.cpp build was effectively left without ROCm support, because cmake wasn't able to find the AMDDeviceLibsConfig.cmake file (which this package provides).

Now I've installed rocm-device-libs, and with HSA_OVERRIDE_GFX_VERSION=10.3.0 it finally seems to work!

2024-01-14T23:10:24.274396Z  INFO tabby::serve: crates/tabby/src/serve.rs:116: Starting server, this might take a few minutes...
ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 ROCm devices:
  Device 0: AMD Radeon RX 6700 XT, compute capability 10.3
2024-01-14T23:10:27.024540Z  INFO tabby::routes: crates/tabby/src/routes/mod.rs:35: Listening at 0.0.0.0:8080
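Since the accelerators field of /v1/health is only metadata, the startup log is the more useful signal. Below is a small sketch that pulls the device count out of a saved log; it assumes the "ggml_init_cublas: found N ROCm devices:" wording shown above, which may change between llama.cpp versions. The function name rocm_device_count is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a tabby startup log on stdin and print how many
# ROCm devices llama.cpp reported. Prints 0 if no such line is present,
# which may indicate a fallback to the CPU.
rocm_device_count() {
  count="$(sed -n 's/^ggml_init_cublas: found \([0-9][0-9]*\) ROCm devices:.*$/\1/p' | head -n 1)"
  printf '%s\n' "${count:-0}"
}

# Example using the two relevant lines from the log above.
printf '%s\n' \
  'ggml_init_cublas: found 1 ROCm devices:' \
  '  Device 0: AMD Radeon RX 6700 XT, compute capability 10.3' \
  | rocm_device_count
```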

cromefire · 2024-01-14T23:11:44Z

Fixed rocminfo in the container, and while checking out the code locally it seems like your GPU should already have been working. Did you actually check whether it uses your GPU, or did you rely on the accelerators metadata? (That isn't an accurate indicator of whether it works.)

Also it's now possible to use:

docker build --build-arg AMDGPU_TARGETS="$(/opt/rocm/bin/offload-arch | tr '\n' ';')" -t tabby-docker-rocm -f rocm.Dockerfile .

to build an optimized image that only works on your GPU.

> So, another issue:
>
> missing build dependency - rocm-device-libs.

Interesting, I don't think I have that issue. I'll investigate.

Edit: Yeah, something is wrong here and it just falls back on the CPU, but I couldn't get it working by installing rocm-device-libs (it's already installed anyway). It seems like something went wrong when I reduced the size of the container. Fallbacks on the CPU really suck...

cromefire · 2024-01-15T01:58:36Z

Okay, it should work now (at least it finds the GPU now), and there should be a safeguard against any future issues of this kind as well, although it doesn't want to return a result... Maybe I should try ROCm 5 for now; ROCm 6 still seems very fresh, and tabby's copy of llama.cpp is pretty old by now...

alsh · 2024-01-15T13:53:53Z

> Okay it should work now (at least it finds the GPU now)... and should have a safeguard against any future issues of this kind as well, although it doesn't want to return a result... maybe I should try ROCm 5 for now, ROCm 6 is still very very fresh it seems and the llama.cpp copy of tabby is pretty old now...

Yes, now it looks good, and I'm able to start this docker image with GPU support.

PS:
As for AMDGPU_TARGETS: it's probably of no use, as listing targets that aren't officially supported will not make it work for those GPUs. It will still crash at runtime, like:

rocBLAS error: Cannot read /opt/rocm-6.0.0/lib/rocblas/library/TensileLibrary.dat: Illegal seek for GPU arch : gfx1031
 List of available TensileLibrary Files : 
"/opt/rocm-6.0.0/lib/rocblas/library/TensileLibrary_lazy_gfx1030.dat"
"/opt/rocm-6.0.0/lib/rocblas/library/TensileLibrary_lazy_gfx1100.dat"
"/opt/rocm-6.0.0/lib/rocblas/library/TensileLibrary_lazy_gfx1101.dat"
"/opt/rocm-6.0.0/lib/rocblas/library/TensileLibrary_lazy_gfx1102.dat"
"/opt/rocm-6.0.0/lib/rocblas/library/TensileLibrary_lazy_gfx900.dat"
"/opt/rocm-6.0.0/lib/rocblas/library/TensileLibrary_lazy_gfx906.dat"
"/opt/rocm-6.0.0/lib/rocblas/library/TensileLibrary_lazy_gfx908.dat"
"/opt/rocm-6.0.0/lib/rocblas/library/TensileLibrary_lazy_gfx90a.dat"
"/opt/rocm-6.0.0/lib/rocblas/library/TensileLibrary_lazy_gfx940.dat"
"/opt/rocm-6.0.0/lib/rocblas/library/TensileLibrary_lazy_gfx941.dat"
"/opt/rocm-6.0.0/lib/rocblas/library/TensileLibrary_lazy_gfx942.dat"
Aborted (core dumped)

So HSA_OVERRIDE_GFX_VERSION seems to be the only way to go.
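As a rough sketch of how the override could be passed to the container: the helper below (hypothetical name gfx_override_for) suggests a value for a given arch and prints a docker run command with it. Only gfx1031 with 10.3.0 is confirmed in this thread; treating the other gfx103x parts the same way is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: print the HSA_OVERRIDE_GFX_VERSION value to try for a
# given GPU arch, or nothing if no override is suggested.
gfx_override_for() {
  case "$1" in
    gfx1030) ;;                  # has its own TensileLibrary file, no override
    gfx103?) echo "10.3.0" ;;    # assumption: masquerade as gfx1030
    *) ;;                        # unknown arch: make no suggestion
  esac
}

arch="gfx1031"   # the arch named in the rocBLAS error above
override="$(gfx_override_for "$arch")"
if [ -n "$override" ]; then
  # Dry run: print the command rather than executing it.
  echo "docker run -it --device /dev/dri --device /dev/kfd -e HSA_OVERRIDE_GFX_VERSION=$override -p 8080:8080 -v \$HOME/.tabby:/data tabbyml/tabby-rocm serve --model TabbyML/StarCoder-1B --device rocm"
fi
```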

cromefire · 2024-01-15T13:58:23Z

> As for AMDGPU_TARGETS - it's probably of no use, as listing non-supported (officially) targets will not make it work for those GPUs. It will still crash at runtime, like:

I know (for now), but it makes the build a lot faster, as it only builds for your GPU. Otherwise it will compile the code for all GPU targets, which takes forever. In your case you'd just put gfx1030 there. Long term, maybe we should build it from source, and then any GPU of the same family that LLVM supports should work.
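For illustration, the target list could be assembled like this. The sketch assumes a newline-separated list of arch names (the same shape the offload-arch pipeline above feeds into tr) and also trims the trailing semicolon, which may or may not matter to the build. The helper name join_targets is hypothetical, and the docker build command is only printed.

```shell
#!/bin/sh
# Hypothetical helper: turn a newline-separated list of GPU archs on stdin
# into the semicolon-separated form used for AMDGPU_TARGETS.
join_targets() {
  tr '\n' ';' | sed 's/;$//'
}

# For the RX 6700 XT case discussed here, build only for gfx1030.
targets="$(printf '%s\n' gfx1030 | join_targets)"

# Dry run: print the build command rather than executing it.
echo "docker build --build-arg AMDGPU_TARGETS=\"$targets\" -t tabby-docker-rocm -f rocm.Dockerfile ."
```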

alissonlauffer added the enhancement New feature or request label Oct 25, 2023

alissonlauffer changed the title ~~AMD (ROCm) GPUs support~~ AMD GPUs (ROCm) support Oct 26, 2023

cromefire mentioned this issue Nov 25, 2023

feat: Add support Intel #895

Closed

cromefire mentioned this issue Nov 26, 2023

feat: Add Support for Devices other than CUDA in Telemetry and Web UI #902

Closed

wsxiaoys mentioned this issue Nov 29, 2023

feat: add rocm support #913

Merged

cromefire mentioned this issue Dec 10, 2023

feat: Add rocm builds and documentation #1012

Merged

wsxiaoys closed this as completed in #1012 Dec 13, 2023
