
feat: Add Support for Devices other than CUDA in Telemetry and Web UI #902

Closed · 31 commits

Conversation

cromefire
Contributor

Implemented via AMD hipBLAS (ROCm/HIP)

Fixes #636

@cromefire cromefire changed the title Add support for AMD hardware feat: Add support for AMD hardware Nov 26, 2023
@cromefire
Contributor Author

Okay, the GPU name is now also shown in the UI. It's currently more like an ID, but I've opened an issue with the library to get the marketing (user-recognizable) name: PTFOPlayer/rocm_smi_lib_rs#2

@cromefire
Contributor Author

Documentation is also there now; it should be okay-ish for a start.

Member

@wsxiaoys left a comment


Thanks for the PR.

First round of comments, focusing on the source code.

Resolved review threads:
- crates/tabby/src/main.rs
- crates/tabby/src/services/health.rs
- crates/llama-cpp-bindings/build.rs
- .github/workflows/docker.yml
- ee/tabby-ui/app/(dashboard)/components/worker-card.tsx
- ee/tabby-ui/app/(dashboard)/page.tsx
Comment on lines +13 to +14
/// Universally unique ID of the accelerator, if available
pub uuid: Option<String>,
Contributor Author


The biggest issue now is whether we can prevent this field from being included in telemetry. If not, it definitely has to be removed, as it's essentially equivalent to a serial number. It's nice for tracking a GPU across the accelerator list, but it's a huge problem if that data leaves your control.
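A minimal sketch of the separation being asked for here: keep `uuid` available locally for the cluster UI, but omit it from anything serialized for telemetry. The struct, field names, and `telemetry_json` helper are illustrative assumptions, not the PR's actual types.

```rust
// Hypothetical accelerator record; only `chip_name` may leave the machine.
struct Accelerator {
    chip_name: String,
    /// Serial-number-like identifier; must never be sent with telemetry.
    uuid: Option<String>,
}

impl Accelerator {
    // Serializes only the non-identifying fields. With serde, marking the
    // field `#[serde(skip_serializing)]` would achieve the same effect.
    fn telemetry_json(&self) -> String {
        format!("{{\"chip_name\":\"{}\"}}", self.chip_name)
    }
}

fn main() {
    let gpu = Accelerator {
        chip_name: "AMD Radeon RX 6900 XT".into(),
        uuid: Some("GPU-0123".into()),
    };
    // The UUID stays local; the payload contains only the chip name.
    println!("{}", gpu.telemetry_json());
}
```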

@cromefire
Contributor Author

cromefire commented Nov 28, 2023

As for why I use rocminfo instead of a library: there's no Rust wrapper for libhsa-runtime64, and when I tried quickly writing my own, C being C, it seemed to create more problems than it solved (and my Rust FFI and C skills aren't great). You'd also have to link it, which is rather fragile, or load it at runtime, which is more of a guessing game, while rocminfo should be available whenever ROCm is installed, is on the path, and just works. If it's not, worst case, the GPUs simply don't show up in the health info. Maybe long term, at least on Linux, some generic approach using the kernel interfaces (via DRM or similar) is possible.
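The "shell out to rocminfo" approach above can be sketched roughly as follows. The `Marketing Name:` field does appear in rocminfo's agent listing, but the parsing strategy and function name here are assumptions for illustration, not the PR's actual implementation (rocminfo also lists CPU agents, which real code would filter out):

```rust
// Hypothetical helper: pull GPU marketing names out of `rocminfo` output.
fn parse_marketing_names(output: &str) -> Vec<String> {
    output
        .lines()
        // Keep only lines of the form "  Marketing Name:   <name>".
        .filter_map(|line| line.trim().strip_prefix("Marketing Name:"))
        .map(|name| name.trim().to_string())
        .filter(|name| !name.is_empty())
        .collect()
}

fn main() {
    // Real code would run `std::process::Command::new("rocminfo")` and feed
    // its stdout here; a canned sample keeps the sketch self-contained.
    let sample = "\
  Name:                    gfx1030
  Marketing Name:          AMD Radeon RX 6900 XT
";
    println!("{:?}", parse_marketing_names(sample));
}
```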

@wsxiaoys
Member

Hey, considering the increasing size of this pull request, I've extracted #913 to focus on basic ROCm support as a preliminary step.

@wsxiaoys wsxiaoys mentioned this pull request Nov 29, 2023
@Amzd

Amzd commented Dec 8, 2023

The other PR didn't add any docs. I tried the command from this PR's documentation but couldn't get it to work; is there a trick I'm missing?

docker run -it \
  --device /dev/dri --device /dev/kfd \
  -p 8080:8080 -v $HOME/.tabby:/data \
  tabbyml/tabby-rocm \
  serve --model TabbyML/StarCoder-1B --device rocm

@cromefire
Contributor Author

The other PR didn't add any docs. I tried the command from this PR's documentation but couldn't get it to work; is there a trick I'm missing?

docker run -it \
  --device /dev/dri --device /dev/kfd \
  -p 8080:8080 -v $HOME/.tabby:/data \
  tabbyml/tabby-rocm \
  serve --model TabbyML/StarCoder-1B --device rocm

The other PR didn't include the Docker image; you'll have to build it yourself.
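Building it yourself would look roughly like the following; the fork URL and image tag are illustrative (the `rocm.Dockerfile` name matches the compose file shown later in this thread):

```shell
# Assumed commands: clone the PR branch's repo and build the ROCm image.
git clone https://github.com/cromefire/tabby && cd tabby
docker build -f rocm.Dockerfile -t tabbyml/tabby-rocm .
```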

@Frozen-byte

Unfortunately, I'm unable to run this Docker image:

$ docker-compose up 
Recreating tabby_tabby_1 ... done
Attaching to tabby_tabby_1
tabby_1  | 2023-12-09T17:02:11.296325Z  INFO tabby::serve: crates/tabby/src/serve.rs:105: Starting server, this might takes a few minutes...
tabby_1  | ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
tabby_1  | ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
tabby_1  | ggml_init_cublas: found 2 ROCm devices:
tabby_1  |   Device 0: AMD Radeon RX 6900 XT, compute capability 10.3
tabby_1  |   Device 1: AMD Radeon Graphics, compute capability 10.3
tabby_1  | ggml_cuda_set_main_device: using device 0 (AMD Radeon RX 6900 XT) as main device
tabby_1  | 
tabby_1  | CUDA error 98 at /root/workspace/crates/llama-cpp-bindings/llama.cpp/ggml-cuda.cu:6779: invalid device function
tabby_1  | current device: 0
tabby_tabby_1 exited with code 1

my docker-compose.yml:

version: '3.5'

services:
  tabby:
    build:
      context: .
      dockerfile: rocm.Dockerfile
    command: serve --model TabbyML/StarCoder-1B --device rocm
    volumes:
      - "$HOME/Docker/tabby/data:/data"
    ports:
      - 8229:8080
    environment:
      - TABBY_DISABLE_USAGE_COLLECTION=1
    devices:
      - /dev/dri
      - /dev/kfd

@cromefire
Contributor Author

You'll probably need to remove your iGPU from the container; ROCm sometimes has problems with it. (Alternatively, I think the llama.cpp docs mention an environment variable to make it use only a specific GPU.)
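For reference, ROCm respects the `HIP_VISIBLE_DEVICES` / `ROCR_VISIBLE_DEVICES` environment variables for device selection; a hedged compose fragment (assuming device 0 is the discrete GPU) would be:

```yaml
# Alternative to hiding the iGPU's /dev nodes: pin the ROCm runtime to one GPU.
services:
  tabby:
    environment:
      # "0" selects the first enumerated GPU; adjust to match your system.
      - HIP_VISIBLE_DEVICES=0
      - ROCR_VISIBLE_DEVICES=0
```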

# Conflicts:
#	crates/llama-cpp-bindings/build.rs
#	ee/tabby-ui/app/(dashboard)/cluster/components/worker-card.tsx
#	ee/tabby-ui/app/(dashboard)/page.tsx
#	ee/tabby-ui/lib/gql/generates/gql.ts
#	ee/tabby-ui/lib/gql/generates/graphql.ts
#	ee/tabby-ui/lib/gql/request-documents.tsx
#	ee/tabby-ui/lib/hooks/use-workers.ts
#	ee/tabby-webserver/src/api.rs
#	ee/tabby-webserver/src/lib.rs
#	ee/tabby-webserver/src/service/worker.rs
#	ee/tabby-webserver/ui/404.html
#	ee/tabby-webserver/ui/_next/static/KwTUFhDElVXkDE3jiCtD0/_buildManifest.js
#	ee/tabby-webserver/ui/_next/static/KwTUFhDElVXkDE3jiCtD0/_ssgManifest.js
#	ee/tabby-webserver/ui/_next/static/b_TXnwhyGI26zeMbbpjKm/_buildManifest.js
#	ee/tabby-webserver/ui/_next/static/b_TXnwhyGI26zeMbbpjKm/_ssgManifest.js
#	ee/tabby-webserver/ui/_next/static/chunks/app/playground/page-178c9cbcb60d3ef7.js
#	ee/tabby-webserver/ui/_next/static/yoSiYgCA8KR8UtHATY8_5/_buildManifest.js
#	ee/tabby-webserver/ui/_next/static/yoSiYgCA8KR8UtHATY8_5/_ssgManifest.js
#	ee/tabby-webserver/ui/index.html
#	ee/tabby-webserver/ui/index.txt
#	ee/tabby-webserver/ui/playground.html
#	ee/tabby-webserver/ui/playground.txt
#	ee/tabby-webserver/ui/swagger.html
#	ee/tabby-webserver/ui/swagger.txt
#	website/docs/extensions/troubleshooting.md
#	website/docs/installation/docker-compose.mdx
#	website/docs/installation/docker.mdx
@cromefire
Contributor Author

Updating this from main is starting to get complicated, because there are conflicts in a lot of places, so it would be great to get this in sometime soon... I'm trying to fix it up now, but I'm not sure how long I can keep doing this; it's quite time consuming.

@Frozen-byte

Frozen-byte commented Dec 10, 2023

You'll probably need to remove your iGPU from the container, ROCm sometimes has some problems with that (alternatively I think in the llama.cpp docs it also called out an environment variable to make it only use a specific GPU).

Thank you! I can verify that pinning devices works perfectly now (1bc6f48):

    devices:
      - /dev/dri/card0
      - /dev/dri/renderD128
      - /dev/kfd

# Conflicts:
#	ee/tabby-ui/app/(dashboard)/cluster/components/cluster.tsx
@cromefire
Contributor Author

Also, the llama.cpp fork should probably be updated sooner rather than later, as it contains some fixes for AMD GPU performance.

@wsxiaoys
Member

Hey @cromefire - I'd suggest first having a PR that checks in the Docker image / binary builds; then we can review how to integrate the ROCm device information into the dashboard.

@cromefire
Contributor Author

cromefire commented Dec 10, 2023

Sure, I'll extract that. The changes to support more accelerators are the hardest part to maintain, though.

PS: Also keep in mind that it soon won't be only ROCm device info, but probably oneAPI and/or Vulkan as well, which is why I made this a more generic Accelerator interface.
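The "generic Accelerator interface" idea can be sketched as one trait that every backend (ROCm, oneAPI, Vulkan, ...) implements to report its devices. All names here are illustrative assumptions, not the PR's actual types:

```rust
// Hypothetical backend-agnostic device record.
#[derive(Debug, Clone)]
struct AcceleratorInfo {
    backend: &'static str,
    chip_name: String,
}

// Each accelerator backend implements this to enumerate its devices.
trait AcceleratorProbe {
    fn probe(&self) -> Vec<AcceleratorInfo>;
}

struct RocmProbe;

impl AcceleratorProbe for RocmProbe {
    fn probe(&self) -> Vec<AcceleratorInfo> {
        // Real code would invoke `rocminfo` or a ROCm SMI binding here;
        // a canned device keeps the sketch self-contained.
        vec![AcceleratorInfo {
            backend: "rocm",
            chip_name: "AMD Radeon RX 6900 XT".into(),
        }]
    }
}

fn main() {
    // Health/telemetry code iterates over all registered probes uniformly;
    // adding oneAPI or Vulkan means adding another probe, nothing more.
    let probes: Vec<Box<dyn AcceleratorProbe>> = vec![Box::new(RocmProbe)];
    for info in probes.iter().flat_map(|p| p.probe()) {
        println!("{} via {}", info.chip_name, info.backend);
    }
}
```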

cromefire added a commit to cromefire/tabby that referenced this pull request Dec 10, 2023
@cromefire cromefire changed the title feat: Add support for AMD hardware feat: Add support for telemetry and web UI display for devices other than CUDA Dec 10, 2023
wsxiaoys added a commit that referenced this pull request Dec 13, 2023
* Added rocm builds and documentation

* Pulled build improvements from #902

* Fixed build container for rocm build

* Install git in rocm container

* Fixed github step

* Try to fix if statement

* Added more generic dependency installation

* upgraded rustup action

* Update sccache

* Try pytorch manylinux image

* Switched location for toolchain parameter

* Downgraded to deprecated action again

* Readded set default step

* Install minimal rocm on the fly

* fixed typo in binary name

* Downgraded checkout action

* Use curl to download

* Add -y flag to yum

* Also install rocblas

* Update release.yml

* Update release.yml

* Update prepare_build_environment.sh

* Update prepare_build_environment.sh

* Update build.rs

* Update build.rs

* Update README.md

* Update website/docs/faq.mdx

* Update index.md

* Update and rename docker-cuda.yml to docker.yml

* Delete .github/workflows/docker-rocm.yml

* Delete rocm.Dockerfile

* Rename cuda.Dockerfile to Dockerfile

* Update docker.yml

* Update website/docs/installation/docker.mdx

* Update website/docs/installation/docker-compose.mdx

* Update docker-compose.mdx

* Update docker-compose.mdx

* Update docker.mdx

* Update docker.mdx

* Update website/docs/faq.mdx

---------

Co-authored-by: Meng Zhang <[email protected]>
# Conflicts:
#	.github/workflows/release.yml
#	Cargo.lock
#	crates/llama-cpp-bindings/src/llama.rs
#	crates/tabby/src/worker.rs
#	ee/tabby-webserver/src/api.rs
#	ee/tabby-webserver/src/lib.rs
#	ee/tabby-webserver/ui/404.html
#	ee/tabby-webserver/ui/_next/static/Yg0h7wsN0w8kbqBtlggWe/_buildManifest.js
#	ee/tabby-webserver/ui/_next/static/Yg0h7wsN0w8kbqBtlggWe/_ssgManifest.js
#	ee/tabby-webserver/ui/_next/static/b_TXnwhyGI26zeMbbpjKm/_buildManifest.js
#	ee/tabby-webserver/ui/_next/static/b_TXnwhyGI26zeMbbpjKm/_ssgManifest.js
#	ee/tabby-webserver/ui/_next/static/chunks/787-9e31941ac5498659.js
#	ee/tabby-webserver/ui/_next/static/chunks/app/(dashboard)/page-178d78b0cb76c3d8.js
#	ee/tabby-webserver/ui/_next/static/chunks/app/(dashboard)/team/page-ec2e0dfcc033c8d6.js
#	ee/tabby-webserver/ui/_next/static/chunks/app/auth/signin/page-ba1372efadfb09ae.js
#	ee/tabby-webserver/ui/_next/static/chunks/app/auth/signup/page-259c5cda08cde501.js
#	ee/tabby-webserver/ui/_next/static/chunks/app/layout-da8c0f298b52b0a9.js
#	ee/tabby-webserver/ui/_next/static/chunks/app/playground/page-33c06e2d963b9bd9.js
#	ee/tabby-webserver/ui/_next/static/f19OhW5W2SP9IXexWzvSm/_buildManifest.js
#	ee/tabby-webserver/ui/_next/static/f19OhW5W2SP9IXexWzvSm/_ssgManifest.js
#	ee/tabby-webserver/ui/api.html
#	ee/tabby-webserver/ui/api.txt
#	ee/tabby-webserver/ui/auth/signin.html
#	ee/tabby-webserver/ui/auth/signin.txt
#	ee/tabby-webserver/ui/auth/signup.html
#	ee/tabby-webserver/ui/auth/signup.txt
#	ee/tabby-webserver/ui/cluster.html
#	ee/tabby-webserver/ui/cluster.txt
#	ee/tabby-webserver/ui/index.html
#	ee/tabby-webserver/ui/index.txt
#	ee/tabby-webserver/ui/playground.html
#	ee/tabby-webserver/ui/playground.txt
#	ee/tabby-webserver/ui/team.html
#	ee/tabby-webserver/ui/team.txt
#	website/docs/faq.mdx
@cromefire
Contributor Author

Just resolved the conflicts. Another friendly reminder that these changes eat up more and more time the longer they sit, since I constantly have to resolve conflicts...

# Conflicts:
#	.github/workflows/release.yml
#	ee/tabby-webserver/graphql/schema.graphql
#	ee/tabby-webserver/src/lib.rs
#	ee/tabby-webserver/ui/404.html
#	ee/tabby-webserver/ui/_next/static/DR2uz83UxeFune2UqDmYc/_buildManifest.js
#	ee/tabby-webserver/ui/_next/static/DR2uz83UxeFune2UqDmYc/_ssgManifest.js
#	ee/tabby-webserver/ui/_next/static/IyjC1gd1mIwIqFgfamAQE/_buildManifest.js
#	ee/tabby-webserver/ui/_next/static/IyjC1gd1mIwIqFgfamAQE/_ssgManifest.js
#	ee/tabby-webserver/ui/_next/static/chunks/320-558d0cf728abd693.js
#	ee/tabby-webserver/ui/_next/static/chunks/358-42f7d20a1d08acdf.js
#	ee/tabby-webserver/ui/_next/static/chunks/787-7044efff94e13a32.js
#	ee/tabby-webserver/ui/_next/static/chunks/936-c5db2bc813a2dffc.js
#	ee/tabby-webserver/ui/_next/static/chunks/app/(dashboard)/page-dcf4098462fb2820.js
#	ee/tabby-webserver/ui/_next/static/chunks/app/(dashboard)/team/page-9f81e0759cc05ccf.js
#	ee/tabby-webserver/ui/_next/static/chunks/app/auth/signup/page-f91f45006a9deb29.js
#	ee/tabby-webserver/ui/_next/static/chunks/app/layout-6661c62a6a0742fa.js
#	ee/tabby-webserver/ui/_next/static/chunks/webpack-1456af27b04dd17f.js
#	ee/tabby-webserver/ui/_next/static/chunks/webpack-3e396986a653b40a.js
#	ee/tabby-webserver/ui/_next/static/chunks/webpack-543c81bccbb8217e.js
#	ee/tabby-webserver/ui/_next/static/css/40752a0bacee63c0.css
#	ee/tabby-webserver/ui/_next/static/css/724bf7d939e05f2e.css
#	ee/tabby-webserver/ui/_next/static/css/98dd5f72c9a55366.css
#	ee/tabby-webserver/ui/_next/static/f19OhW5W2SP9IXexWzvSm/_buildManifest.js
#	ee/tabby-webserver/ui/_next/static/f19OhW5W2SP9IXexWzvSm/_ssgManifest.js
#	ee/tabby-webserver/ui/api.html
#	ee/tabby-webserver/ui/api.txt
#	ee/tabby-webserver/ui/auth/signin.html
#	ee/tabby-webserver/ui/auth/signin.txt
#	ee/tabby-webserver/ui/auth/signup.html
#	ee/tabby-webserver/ui/auth/signup.txt
#	ee/tabby-webserver/ui/cluster.html
#	ee/tabby-webserver/ui/cluster.txt
#	ee/tabby-webserver/ui/index.html
#	ee/tabby-webserver/ui/index.txt
#	ee/tabby-webserver/ui/playground.html
#	ee/tabby-webserver/ui/playground.txt
#	ee/tabby-webserver/ui/team.html
#	ee/tabby-webserver/ui/team.txt
#	website/docs/faq.mdx
@cromefire cromefire changed the title feat: Add support for telemetry and web UI display for devices other than CUDA feat: Add support for devices other than CUDA in telemetry and web UI Jan 2, 2024
@cromefire cromefire changed the title feat: Add support for devices other than CUDA in telemetry and web UI feat: Add Support for Devices other than CUDA in Telemetry and Web UI Jan 2, 2024
@JumpLink

JumpLink commented Jan 5, 2024

Super nice! When will the PR be merged?

@wsxiaoys wsxiaoys closed this Oct 18, 2024
Successfully merging this pull request may close these issues.

AMD GPUs (ROCm) support
5 participants