feat: Add support Intel #895
Conversation
Okay, got the 501 under control by adding options for OneAPI and ROCm to it; now I just gotta test it (it runs at least, but I'm not sure whether it's using the GPU), and I somehow need to get the ROCm build under control.
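Purely as a sketch of what "options for OneAPI and ROCm" might look like on the build side: a minimal `build.rs` that toggles llama.cpp CMake flags based on Cargo features. The feature names (`oneapi`, `rocm`), the CMake option names, and the use of the `cmake` crate are assumptions for illustration, not necessarily what this PR does.

```rust
// build.rs — hypothetical sketch, not the actual Tabby build script.
// Assumes the `cmake` build-dependency and Cargo features named `oneapi` and `rocm`.
use std::env;

fn main() {
    let mut config = cmake::Config::new("llama.cpp");

    // Cargo exposes enabled features to build scripts as CARGO_FEATURE_* env vars.
    if env::var("CARGO_FEATURE_ONEAPI").is_ok() {
        // CMake option name is an assumption; llama.cpp's backend flags have changed over time.
        config.define("LLAMA_SYCL", "ON");
    } else if env::var("CARGO_FEATURE_ROCM").is_ok() {
        config.define("LLAMA_HIPBLAS", "ON");
    }

    // Build llama.cpp and tell rustc where to find the resulting static library.
    let dst = config.build();
    println!("cargo:rustc-link-search=native={}", dst.join("lib").display());
    println!("cargo:rustc-link-lib=static=llama");
}
```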
So in theory the Intel container should work now... but for some reason it just doesn't want to use the GPU...
Is there something like the NVIDIA Container Toolkit that needs to be installed for oneAPI?
Nope, you just pass the device through.
So the ROCm image is now definitely working. Edit: The big model works as well, though not as snappy (although the difference isn't too bad? That all definitely needs more investigation). If it works, it works, but especially when switching models the GPU kind of hangs and I need to reboot. It's also always reported at 100% usage.
Great! You might consider extracting ROCm as an individual PR for review to get it checked in.
Yeah, I'll see tomorrow whether I can get a handle on oneAPI or whether I'll postpone that and extract ROCm, but it's 4:30 AM for me, so I really need to do that tomorrow (technically later today). Also, I really hate C and its library linking nonsense... that cost me so much time with this...
Also, note to my future self: I need to figure out what's happening with the cuda_devices list and the Frontend, and match that for ROCm and oneAPI if possible.
@wsxiaoys it would also be great if the IntelliJ extension were available for RustRover; then I could write Tabby code using Tabby. It's probably just a setting or so.
The AMD stuff is "moved" to #902, because it already works pretty okay. Regarding the Intel stuff, I'm slowly going insane, as I already had it "working", but it just doesn't want to actually offload anything to the GPU and most of the time doesn't reference SYCL at all. @wsxiaoys, could we get llama.cpp as a shared library or so? That sounds way easier.
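For context on the shared-library idea: a build script that links against a prebuilt libllama instead of compiling llama.cpp in-tree could look roughly like the sketch below. The `LLAMA_LIB_DIR` variable and the library name are assumptions for illustration, not an existing Tabby or llama.cpp convention.

```rust
// build.rs — hypothetical sketch of linking a prebuilt shared llama.cpp library.
// LLAMA_LIB_DIR is an assumed env var pointing at the directory containing libllama.so.
use std::env;

fn main() {
    let lib_dir = env::var("LLAMA_LIB_DIR")
        .expect("set LLAMA_LIB_DIR to the directory containing libllama.so");

    // Tell rustc where to find the shared library and link it dynamically.
    println!("cargo:rustc-link-search=native={lib_dir}");
    println!("cargo:rustc-link-lib=dylib=llama");

    // Re-run the build script if the variable changes.
    println!("cargo:rerun-if-env-changed=LLAMA_LIB_DIR");
}
```

The appeal is that the GPU backend (SYCL, HIP, CUDA, ...) would then be baked into that prebuilt library rather than into Tabby's own build.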
To confirm: you've been able to make llama.cpp itself work on Intel Arc, but not for Tabby, correct?
Hi, @cromefire
llama.cpp currently does not use SYCL, and the OpenCL implementation uses the CPU for most of the processing. I took a run at this a while back and got it working, but insanely slowly. I have since found out this is a known issue with llama.cpp. There is currently a PR to get SYCL working correctly, and there is also a Vulkan support PR. Unfortunately, without one of these being merged into llama.cpp, Intel dGPUs are going to be very slow.
Well, that explains it... I'll update and wait then... Vulkan of course also sounds awesome, if it's pretty close to CUDA/HIP/SYCL, because then it seems like it should be the standard backend for something like TabbyML, since it'd run everywhere.
It would be great! I am hoping one or the other gets merged soon. BTW, here are some things I put together to test llama.cpp on Arc. The logs show the current speeds I am getting.
I do think both would be good: Vulkan makes a nice and easy default backend, but SYCL might be faster. Vulkan is BTW also an easy solution for AMD on Windows.
Have you tried SYCL vs. Vulkan vs. OpenCL by any chance (if they actually already run)? Because it sounds like OpenCL is pretty useless right now. Also, how did you test; are there any benchmarks available? It would be cool for users, even of something higher-level like Tabby, to know what works best.
I have not, but I hope to try the Vulkan fork this week. It seems like that branch is more complete than the SYCL one.
Be sure to report back (also how you did the tests); I'd really like to test Vulkan vs. ROCm on AMD as well (as ROCm doesn't work on Windows yet).
# Conflicts:
#	crates/llama-cpp-bindings/Cargo.toml
#	crates/llama-cpp-bindings/build.rs
#	crates/tabby/Cargo.toml
#	crates/tabby/src/main.rs
Closing, as Vulkan support will be released in 0.10.
Realized via Intel MKL (oneAPI)
Fixes #631 (Intel)
Depends on #902