Intel Arc / XPU support #631

Closed
itlackey opened this issue Oct 25, 2023 · 21 comments
@itlackey

It would be great to be able to run Tabby locally on my Intel Arc GPU.

Additional context

This is currently possible in tools like llama.cpp by compiling with OpenCL support. I have no idea how that would (or could) translate to Rust.


Please reply with a 👍 if you want this feature.

itlackey added the enhancement (New feature or request) label on Oct 25, 2023
@itlackey
Author

If I am understanding this code correctly (crates/llama-cpp-bindings/build.rs), I believe we just need a switch for OpenCL to enable Intel GPU support.

If OpenCL is selected, then add the build args as described in this section of the llama.cpp docs:
https://github.com/ggerganov/llama.cpp#clblast
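
For illustration, a minimal sketch of what such a switch could look like, assuming the bindings drive the llama.cpp CMake build through the `cmake` crate and the backend is gated behind a hypothetical `opencl` cargo feature. Only the `LLAMA_CLBLAST` flag comes from the linked docs; everything else here is an assumption, not the actual Tabby build script.

```rust
// Hypothetical build.rs sketch: gate llama.cpp's CLBlast backend behind an
// assumed `opencl` cargo feature.
use cmake::Config;

fn main() {
    let mut config = Config::new("llama.cpp");

    // Cargo exposes enabled features to build scripts as environment variables.
    if std::env::var("CARGO_FEATURE_OPENCL").is_ok() {
        // LLAMA_CLBLAST is the flag documented in the llama.cpp CLBlast section.
        config.define("LLAMA_CLBLAST", "ON");
        // CLBlast builds link against CLBlast and the OpenCL ICD loader.
        println!("cargo:rustc-link-lib=clblast");
        println!("cargo:rustc-link-lib=OpenCL");
    }

    let dst = config.build();
    println!("cargo:rustc-link-search=native={}/lib", dst.display());
    println!("cargo:rustc-link-lib=static=llama");
}
```

With a setup like that, the backend would be opted into via `cargo build --features opencl` after installing the CLBlast and OpenCL development packages.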

@wsxiaoys
Member

Hi @itlackey, unfortunately, I don't have an Intel Arc card to try out. If anyone has a card and is interested in giving it a try, please feel free to do so! Happy to help if any problems arise.

@itlackey
Author

I have a card, but no Rust experience and only a basic understanding of the underlying C++ libraries. I could try adjusting the code, but I'm not entirely sure what the full list of changes would be. Do you know whether any additional changes would be needed beyond altering the build args in build.rs?

@wsxiaoys
Member

I think the first step would be following the instructions here: https://github.com/TabbyML/tabby#-contributing to get it building in your local dev environment. Then you could tune the build flags in llama-cpp-bindings' build.rs a bit to make it compile with OpenCL support.

@itlackey
Author

Sounds reasonable, I will give it a try as soon as I get a chance.

@cromefire
Contributor

cromefire commented Nov 25, 2023

I'm trying to get this working (more specifically Intel iGPU support and also ROCm support), but after compiling it I just get a "501 Error: Not Implemented" (in Docker). No errors during the build, though. Any idea what went wrong?

I won't be using OpenCL, but Intel MKL and hipBLAS instead, as they seem like the better fit. I'm pretty sure there are still issues, but if I can't even hit the server, I can't test it.

@cromefire
Contributor

Put it all in a pull request here: #895

@itlackey
Author

Nice work!! I put this on the back burner after I couldn't get decent performance using OpenCL with llama.cpp, but it looks like you've found a better approach. Thanks for pushing this forward!

@cromefire
Contributor

Well, I still have to get it to work; right now every configuration of Tabby I try just returns HTTP 501.

@itlackey
Author

Does llama.cpp work on the GPU with these compiler options? If not, get llama.cpp working as expected first and then port that to the Tabby build settings. Skimming through the changes to Tabby, it seems like you're on the right track.

@hungle-i3

Hello everyone, I am working on supporting Intel architectures by integrating the Intel oneAPI platform into Tabby. Just wanted to know whether there is any update on this. I'd love to contribute to getting it done. Thanks.

@cromefire
Contributor

cromefire commented Feb 8, 2024

Upstream (llama.cpp) has to support it first. As soon as it does, I already have things prepared. Alternatively, if it lands faster, Vulkan compute could also be used, but it's the same deal: it has to be merged upstream first.

I haven't followed those PRs though, so if one of them has been merged, tell me and I'll get it done, as soon as Tabby's fork of llama.cpp has been updated, of course.

@itlackey
Author

itlackey commented Feb 8, 2024

I have not spent time on it in a while. I did see SYCL is now supported in llama.cpp and works well on Intel.

@cromefire
Contributor

I have not spent time on it in a while. I did see SYCL is now supported in llama.cpp and works well on Intel.

Then I should probably get to it once I find a sliver of time; I pretty much have SYCL prepared already.

@wsxiaoys has the llama.cpp fork already been updated?

@wsxiaoys
Member

wsxiaoys commented Feb 8, 2024

It was updated in the recent release (0.8): https://github.com/TabbyML/llama.cpp

@hungle-i3

hungle-i3 commented Feb 17, 2024

Hi @wsxiaoys, the llama.cpp fork bound to the Tabby releases (0.8/0.9) hasn't been updated with SYCL support yet.
At the moment, for Intel architectures, I am planning to support the following two features, as recommended by llama.cpp: https://github.com/ggerganov/llama.cpp/blob/master/README-sycl.md

Is it good to go?
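
For reference, a hedged sketch of what a SYCL switch in llama-cpp-bindings' build.rs could look like, mirroring the CMake flags documented in llama.cpp's README-sycl.md at the time; the `sycl` cargo feature name and the use of the `cmake` crate are assumptions, not the project's actual build script.

```rust
// Hypothetical build.rs sketch: gate llama.cpp's SYCL backend behind an
// assumed `sycl` cargo feature, using the flags from README-sycl.md.
use cmake::Config;

fn main() {
    let mut config = Config::new("llama.cpp");

    if std::env::var("CARGO_FEATURE_SYCL").is_ok() {
        config
            .define("LLAMA_SYCL", "ON")
            // The SYCL backend is built with Intel's oneAPI DPC++ compilers.
            .define("CMAKE_C_COMPILER", "icx")
            .define("CMAKE_CXX_COMPILER", "icpx");
    }

    // Link-search and link-lib directives omitted for brevity.
    config.build();
}
```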

@cromefire
Contributor

onemkl: For intel CPU https://www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl.html#gs.4lpxoo.

Not sure whether that's even worth it, as I think the default CPU build already uses AVX and so on.

@hungle-i3

@cromefire, I will enable Intel BLAS (Intel10_64lp) under the onemkl feature, following Intel's guideline: https://www.intel.com/content/www/us/en/content-details/791610/optimizing-and-running-llama2-on-intel-cpu.html

@cromefire
Contributor

@cromefire, I will enable Intel BLAS (Intel10_64lp) under the onemkl feature, following Intel's guideline: https://www.intel.com/content/www/us/en/content-details/791610/optimizing-and-running-llama2-on-intel-cpu.html

I'd definitely still suggest trying it first to see whether it even helps with anything, and if it does, checking for regressions, because otherwise it might be easier to just use it by default rather than adding it as a "backend". Two CPU backends would be kind of confusing...
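
For illustration, a hedged sketch of what that onemkl switch might look like, using the generic BLAS options llama.cpp's CMake build exposed at the time together with CMake FindBLAS's `Intel10_64lp` vendor string; the `onemkl` cargo feature name and the use of the `cmake` crate are assumptions.

```rust
// Hypothetical build.rs sketch: enable llama.cpp's generic BLAS path with
// CMake's Intel10_64lp vendor (Intel oneMKL, LP64), behind an assumed
// `onemkl` cargo feature.
use cmake::Config;

fn main() {
    let mut config = Config::new("llama.cpp");

    if std::env::var("CARGO_FEATURE_ONEMKL").is_ok() {
        config
            .define("LLAMA_BLAS", "ON")
            .define("LLAMA_BLAS_VENDOR", "Intel10_64lp");
    }

    // Link-search and link-lib directives omitted for brevity.
    config.build();
}
```

As noted above, it would be worth benchmarking this against the default AVX build before committing to a separate CPU backend.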

hungle-i3 added a commit to i3automation/tabby that referenced this issue Feb 17, 2024
    * Support new feature: openapi
    * Change compiler to Intel llvm when compiling llama.cpp
    * Support Intel BLAS (Intel10_64lp)
@hungle-i3

Thanks @cromefire for your suggestion.
Pull request at #1474

@wsxiaoys
Member

Closing, as Vulkan support is preferred for such use cases.
