CUDA is really slow compared to kobold.cpp I compiled from AUR, same backend #87

Arisa-Snowbell · 2024-05-08T09:45:25Z

All other backends run at the speed I expect, but CUDA is so slow it's like it's running on the oldest Pentium. Maybe wrong compile flags or something?

Arisa-Snowbell · 2024-05-25T03:44:38Z

nvcc --forward-unknown-to-host-compiler -use_fast_math -Wno-deprecated-gpu-targets -arch=all -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_MMQ_Y=64 -I. -I./common -I./include -I./include/CL -I./otherarch -I./otherarch/tools -I./otherarch/sdcpp -I./otherarch/sdcpp/thirdparty -I./include/vulkan -O3 -DNDEBUG -std=c++11 -fPIC -DLOG_DISABLE_LOGS -D_GNU_SOURCE -DGGML_USE_LLAMAFILE -pthread -s -Wno-multichar -Wno-write-strings -Wno-deprecated -Wno-deprecated-declarations -pthread -DGGML_USE_CUDA -DSD_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/opt/cuda/targets/x86_64-linux/include -Wno-pedantic -c ggml-cuda/acc.cu -o ggml-cuda/acc.o
These are the flags kobold.cpp compiles with.
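
One way to test the compile-flag suspicion is to diff the command above against the one the slow build uses. Here is a minimal sketch in Python, assuming both command lines can be captured from verbose build output. The helper names `flag_set` and `diff_flags` are made up for illustration, and the second command is a placeholder, not the actual flags of any build.

```python
import shlex


def flag_set(command):
    """Return the set of option tokens (anything starting with '-') in a command line."""
    return {tok for tok in shlex.split(command) if tok.startswith("-")}


def diff_flags(cmd_a, cmd_b):
    """Return (flags only in cmd_a, flags only in cmd_b), each sorted."""
    a, b = flag_set(cmd_a), flag_set(cmd_b)
    return sorted(a - b), sorted(b - a)


if __name__ == "__main__":
    # Shortened excerpt of the kobold.cpp command quoted above.
    kobold = "nvcc -use_fast_math -arch=all -DGGML_CUDA_MMQ_Y=64 -O3 -DNDEBUG -c acc.cu -o acc.o"
    # Hypothetical command from the slow build; replace with the real one.
    other = "nvcc -arch=native -DGGML_CUDA_MMQ_Y=64 -O0 -c acc.cu -o acc.o"

    only_kobold, only_other = diff_flags(kobold, other)
    print("only in kobold.cpp build:", only_kobold)
    print("only in other build:", only_other)
```

This only compares tokens that start with `-`, so an option whose value is a separate word (such as `-I ./include`) is compared by the option name alone. A missing `-O3`, `-DNDEBUG`, or `-use_fast_math` in the slow build would be the first thing to look for.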
