Pull requests: ggerganov/llama.cpp

Pull requests list

cmake : fix typo in AMDGPU_TARGETS
#7356 opened May 17, 2024 by Engininja2
examples: cache hf model when --model not provided
#7353 opened May 17, 2024 by amirzia
CUDA: deduplicate FlashAttention code
#7352 opened May 17, 2024 by JohannesGaessler
Add StableLM2 pre-tokenizer
#7349 opened May 17, 2024 by aahouzi
server: add test for token probs
#7347 opened May 17, 2024 by JohannesGaessler
Another threadpool: Avoid creating hundreds of threads in GGML (labels: performance, review complexity : medium)
#7342 opened May 17, 2024 by besnardjb
android : use "ci-android" branch for CI (labels: android, devops, review complexity : low)
#7341 opened May 17, 2024 by ggerganov
github-actions-labeler: initial commit (labels: devops, review complexity : medium)
#7330 opened May 16, 2024 by mofosyne
add Viking tokenizer support (labels: model, python, review complexity : low)
#7329 opened May 16, 2024 by jonabur
Viking-7B tokenizer support (labels: model, python, review complexity : low)
#7328 opened May 16, 2024 by akx Draft
Fixed painfully slow single process builds. (labels: build, need feedback, performance)
#7326 opened May 16, 2024 by jboero
Add support for larger Granite Code Models (20B, 34B) (labels: model, review complexity : medium)
#7324 opened May 16, 2024 by sroecker
[SYCL] Update SYCL upscale operation (labels: generation quality, review complexity : medium, SYCL)
#7321 opened May 16, 2024 by AidanBeltonS
sched : support async weight copy (labels: performance, review complexity : medium)
#7315 opened May 15, 2024 by slaren Draft
ggml : fix quants nans when all the group weights are very close to zero (labels: bugfix, review complexity : medium)
#7313 opened May 15, 2024 by slaren
Add phi-2 tokenizer (labels: model, review complexity : medium)
#7300 opened May 15, 2024 by BramVanroy
Capture CUDA logging output (labels: enhancement, Nvidia GPU, review complexity : medium)
#7298 opened May 15, 2024 by fraxy-v
avoid to get prompt in infill mode and embedding mode (labels: examples, review complexity : low, server)
#7286 opened May 14, 2024 by woodx9 Draft
common: free ctx_gguf when exiting llama_control_vector_load_one (labels: bugfix, review complexity : low)
#7285 opened May 14, 2024 by stevegrubb
ggml-opencl, llama: using reserve() if count already known (labels: refactoring, review complexity : high)
#7272 opened May 14, 2024 by GermanAizek Draft
common, ngram_cache: added const reference for std::pair<> and std::tuple<> more 16 bytes: (labels: refactoring, review complexity : low)
#7270 opened May 14, 2024 by GermanAizek Draft
ggml, ngram-cache, log: added const and const ref for function params (labels: refactoring, review complexity : medium)
#7269 opened May 14, 2024 by GermanAizek
ggml llama: align structs for memory optimization on 64-bit platforms (labels: refactoring, review complexity : medium)
#7267 opened May 13, 2024 by GermanAizek