Issues: vllm-project/vllm
[Usage]: What happens if a served LoRA module is incompatible with the main model? (usage) · #12106 opened Jan 16, 2025 by yxchng
[Performance]: Question about TTFT for ngram speculative decoding (performance) · #12101 opened Jan 16, 2025 by ynwang007
[Bug]: Discrepancies in the Llama layer forward function between meta-llama, transformers, and vLLM (bug) · #12099 opened Jan 15, 2025 by mcubuktepe
[Bug]: Corrupted responses for Llama-3.2-3B-Instruct with v0.6.6.post1 (bug) · #12096 opened Jan 15, 2025 by bsatzger
[RFC]: BatchLLM for better shared-prefix utilization in offline scenarios (RFC) · #12080 opened Jan 15, 2025 by xinji1
[Usage]: Automatic Prefix Cache life cycle (usage) · #12077 opened Jan 15, 2025 by hyuenmin-choi
[Usage]: Will vLLM support LoRA for classification models? (usage) · #12075 opened Jan 15, 2025 by lullabies777
[New Model]: Support MiniMax-01 (new model) · #12073 opened Jan 15, 2025 by liyawei87
[Installation][build][docker]: ROCm Dockerfile pinned to stale Python torch nightly wheel builds (installation) · #12066 opened Jan 15, 2025 by cob-web-corner
[Feature]: When will vLLM support predicted outputs? (feature request) · #12061 opened Jan 15, 2025 by zhufeizzz
[Bug]: Memory profiler does not consider CUDA context memory (bug) · #12059 opened Jan 14, 2025 by benchislett
[Usage]: Running Tensor Parallel on TPUs on Ray Cluster (ray, usage) · #12058 opened Jan 14, 2025 by BabyChouSr
[Usage]: Issues related to the model Meta-Llama-3.1-70B-Instruct (usage) · #12056 opened Jan 14, 2025 by karimhussain10
[Bug]: Drop use of pickle where possible (bug) · #12055 opened Jan 14, 2025 by russellb
[New Model]: nomic-ai/nomic-embed-text-v1 (new model) · #12054 opened Jan 14, 2025 by Fmstrat
[Usage]: How to run a cluster without Docker (usage) · #12053 opened Jan 14, 2025 by Eutenacity
[Bug]: PaliGemma2 not working with OpenAI Docker serve (bug) · #12052 opened Jan 14, 2025 by IngLP
[Usage]: meta-llama/Llama-3.1-8B-Instruct with the vLLM class is very slow in comparison to other models (usage) · #12047 opened Jan 14, 2025 by JulianOestreich90
[Misc]: For disaggregated prefill with multiple decode instances, drop_select might not be enough (misc) · #12039 opened Jan 14, 2025 by spacewander
[Usage]: Failed to serve local model in distributed inference (usage) · #12035 opened Jan 14, 2025 by kerthcet
[Bug]: Profiling on vLLM server hangs when --num-scheduler-steps > 1 (bug) · #12032 opened Jan 14, 2025 by Jacob0226
[Bug]: torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 224.00 MiB. GPU 0 has a total capacity of 11.53 GiB of which 187.75 MiB is free. Including non-PyTorch memory, this process has 11.34 GiB memory in use. (bug) · #12030 opened Jan 14, 2025 by bisontim
[Bug]: Another example of structured output that xgrammar does not support (bug) · #12028 opened Jan 14, 2025 by hustxiayang
[Bug]: Server crashes when glm4-9b-chat receives an image request (bug) · #12024 opened Jan 14, 2025 by liuyanyi