vLLM can do this. #114
Closed
entrepeneur4lyf started this conversation in Results
Replies: 1 comment
-
Thanks for sharing this!
-
You may want to look at vLLM for inspiration since it supports distributed GPU inference.
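For context, a minimal sketch of what the comment is pointing at: vLLM can shard a single model across multiple GPUs via tensor parallelism when launching its OpenAI-compatible server. The model name here is illustrative, and this assumes the stated number of GPUs is visible on the host.

```shell
# Launch vLLM's OpenAI-compatible server, sharding the model
# across 4 GPUs with tensor parallelism (illustrative model name;
# assumes 4 CUDA devices are available on this machine).
vllm serve meta-llama/Llama-2-13b-hf --tensor-parallel-size 4
```

vLLM also supports pipeline parallelism across nodes, but tensor parallelism within a node is the common starting point for distributed inference.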