Switch to vLLM #58

svenseeberg · 2024-10-09T07:39:45Z

Ollama has limitations, for example in which models are available and how they are loaded. We may want to switch to vLLM.

svenseeberg · 2024-10-30T13:00:52Z

However, we need to find models that fit into the graphics card's memory.
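A rough way to shortlist candidates is to estimate the weight footprint from parameter count and precision and compare it to the card's memory. The sketch below is only a back-of-the-envelope check, not a measurement: the function names, the fixed 2 GiB reserve for KV cache and activations, and the example sizes are all assumptions chosen for illustration. The 0.9 factor mirrors vLLM's default `gpu_memory_utilization`; actual usage depends on context length, batch size, and the model architecture.

```python
def estimate_weight_memory_gib(num_params_billion: float, bytes_per_param: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return num_params_billion * 1e9 * bytes_per_param / 2**30


def fits_on_gpu(
    num_params_billion: float,
    bytes_per_param: float,
    gpu_memory_gib: float,
    utilization: float = 0.9,           # share of GPU memory the server may claim
    kv_cache_reserve_gib: float = 2.0,  # assumed headroom for KV cache and activations
) -> bool:
    """Return True if weights plus the assumed reserve fit into the usable budget."""
    budget = gpu_memory_gib * utilization
    needed = estimate_weight_memory_gib(num_params_billion, bytes_per_param) + kv_cache_reserve_gib
    return needed <= budget


if __name__ == "__main__":
    # Hypothetical candidates: (label, parameters in billions, bytes per parameter)
    candidates = [
        ("7B fp16", 7, 2.0),
        ("7B 4-bit", 7, 0.5),
        ("70B fp16", 70, 2.0),
    ]
    for label, params, width in candidates:
        size = estimate_weight_memory_gib(params, width)
        print(f"{label}: ~{size:.1f} GiB weights, fits on 16 GiB card: {fits_on_gpu(params, width, 16)}")
```

Under these assumptions, a quantized model leaves far more room for the KV cache than the same model in fp16, so quantized variants are probably the first thing to try on smaller cards.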

svenseeberg added this to the v3 Basic Answer Retrieval milestone Oct 9, 2024

svenseeberg self-assigned this Oct 9, 2024

svenseeberg added component:chat Chat Back End enhancement New feature or request labels Oct 9, 2024

svenseeberg mentioned this issue Oct 9, 2024

Evaluate Translation model performance #50

Open

svenseeberg changed the title ~~Swtich to vLLM~~ Switch to vLLM Oct 16, 2024

svenseeberg modified the milestones: v3 Basic Answer Retrieval, v3.2 Improve LLM perfomance Nov 13, 2024
