
Add a normal llama.cpp server endpoint option. #338

Open
lastrosade opened this issue Mar 7, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@lastrosade

Adding a llama.cpp server endpoint option would let us use features already present in llama.cpp without having to rely on llama-cpp-python.

The llama.cpp server supports both HIP and Vulkan on Windows.
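
For context, a minimal sketch of what hitting the server's native completion route could look like, assuming a llama.cpp server (HIP, Vulkan, or any other build) is already running on localhost:8080. The `/completion` route and field names follow the llama.cpp server README, but should be checked against the version in use:

```python
# Sketch: query the native llama.cpp server completion endpoint.
# Assumes a server started separately (e.g. a HIP or Vulkan build)
# and listening on localhost:8080. Field names follow the llama.cpp
# server README and may differ between versions.
import requests

resp = requests.post(
    "http://localhost:8080/completion",
    json={
        "prompt": "Building a website can be done in 10 simple steps:",
        "n_predict": 128,   # max number of tokens to generate
        "temperature": 0.7,
        "top_k": 40,
        "min_p": 0.05,      # samplers exposed natively by llama.cpp
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["content"])
```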

@lbeurerkellner added the enhancement label Mar 7, 2024
@lastrosade (Author) commented Mar 7, 2024

Note that the llama.cpp server endpoint is OpenAI-compatible, so it would probably be sufficient to reuse the OpenAI endpoint code without any model/API key requirements (see the sketch below). Maybe add a way to specify samplers like min_p, top_k, and temp. Though, this would make it impossible to specify a prompt template, and the server would use ChatML by default.
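
As a rough sketch of what that reuse could look like, the standard `openai` Python client can be pointed at a llama.cpp server, assuming it is serving the OpenAI-compatible API at http://localhost:8080/v1. The placeholder API key, the dummy model name, and the `extra_body` pass-through for llama.cpp-specific samplers are assumptions to verify against the server version in use:

```python
# Sketch: reuse OpenAI-style client code against llama.cpp's
# OpenAI-compatible endpoint. Assumes a server at localhost:8080.
# The API key is a placeholder (llama.cpp does not check it), and
# whether top_k/min_p are honored as extra JSON body fields depends
# on the llama.cpp version.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="no-key-required",  # placeholder, ignored by the server
)

completion = client.chat.completions.create(
    model="local",  # ignored by a single-model llama.cpp server
    messages=[{"role": "user", "content": "Say hello."}],
    temperature=0.7,
    extra_body={"top_k": 40, "min_p": 0.05},  # llama.cpp samplers
)
print(completion.choices[0].message.content)
```

This also illustrates the caveat above: the chat route applies the server-side prompt template (ChatML unless the server is configured otherwise), so the client cannot supply its own template through this interface.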
