[Feature] Add Ollama support #1036
Conversation
```python
from guidance import models

# All of these load the same model; an optional tag (a size or "latest")
# can follow the colon, matching Ollama's naming scheme.
ollama = models.Ollama('phi3.5')
ollama = models.Ollama('phi3.5:3.8b')
ollama = models.Ollama('phi3.5:latest')
```
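Once constructed, the model should compose with Guidance operations as usual. A minimal usage sketch (the prompt text here is purely illustrative):

```python
from guidance import gen

lm = ollama + "The capital of France is " + gen(max_tokens=5)
print(lm)
```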
Hi @microdev1, this is a great start on Ollama support in Guidance. Thanks for your contribution! I tested the `Ollama` class and was able to run a model with it, so I can confirm the basic functionality works. However, some additional work is needed around chat templates: without the proper template, Guidance walks off the end of a role and keeps generating text beyond the `<|end|>` token. Ollama stores the chat template in the modelfile, and for Phi-3 it looks like this:
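A Go template along these lines (reconstructed from Ollama's published Phi-3 modelfile, so treat the exact whitespace and field names as approximate):

```
{{ if .System }}<|system|>
{{ .System }}<|end|>
{{ end }}{{ if .Prompt }}<|user|>
{{ .Prompt }}<|end|>
{{ end }}<|assistant|>
{{ .Response }}<|end|>
```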
In contrast to Ollama, Guidance uses a Jinja-style template like this:
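A sketch of such a Jinja template for Phi-3 (simplified from the Hugging Face version, so details are approximate):

```jinja
{% for message in messages %}
{{ '<|' + message['role'] + '|>\n' + message['content'] + '<|end|>\n' }}
{%- endfor %}
{%- if add_generation_prompt %}{{ '<|assistant|>\n' }}{% endif %}
```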
Guidance also has classes wrapping the templates like this:
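Roughly speaking, small classes that map role names to the start/end strings above. The sketch below captures the shape, though the names and details may differ from the actual Guidance source:

```python
class ChatTemplate:
    """Base class: maps chat roles to their start/end strings."""

    def get_role_start(self, role_name: str) -> str:
        raise NotImplementedError

    def get_role_end(self, role_name: str) -> str:
        raise NotImplementedError


class Phi3MiniChatTemplate(ChatTemplate):
    """Phi-3 wraps every role as <|role|> ... <|end|>."""

    def get_role_start(self, role_name: str) -> str:
        return f"<|{role_name}|>\n"

    def get_role_end(self, role_name: str) -> str:
        return "<|end|>\n"
```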
Ideally, when an Ollama model is loaded, the proper chat template would be loaded automatically as well. If you're curious, the chat template code in the Guidance repo is worth a look. All that being said, I think your implementation would technically work as long as someone provides the appropriate chat template string.
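Concretely, that workaround might look like the following, assuming the `Ollama` constructor forwards a `chat_template` keyword to the underlying engine (a hypothetical signature this PR would need to support):

```python
from guidance import models

# Hypothetical keyword: forward the template to the underlying engine.
ollama = models.Ollama('phi3.5', chat_template=Phi3MiniChatTemplate)
```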
Codecov Report
```diff
@@            Coverage Diff             @@
##             main    #1036      +/-   ##
==========================================
- Coverage   70.25%   61.43%    -8.83%
==========================================
  Files          62       63       +1
  Lines        4472     4494      +22
==========================================
- Hits         3142     2761     -381
- Misses       1330     1733     +403
```
View full report in Codecov by Sentry.
Adds a thin wrapper for models pulled using Ollama. Gets the local model path from the provided model name, then instantiates the LlamaCppEngine with it and the other args.
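For readers following along, the path resolution presumably walks Ollama's on-disk layout. A minimal sketch, assuming the default store under `~/.ollama/models` with manifests and sha256-named blobs (the helper name is this sketch's own, not necessarily the PR's):

```python
import json
from pathlib import Path

OLLAMA_MODELS = Path.home() / ".ollama" / "models"

def ollama_model_path(name: str) -> Path:
    """Resolve an Ollama model name (e.g. 'phi3.5:latest') to its GGUF blob."""
    model, _, tag = name.partition(":")
    tag = tag or "latest"
    manifest_file = (OLLAMA_MODELS / "manifests" / "registry.ollama.ai"
                     / "library" / model / tag)
    manifest = json.loads(manifest_file.read_text())
    # The model weights are the layer with the GGUF model media type.
    for layer in manifest["layers"]:
        if layer["mediaType"] == "application/vnd.ollama.image.model":
            digest = layer["digest"].replace(":", "-")  # 'sha256:x' -> 'sha256-x'
            return OLLAMA_MODELS / "blobs" / digest
    raise FileNotFoundError(f"no model layer found for {name}")
```

With that path in hand, the wrapper can construct the LlamaCppEngine the same way `models.LlamaCpp` does.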