
[Feature] Add Ollama support #1036

Open · wants to merge 1 commit into main

Conversation

@microdev1

Adds a thin wrapper for models pulled with Ollama: it resolves the local model path from the provided model name, then instantiates the LlamaCppEngine with that path and the remaining arguments.
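
Roughly, the wrapper amounts to something like the sketch below. This is not the code in this diff; the ~/.ollama/models manifest layout and the idea of subclassing models.LlamaCpp are assumptions made for illustration.

import json
from pathlib import Path

from guidance.models import LlamaCpp


def _local_model_path(name: str) -> Path:
    """Resolve the GGUF blob for a model pulled with `ollama pull`.

    Assumes Ollama's default storage layout: a JSON manifest under
    ~/.ollama/models/manifests/... whose model layer points at a blob
    file named after its sha256 digest.
    """
    model, _, tag = name.partition(":")
    root = Path.home() / ".ollama" / "models"
    manifest = (root / "manifests" / "registry.ollama.ai" / "library"
                / model / (tag or "latest"))
    layers = json.loads(manifest.read_text())["layers"]
    # The weights layer carries this media type in Ollama manifests.
    digest = next(
        layer["digest"]
        for layer in layers
        if layer["mediaType"] == "application/vnd.ollama.image.model"
    )
    # Digest "sha256:<hex>" maps to blob file "sha256-<hex>".
    return root / "blobs" / digest.replace(":", "-")


class Ollama(LlamaCpp):
    """Thin wrapper: resolve the local path, then defer to LlamaCpp."""

    def __init__(self, model: str, **kwargs):
        super().__init__(str(_local_model_path(model)), **kwargs)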

@microdev1 (Author) commented:

from guidance import models

ollama = models.Ollama('phi3.5')
ollama = models.Ollama('phi3.5:3.8b')
ollama = models.Ollama('phi3.5:latest')

...

@nking-1 (Contributor) commented Sep 30, 2024:

Hi @microdev1, this is a great start on Ollama support in Guidance. Thanks for your contribution!

I tested the Ollama class and was able to run a model using it, so I can confirm the basic functionality is working. However, there is some additional work needed regarding chat templates. Without the proper template, Guidance walks off the end of a role and continues generating text beyond the <|end|> token.

Ollama stores the chat template in the modelfile (viewable with ollama show --modelfile), and it looks like this for Phi-3:

TEMPLATE "{{ if .System }}<|system|>
{{ .System }}<|end|>
{{ end }}{{ if .Prompt }}<|user|>
{{ .Prompt }}<|end|>
{{ end }}<|assistant|>
{{ .Response }}<|end|>"
PARAMETER stop <|end|>
PARAMETER stop <|user|>
PARAMETER stop <|assistant|>

In contrast to Ollama, Guidance uses a Jinja-style template like this:

phi3_mini_template = "{{ bos_token }}{% for message in messages %}{{'<|' + message['role'] + '|>' + '\n' + message['content'] + '<|end|>\n' }}{% endfor %}{% if add_generation_prompt %}{{ '<|assistant|>\n' }}{% else %}{{ eos_token }}{% endif %}"
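
As a quick sanity check, rendering that template with plain jinja2 (the message list below is just an illustration) produces the expected Phi-3 layout:

from jinja2 import Template

prompt = Template(phi3_mini_template).render(
    bos_token="<s>",
    eos_token="<|end|>",
    messages=[{"role": "user", "content": "Hello!"}],
    add_generation_prompt=True,
)
print(prompt)
# <s><|user|>
# Hello!<|end|>
# <|assistant|>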

Guidance also has classes wrapping these templates (in guidance/chat.py), like this:

class Phi3MiniChatTemplate(ChatTemplate):
    # available_roles = ["user", "assistant"]
    template_str = phi3_mini_template

    def get_role_start(self, role_name):
        if role_name == "user":
            return "<|user|>\n"
        elif role_name == "assistant":
            return "<|assistant|>\n"
        elif role_name == "system":
            return "<|system|>\n"
        else:
            raise UnsupportedRoleException(role_name, self)

    def get_role_end(self, role_name=None):
        return "<|end|>\n"

Ideally, when an Ollama model is loaded, the proper chat template would be loaded automatically as well. If you're curious, the chat template code is in guidance/chat.py. We're still discussing how to improve it to support Ollama and make it easier for the community to add templates for open-source models. You're welcome to make suggestions or take a shot at implementing something for chat templates with Ollama.
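
For example, one possible direction is a small factory that builds a ChatTemplate subclass from role markers. This is purely a sketch: the make_chat_template helper is hypothetical, and actually parsing the role markers out of an Ollama TEMPLATE block is the open problem.

from guidance.chat import ChatTemplate, UnsupportedRoleException


def make_chat_template(jinja_template, role_starts, role_end):
    """Build a ChatTemplate subclass from role markers, e.g. markers
    extracted from an Ollama modelfile's TEMPLATE block."""

    class OllamaChatTemplate(ChatTemplate):
        template_str = jinja_template

        def get_role_start(self, role_name):
            try:
                return role_starts[role_name]
            except KeyError:
                raise UnsupportedRoleException(role_name, self)

        def get_role_end(self, role_name=None):
            return role_end

    return OllamaChatTemplate


# Wired up by hand for the phi-3 template shown above:
Phi3FromOllama = make_chat_template(
    phi3_mini_template,
    role_starts={"system": "<|system|>\n", "user": "<|user|>\n",
                 "assistant": "<|assistant|>\n"},
    role_end="<|end|>\n",
)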

All that being said, I think your implementation would technically work as long as someone provides the appropriate chat template string with the chat_template parameter in the constructor. We should be able to use this as a starting point for the next steps.

@codecov-commenter commented Oct 8, 2024:

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 50.00000% with 11 lines in your changes missing coverage. Please review.

Project coverage is 61.43%. Comparing base (6eb08f4) to head (24bcf91).
Report is 14 commits behind head on main.

Files with missing lines       Patch %   Lines
guidance/models/_ollama.py     47.61%    11 Missing ⚠️


❗ There is a different number of reports uploaded between BASE (6eb08f4) and HEAD (24bcf91): HEAD has 56 fewer uploads than BASE.

Flag   BASE (6eb08f4)   HEAD (24bcf91)
       124              68
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1036      +/-   ##
==========================================
- Coverage   70.25%   61.43%   -8.83%     
==========================================
  Files          62       63       +1     
  Lines        4472     4494      +22     
==========================================
- Hits         3142     2761     -381     
- Misses       1330     1733     +403     

☔ View full report in Codecov by Sentry.

@xruifan mentioned this pull request Nov 11, 2024.

Successfully merging this pull request may close these issues.

ollama support?
3 participants