-
With few exceptions, most supported LLMs have a 4K context window. Local LLMs at least open the door to 16K, 32K, and more.
-
Hello @jonny7737, OpenAI provides 16K context window models, and claude-2 has a 100K context window. Moreover, you can use any open-source HuggingFace and Ollama models with autollm.
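
For instance, here is a minimal sketch of swapping in a 16K-context OpenAI model, assuming the llama_index OpenAI wrapper and the AutoServiceContext / AutoQueryEngine pattern shown later in this thread (an OPENAI_API_KEY must be set in the environment):

from llama_index.llms import OpenAI
from autollm import AutoServiceContext, AutoVectorStoreIndex, AutoQueryEngine

# gpt-3.5-turbo-16k gives a 16K context window; other OpenAI chat models plug in the same way
llm = OpenAI(model="gpt-3.5-turbo-16k", max_tokens=256)

service_context = AutoServiceContext.from_defaults(llm=llm)
vector_store_index = AutoVectorStoreIndex.from_defaults()
query_engine = AutoQueryEngine.from_instances(vector_store_index, service_context)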
-
I was not aware OpenAI and Claude 2 could be hosted locally. I stand corrected.
-
They can't. The only thing you would be hosting locally is the database that these models talk to, or the model itself in the case of a local LLM.
-
They don't work locally, but they do have a larger context window. You can use HuggingFace and Ollama models as local alternatives with autollm.
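
As a rough sketch (assuming a locally running Ollama server with a pulled model, and the llama_index Ollama wrapper), an Ollama model drops into the same pattern:

from llama_index.llms import Ollama

# "llama2" is only an example; use any model already pulled with `ollama pull`
llm = Ollama(model="llama2")
# then pass it to AutoServiceContext.from_defaults(llm=llm) exactly as in the HuggingFaceLLM example later in this thread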
-
Is there a way to use local (offline) models from HuggingFace without using Ollama (which is currently not available for Windows)?
-
Yes @igoralvarezz, it is currently supported. You can create an instance of HuggingFaceLLM and use it in the autollm pipeline as:

import torch
from llama_index.llms import HuggingFaceLLM
from autollm import AutoServiceContext, AutoVectorStoreIndex, AutoQueryEngine

# Load a local HuggingFace model (downloaded once, then runs fully offline)
llm = HuggingFaceLLM(
    context_window=4096,
    max_new_tokens=256,
    query_wrapper_prompt="<|USER|>{query_str}<|ASSISTANT|>",  # StableLM chat prompt format
    tokenizer_name="StabilityAI/stablelm-tuned-alpha-3b",
    model_name="StabilityAI/stablelm-tuned-alpha-3b",
    device_map="auto",
    stopping_ids=[50278, 50279, 50277, 1, 0],
)

# Plug the local LLM into the autollm pipeline
service_context = AutoServiceContext.from_defaults(llm=llm)
vector_store_index = AutoVectorStoreIndex.from_defaults()
query_engine = AutoQueryEngine.from_instances(vector_store_index, service_context)
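
For completeness, a small usage sketch (assuming documents have already been ingested into the vector store index; the question string is only illustrative):

# Query the engine like any llama_index query engine
response = query_engine.query("What models does this project support?")
print(response)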