Docs for cache behaviour #342
I'm trying to understand how/when LLM calls get cached, especially when using the OpenAI API. I've looked in the docs, but can't find details.

Ideally, in development, I'd like to be able to cache/memoize calls to the API. For example, suppose one runs an LMQL program that requests multiple completions, then changes the later part of the program while leaving the early part unchanged. In that case, it seems like the early requests to the API could be served from a cache. This is especially true when passing a `seed`, which is now supported by the API.

Comment:

This is how caching is actually implemented with OpenAI. However, with […]. To enable caching across multiple calls of the same request, make sure to pass a […]
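For reference, a development-time memoization along the lines the issue describes might look like the sketch below. This is a minimal illustration, not LMQL's actual cache implementation: it assumes the `openai>=1.0` Python client, and the `cached_completion` helper and `.llm_cache` directory are illustrative names, not part of any library.

```python
import hashlib
import json
from pathlib import Path

from openai import OpenAI  # assumes openai>=1.0 and OPENAI_API_KEY set

client = OpenAI()
CACHE_DIR = Path(".llm_cache")  # hypothetical on-disk cache location
CACHE_DIR.mkdir(exist_ok=True)


def cached_completion(**params):
    """Memoize chat completions on the full request payload.

    Keying on every request parameter (model, messages, seed, ...)
    means the unchanged early requests of a program hit the cache,
    while any edited later request misses and goes to the API.
    """
    key = hashlib.sha256(
        json.dumps(params, sort_keys=True).encode()
    ).hexdigest()
    cache_file = CACHE_DIR / f"{key}.json"
    if cache_file.exists():
        # Replay the stored response instead of calling the API again.
        return json.loads(cache_file.read_text())
    resp = client.chat.completions.create(**params)
    data = resp.model_dump()  # openai>=1.0 responses are pydantic models
    cache_file.write_text(json.dumps(data))
    return data


result = cached_completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Say hello."}],
    seed=1234,  # fixed seed keeps repeated runs comparable
)
```

One caveat: even with a fixed `seed`, OpenAI documents sampling as only best-effort deterministic, so a replayed cached response and a fresh API call are not guaranteed to be identical.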