Detect finish_reason that is not stop
#261
Also related: retry logic based on the error encountered. E.g., do we want to retry on content_filter? On hitting max tokens instead of a stop token? Etc. Retry (throw an exception) when a None message is returned. For retrying we can use https://tenacity.readthedocs.io/en/latest/
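A minimal sketch of the retry idea, assuming an OpenAI-style response object that exposes `choices[0].finish_reason` and `choices[0].message.content`. It uses a plain loop rather than tenacity so that it stays self-contained. `RetryableCompletionError`, `check_completion`, `complete_with_retries`, and `fake_response` are hypothetical names, and treating every empty message as retryable is an assumption that only fits plain-text completions.

```python
from types import SimpleNamespace


class RetryableCompletionError(Exception):
    """Raised when a completion came back empty and should be retried (hypothetical)."""


def check_completion(response):
    """Return the message text, or raise if the message or its content is None."""
    choice = response.choices[0]
    # getattr handles both a missing message (None) and a message with content=None.
    content = getattr(choice.message, "content", None)
    if content is None:
        raise RetryableCompletionError(
            f"empty message (finish_reason={choice.finish_reason!r})"
        )
    return content


def complete_with_retries(call_model, max_attempts=3):
    """Call `call_model()` up to `max_attempts` times until it yields text."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for _ in range(max_attempts):
        try:
            return check_completion(call_model())
        except RetryableCompletionError as err:
            last_error = err
    # Every attempt was empty: surface the last failure instead of ignoring it.
    raise last_error


def fake_response(finish_reason, content):
    """Stand-in for a real model response, for illustration only."""
    message = SimpleNamespace(content=content)
    return SimpleNamespace(
        choices=[SimpleNamespace(finish_reason=finish_reason, message=message)]
    )


# Two filtered responses followed by a normal one.
responses = iter([
    fake_response("content_filter", None),
    fake_response("content_filter", None),
    fake_response("stop", "hello"),
])
result = complete_with_retries(lambda: next(responses), max_attempts=3)
```

With tenacity, the loop could presumably be replaced by a retry decorator configured to retry on this exception type; the check itself would stay the same.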
With litellm it can be accessed like this:
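The original snippet did not survive in this thread. As a hedged sketch, assuming the value being accessed is the finish reason and that litellm returns responses in the OpenAI format, it sits on each choice. `get_finish_reason` is a hypothetical helper, and a stand-in object replaces the real litellm call so the example runs without network access.

```python
from types import SimpleNamespace


def get_finish_reason(response, index=0):
    """Return the finish reason of one choice in an OpenAI-format response."""
    return response.choices[index].finish_reason


# With litellm installed, `response` would come from something like:
#   response = litellm.completion(model="<model name>", messages=[...])
# Here a stand-in object with the same attribute layout is used instead.
response = SimpleNamespace(
    choices=[SimpleNamespace(finish_reason="content_filter", message=None)]
)
print(get_finish_reason(response))  # prints: content_filter
```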
litellm provides a map from all the provider finish reasons to the standard OpenAI set. Do they process all output with this? Yes, it seems so, based on running the script below on the dataset of refusals and comparing the results to those obtained with Google's genai Python SDK.
So what was actually going on is RECITATION, which litellm maps to content_filter (in the function I linked above). You can see the RECITATION as the
Short term, I don't think there is actually any solution to turn off the RECITATION content filter. Others are also seeing this filter randomly on prompts that seem innocuous:
They claim to have fixed this:
But I'm still seeing the recitation blocking behave poorly. This issue lists some other discussions and resources. The official docs currently say
More discussion: People are suggesting asking for more than one completion per request, but that is unnecessarily expensive. We will just retry a number of times until we (hopefully) get a response that is not content filtered.
It might be a good idea to add this as part of the
OpenAI finish reason options:
Since Anthropic's response_format responds with finish reason tool_calls, we can't just retry whenever it is != 'stop'. Fixed in
Also did a PR into litellm to update the function to the latest types and make it clearer.
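One way to express "not just != 'stop'" is an explicit policy table, sketched below. It assumes the OpenAI-style values `stop`, `tool_calls`, `content_filter`, and `length`. The names `Action`, `ACCEPTED`, `RETRYABLE`, and `classify_finish_reason` are hypothetical, and which reasons are retried is a policy assumption rather than what the fix in this repository necessarily does.

```python
from enum import Enum


class Action(Enum):
    ACCEPT = "accept"  # use the response as-is
    RETRY = "retry"    # call the model again
    FAIL = "fail"      # raise to the caller; never silently ignore


# Assumed policy: tool_calls is a normal ending for structured output,
# content_filter is worth retrying, everything else is surfaced as an error.
ACCEPTED = frozenset({"stop", "tool_calls"})
RETRYABLE = frozenset({"content_filter"})


def classify_finish_reason(finish_reason):
    """Map a finish reason to the action the caller should take."""
    if finish_reason in ACCEPTED:
        return Action.ACCEPT
    if finish_reason in RETRYABLE:
        return Action.RETRY
    # Covers "length", None, and any value this table does not know about.
    return Action.FAIL
```

Defaulting unknown values to `FAIL` keeps a new or unexpected finish reason from being ignored.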
Gemini returns None messages when content_filter is the finish reason.
Other finish reasons, such as hitting the maximum length, should be detected and dealt with. The default behavior should not be to ignore them.