What happened?
Starting recently, I keep getting rate limit errors when using models like Gemini 2.0 Flash, even though I should be below the rate limit based on the number of requests I'm initiating. This was previously working fine. I am using LiteLLM via PaperQA. There also seems to be an async issue, but it was not previously causing a rate limit error, so I'm not sure whether that's related. I have tried a number of ways to avoid hitting the rate limit, but so far none have worked, so any assistance with this would be greatly appreciated.
https://github.com/Future-House/paper-qa
I've also seen the message below when using gpt-4o, but in that case everything still works without issue:
AFC is enabled with max remote calls: 10.
Below is how I am setting up the Settings object; the subsequent query is what then hits the rate limit error.
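For illustration, a minimal sketch of this kind of setup (assuming paper-qa's top-level `Settings`/`ask` API and LiteLLM's `gemini/` model prefix; the rate-limit strings follow the format from the paper-qa README, and the values here are placeholders rather than my actual configuration):

```python
from paperqa import Settings, ask

# Hypothetical sketch of a Settings object routed through LiteLLM to Gemini 2.0 Flash.
# The rate_limit entries are a client-side throttle so paper-qa stays under the
# provider's quota; the "1000000 per 1 minute" values are illustrative placeholders.
settings = Settings(
    llm="gemini/gemini-2.0-flash",
    summary_llm="gemini/gemini-2.0-flash",
    llm_config={"rate_limit": {"gemini/gemini-2.0-flash": "1000000 per 1 minute"}},
    summary_llm_config={"rate_limit": {"gemini/gemini-2.0-flash": "1000000 per 1 minute"}},
)

# A query like this is where the 429 / rate limit error surfaces.
answer = ask("What are the main findings?", settings=settings)
```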
Relevant log output
Are you an ML Ops Team?
Yes
What LiteLLM version are you on?
1.45.0
Twitter / LinkedIn details
No response