Budget Management - Limit token sizes #40
Hi @kmesiab, I'm the maintainer of LiteLLM (https://github.com/BerriAI/litellm). We allow you to do cost tracking for 100+ LLMs.

Usage docs: https://docs.litellm.ai/docs/#calculate-costs-usage-latency

```python
from litellm import completion, completion_cost
import os

os.environ["OPENAI_API_KEY"] = "your-api-key"

# Make a completion call, then compute its dollar cost from the response.
response = completion(
    model="gpt-3.5-turbo",
    messages=[{"content": "Hello, how are you?", "role": "user"}]
)

cost = completion_cost(completion_response=response)
print("Cost for completion call with gpt-3.5-turbo: ", f"${float(cost):.10f}")
```

We also let you create a self-hosted, OpenAI-compatible proxy server to make your LLM calls (100+ LLMs) and track costs and token usage. I hope this is helpful; if not, I'd love your feedback on what we can improve.
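The cost figure above comes from simple per-token pricing. A minimal, self-contained sketch of that arithmetic (the per-token prices below are illustrative assumptions, not current OpenAI rates):

```python
# Illustrative sketch of per-token cost arithmetic:
# cost = prompt_tokens * input_price + completion_tokens * output_price.
# Prices here are assumptions for demonstration, not current rates.
PRICES = {
    "gpt-3.5-turbo": {"input": 0.50 / 1_000_000, "output": 1.50 / 1_000_000},
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated USD cost of one completion call."""
    p = PRICES[model]
    return prompt_tokens * p["input"] + completion_tokens * p["output"]

cost = estimate_cost("gpt-3.5-turbo", prompt_tokens=12, completion_tokens=20)
print(f"${cost:.10f}")
```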
This looks great @ishaan-jaff, I will give it a go!
Some diffs can be quite large. Sometimes those large diffs include changes that are not relevant to a code review, like dependency updates.
Sending these diffs to OpenAI will consume credits unnecessarily.
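One way to sketch this "relief valve": strip per-file hunks that rarely need review (lockfiles, vendored dependencies) from the diff before sending it to OpenAI. The `IGNORED` patterns and hunk-detection logic below are illustrative assumptions, not the project's actual implementation:

```python
# Hypothetical sketch: drop ignored files' hunks from a unified git diff
# before it is sent for review. Patterns are illustrative assumptions.
import fnmatch

IGNORED = ["package-lock.json", "yarn.lock", "go.sum", "vendor/*", "*.min.js"]

def filter_diff(diff: str) -> str:
    """Remove per-file sections whose path matches an ignored pattern."""
    kept, skip = [], False
    for line in diff.splitlines():
        if line.startswith("diff --git"):
            # "diff --git a/<path> b/<path>" -> take the b/ path.
            path = line.split(" b/")[-1]
            skip = any(fnmatch.fnmatch(path, pat) for pat in IGNORED)
        if not skip:
            kept.append(line)
    return "\n".join(kept)
```

This keeps the reviewable source changes while sparing the token budget spent on generated files.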
Implement a relief valve and budget manager for token usage.
The ability to set a token limit.
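A minimal sketch of such a token limit: estimate the prompt's token count and refuse the call when it exceeds the budget. The 4-characters-per-token heuristic, the default limit, and all names here are illustrative assumptions; an exact implementation could count tokens with a tokenizer such as tiktoken instead:

```python
# Hypothetical budget guard: reject prompts whose estimated token count
# exceeds a configured limit, before any OpenAI credits are spent.

class TokenBudgetExceeded(Exception):
    pass

def estimate_tokens(text: str) -> int:
    """Rough estimate: ~4 characters per token for English text (heuristic)."""
    return max(1, len(text) // 4)

def enforce_budget(prompt: str, limit: int = 4000) -> str:
    """Return the prompt unchanged, or raise if it blows the token budget."""
    tokens = estimate_tokens(prompt)
    if tokens > limit:
        raise TokenBudgetExceeded(f"{tokens} tokens exceeds budget of {limit}")
    return prompt
```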
This paper from Microsoft introduces a strategy for prompt compression with minimal loss.
https://arxiv.org/abs/2310.05736