Support budget/rate limit tiers for keys (#7429)
* feat(proxy/utils.py): get associated litellm budget from db in combined_view for key
  allows user to create rate limit tiers and associate those to keys
* feat(proxy/_types.py): update the value of key-level tpm/rpm/model max budget metrics with the associated budget table values if set
  allows rate limit tiers to be easily applied to keys
* docs(rate_limit_tiers.md): add doc on setting rate limit / budget tiers
  make feature discoverable
* feat(key_management_endpoints.py): return litellm_budget_table value in key generate
  make it easy for user to know associated budget on key creation
* fix(key_management_endpoints.py): document 'budget_id' param in `/key/generate`
* docs(key_management_endpoints.py): document budget_id usage
* refactor(budget_management_endpoints.py): refactor budget endpoints into separate file - makes it easier to run documentation testing against it
* docs(test_api_docs.py): add budget endpoints to ci/cd doc test + add missing param info to docs
* fix(customer_endpoints.py): use new pydantic obj name
* docs(user_management_heirarchy.md): add simple doc explaining teams/keys/org/users on litellm
* Litellm dev 12 26 2024 p2 (#7432)
* (Feat) Add logging for `POST v1/fine_tuning/jobs` (#7426)
* init commit ft jobs logging
* add ft logging
* add logging for FineTuningJob
* simple FT Job create test
* (docs) - show all supported Azure OpenAI endpoints in overview (#7428)
* azure batches
* update doc
* docs azure endpoints
* docs endpoints on azure
* docs azure batches api
* docs azure batches api
* fix(key_management_endpoints.py): fix key update to actually work
* test(test_key_management.py): add e2e test asserting ui key update call works
* fix: proxy/_types - fix linting erros
* test: update test

---------

Co-authored-by: Ishaan Jaff <[email protected]>

* fix: test
* fix(parallel_request_limiter.py): enforce tpm/rpm limits on key from tiers
* fix: fix linting errors
* test: fix test
* fix: remove unused import
* test: update test
* docs(customer_endpoints.py): document new model_max_budget param
* test: specify unique key alias
* docs(budget_management_endpoints.py): document new model_max_budget param
* test: fix test
* test: fix tests

---------

Co-authored-by: Ishaan Jaff <[email protected]>
1 parent 12c4e7e, commit 539f166

Showing 25 changed files with 761 additions and 373 deletions.
@@ -0,0 +1,68 @@
# ✨ Budget / Rate Limit Tiers

Create tiers with different budgets and rate limits, making it easy to manage different users and their usage.

:::info

This is a LiteLLM Enterprise feature.

Get a 7-day free trial and get in touch [here](https://litellm.ai/#trial).

See pricing [here](https://litellm.ai/#pricing).

:::

## 1. Create a budget

```bash
curl -L -X POST 'http://0.0.0.0:4000/budget/new' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
    "budget_id": "my-test-tier",
    "rpm_limit": 0
}'
```

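For callers who script tier creation rather than using curl, the same request can be sketched in Python. This is a minimal sketch using only the standard library: the endpoint, headers, and the `budget_id` / `rpm_limit` fields come from the curl example above, while the helper name `build_budget_request` and its parameters are hypothetical. The request is built but not sent, so it can be inspected without a running proxy.

```python
import json
import urllib.request


def build_budget_request(
    base_url: str, admin_key: str, budget_id: str, rpm_limit: int
) -> urllib.request.Request:
    """Build (but do not send) the POST /budget/new request shown in the curl example.

    The function name and parameters are illustrative, not part of LiteLLM.
    """
    payload = {"budget_id": budget_id, "rpm_limit": rpm_limit}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/budget/new",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {admin_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_budget_request("http://0.0.0.0:4000", "sk-1234", "my-test-tier", 0)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To send it against a running proxy (not executed here):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```
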
## 2. Assign budget to a key

```bash
curl -L -X POST 'http://0.0.0.0:4000/key/generate' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
    "budget_id": "my-test-tier"
}'
```

Expected Response:

```json
{
    "key": "sk-...",
    "budget_id": "my-test-tier",
    "litellm_budget_table": {
        "budget_id": "my-test-tier",
        "rpm_limit": 0
    }
}
```

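To confirm programmatically that the tier was attached, a client can compare the key's `budget_id` with the `litellm_budget_table` entry returned alongside it. The sketch below works on the sample response above (the key value is elided there and stays elided here); `summarize_key_tier` is a hypothetical helper, and only the fields shown in the sample are assumed to exist.

```python
import json
from typing import Any, Dict, Optional

# Sample copied from the expected response above; "sk-..." is elided in the source.
SAMPLE_RESPONSE = """
{
    "key": "sk-...",
    "budget_id": "my-test-tier",
    "litellm_budget_table": {
        "budget_id": "my-test-tier",
        "rpm_limit": 0
    }
}
"""


def summarize_key_tier(response: Dict[str, Any]) -> Dict[str, Optional[Any]]:
    """Extract the tier fields a caller would typically check (illustrative helper)."""
    tier = response.get("litellm_budget_table") or {}
    return {
        "budget_id": response.get("budget_id"),
        "tier_budget_id": tier.get("budget_id"),
        "rpm_limit": tier.get("rpm_limit"),
    }


summary = summarize_key_tier(json.loads(SAMPLE_RESPONSE))
# The key's budget_id should match the budget row returned with it.
assert summary["budget_id"] == summary["tier_budget_id"]
print(summary)
```
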
## 3. Check if budget is enforced on key

```bash
# 👈 Use the KEY from step 2 in the Authorization header.
curl -L -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-...' \
-d '{
    "model": "<REPLACE_WITH_MODEL_NAME_FROM_CONFIG.YAML>",
    "messages": [
      {"role": "user", "content": "hi my email is ishaan"}
    ]
}'
```


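The enforcement check can also be automated. The following is a sketch under one loud assumption: that the proxy rejects a rate-limited request with HTTP 429, the conventional "Too Many Requests" status; confirm this against your own deployment. The helper names `classify_status` and `check_enforcement` are hypothetical, and nothing is sent unless `check_enforcement` is called.

```python
import json
import urllib.error
import urllib.request


def build_chat_request(base_url: str, key: str, model: str, content: str) -> urllib.request.Request:
    """Build the POST /v1/chat/completions request from the curl example."""
    payload = {"model": model, "messages": [{"role": "user", "content": content}]}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {key}", "Content-Type": "application/json"},
        method="POST",
    )


def classify_status(status: int) -> str:
    """Map an HTTP status to a coarse outcome. ASSUMPTION: 429 means rate limited."""
    if status == 429:
        return "rate_limited"
    if 200 <= status < 300:
        return "allowed"
    return "other_error"


def check_enforcement(req: urllib.request.Request) -> str:
    """Send the request and classify the result (requires a running proxy)."""
    try:
        with urllib.request.urlopen(req) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the status is in err.code.
        return classify_status(err.code)


# Classification only; no network call is made here.
print(classify_status(429), classify_status(200), classify_status(500))
```

With the tier from step 1 (`"rpm_limit": 0`), a call such as `check_enforcement(build_chat_request("http://0.0.0.0:4000", key, model, "hi"))` would be expected to return `"rate_limited"` if the assumption about the status code holds.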
## [API Reference](https://litellm-api.up.railway.app/#/budget%20management)

@@ -0,0 +1,13 @@
import Image from '@theme/IdealImage';


# User Management Hierarchy

<Image img={require('../../img/litellm_user_heirarchy.png')} style={{ width: '100%', maxWidth: '4000px' }} />

LiteLLM supports a hierarchy of users, teams, organizations, and budgets.

- Organizations can have multiple teams. [API Reference](https://litellm-api.up.railway.app/#/organization%20management)
- Teams can have multiple users. [API Reference](https://litellm-api.up.railway.app/#/team%20management)
- Users can have multiple keys. [API Reference](https://litellm-api.up.railway.app/#/budget%20management)
- Keys can belong to either a team or a user. [API Reference](https://litellm-api.up.railway.app/#/end-user%20management)
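
The relationships in the list above can be pictured as a small data model. This is only one reading of those four bullets, written as plain dataclasses; the class and field names are hypothetical and do not reflect LiteLLM's actual database schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Key:
    alias: str
    budget_id: Optional[str] = None  # optional tier, as in the rate limit tiers doc


@dataclass
class User:
    user_id: str
    keys: List[Key] = field(default_factory=list)  # users can have multiple keys


@dataclass
class Team:
    team_id: str
    users: List[User] = field(default_factory=list)  # teams can have multiple users
    keys: List[Key] = field(default_factory=list)  # keys owned by the team itself


@dataclass
class Organization:
    org_id: str
    teams: List[Team] = field(default_factory=list)  # orgs can have multiple teams


def all_keys(org: Organization) -> List[Key]:
    """Collect every key reachable from an organization, team keys first."""
    found: List[Key] = []
    for team in org.teams:
        found.extend(team.keys)
        for user in team.users:
            found.extend(user.keys)
    return found


org = Organization(
    "org-1",
    teams=[
        Team(
            "team-a",
            users=[User("user-1", keys=[Key("user-key", budget_id="my-test-tier")])],
            keys=[Key("team-key")],
        )
    ],
)
print([k.alias for k in all_keys(org)])
```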