Each of the currently supported LLM providers, such as OpenAI, Mistral, and Azure, has its own specific configuration and implementation. For instance, in the case of OpenAI, the model is specified in the request body, and the rate-limiting parameters (prompt tokens, completion tokens, and total tokens) are also carried in the body.
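For reference, a minimal sketch in Go (struct names are illustrative, field tags follow the public OpenAI chat-completions schema) showing that the model lives in the request body and the token counts used for rate limiting live in the response body's usage object:

```go
// Illustrative structs; names are hypothetical, JSON tags mirror the
// public OpenAI chat-completions request/response bodies.
package main

// The model is selected inside the request body, not via a header or path.
type ChatCompletionRequest struct {
	Model    string    `json:"model"`
	Messages []Message `json:"messages"`
}

type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// Token counts used for rate limiting arrive in the response body.
type ChatCompletionResponse struct {
	Model string `json:"model"`
	Usage Usage  `json:"usage"`
}

type Usage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}
```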
The current approach handles these provider-specific configurations and logic (e.g., transformations) inside translators and external processors (extproc). While this works, it tightly couples the implementation to a predefined set of providers and may limit flexibility for supporting custom LLMs.
To make the implementation more flexible and extensible, can we explore generalizing the approach by storing provider-specific and rate-limiting information directly within the LLMRoute resource? And if we do, does the resource then effectively become an LLM Provider rather than an LLM Route?
Key considerations could be:
Generalization: How can LLMRoute support diverse providers (and different models) and rate-limiting use cases while remaining extensible?
Rate-limiting: Should LLMRoute include fields for prompt tokens, completion tokens, and total tokens, or should these remain provider-specific? (A rough sketch follows below.)
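To make the question concrete, here is a hypothetical sketch (Go CRD types; all names are invented for discussion and do not reflect the current API) of what a generalized spec could look like if provider-specific schema details and token-based rate-limit fields were lifted into the resource itself:

```go
// Hypothetical CRD types -- names are illustrative only.
package v1alpha1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// LLMRouteSpec sketches a provider-agnostic shape: each backend declares
// the API schema it speaks, and rate limiting is expressed as generic
// token budgets rather than OpenAI-specific fields.
type LLMRouteSpec struct {
	// Backends lists the upstream LLM providers this route can target.
	Backends []LLMBackend `json:"backends"`
}

type LLMBackend struct {
	// Name identifies the backend (e.g. "openai", "mistral", or a custom LLM).
	Name string `json:"name"`
	// Schema declares the request/response schema of the backend, so
	// translators/extproc can be selected generically instead of
	// hard-coding per-provider logic.
	Schema APISchema `json:"schema"`
	// RateLimits expresses token-based limits in a provider-neutral way.
	RateLimits []TokenRateLimit `json:"rateLimits,omitempty"`
}

type APISchema struct {
	// Name of the API schema, e.g. "OpenAI" or "Custom".
	Name string `json:"name"`
	// Version of the schema, e.g. "v1".
	Version string `json:"version,omitempty"`
}

// TokenRateLimit counts one class of tokens against a budget per window.
type TokenRateLimit struct {
	// Type is one of "prompt", "completion", or "total".
	Type string `json:"type"`
	// Limit is the number of tokens allowed per window.
	Limit uint64 `json:"limit"`
	// Window is the refill interval.
	Window metav1.Duration `json:"window"`
}
```

Under a shape like this, the extproc would only need to know how to extract token usage for a declared schema, not for each named provider, which is one possible answer to the coupling concern above.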
Looking forward to thoughts and suggestions.