Add support for /health and /info endpoints for TGI
#1819
Comments
Let's see if there's more user demand, maybe.
@Wauplin, this came up in a conversation with a customer recently and I think it would be great to support this. If I understand correctly, via /info users of serverless API endpoints could check which TGI version/sha an LLM like llama-3/dbrx/command-r is running on. That's quite important for debugging and for understanding which recent TGI features are supported (e.g. whether guidance/function calling finally works, or whether the model runs on an old, incompatible TGI version). At the moment, I'm trying to get guidance to work for llama-3, for example, and I'm not sure whether users can know which TGI version it is running on. Based on this internal conversation, the TGI version/sha is only available in a private HF repo. It would be great to let users query this information via our library.
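To make the request concrete, here is a minimal sketch of what querying these endpoints directly could look like today, without any library support. It assumes a TGI server reachable at a base URL you control (the `http://localhost:8080` below is a placeholder) and that the `/info` payload exposes `model_id`, `version` and `sha` keys. The helper names are hypothetical and not part of `InferenceClient`.

```python
import json
import urllib.request


def fetch_tgi_info(base_url: str, timeout: float = 10.0) -> dict:
    """GET {base_url}/info and return the decoded JSON payload."""
    url = f"{base_url.rstrip('/')}/info"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def is_tgi_healthy(base_url: str, timeout: float = 10.0) -> bool:
    """Return True if GET {base_url}/health answers with HTTP 200."""
    url = f"{base_url.rstrip('/')}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection errors, timeouts and non-2xx HTTP responses.
        return False


def describe_tgi_version(info: dict) -> str:
    """Summarize the fields of interest; the key names are assumptions."""
    model = info.get("model_id") or "unknown"
    version = info.get("version") or "unknown"
    sha = info.get("sha") or "unknown"
    return f"{model}: TGI {version} (sha {sha})"


if __name__ == "__main__":
    base_url = "http://localhost:8080"  # placeholder, replace with your endpoint
    if is_tgi_healthy(base_url):
        print(describe_tgi_version(fetch_tgi_info(base_url)))
    else:
        print("endpoint is not healthy or not reachable")
```

Whether the serverless API exposes these routes for a given model is exactly the open question in this issue, so treat the sketch as an illustration of the desired behavior rather than something guaranteed to work there.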
Another example of why it's very useful for users to know the exact TGI version an endpoint is running on: (internal conversation)
Originally from @thomwolf on slack (private)
Docs:
/health
/info
Though I'm not sure yet how/where to integrate those in InferenceClient. For now, let's see if there's more demand for it (=> anyone landing on this issue who is interested, please let us know!)