Description
We are encountering an issue with the Triton Inference Server's in-process Python API where the metrics port (default: 8002) does not open. This results in a 'connection refused' error when attempting to access localhost:8002/metrics. We would appreciate guidance on how to properly enable the metrics port using the in-process Python API.
Triton Version
2.42.0
Steps to reproduce the behavior
1. Initialize the Triton Inference Server using the in-process Python API with the following code snippet:

```python
import tritonserver

# Initialize and start the Triton server
model_repository = ["/mount/data/models"]
triton_server = tritonserver.Server(
    model_repository=model_repository,
    model_control_mode=tritonserver.ModelControlMode.EXPLICIT,
)
triton_server.start(wait_until_ready=True)
```

2. Attempt to access the metrics endpoint at localhost:8002/metrics.
3. Observe the 'connection refused' error.
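Steps 2 and 3 can be reproduced from a shell on the same host (this assumes the server from step 1 is running and that nothing else is bound to port 8002):

```shell
# With only the in-process Python API running, no HTTP frontend
# is listening on 8002, so the probe fails:
curl -sf http://localhost:8002/metrics || echo "connection refused"
```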
Expected behavior
The metrics port should be accessible and provide metrics data when the Triton Inference Server is started using the in-process Python API.
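For comparison, the standalone tritonserver binary serves this endpoint out of the box; the behavior we expect from the in-process API is what these CLI flags provide (invocation shown for reference, using the model repository path from the workaround below):

```shell
tritonserver \
    --model-repository=/mount/data/models \
    --model-control-mode=explicit \
    --allow-metrics=true \
    --metrics-port=8002
```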
Temporary Workaround
As a temporary solution, we have started an HTTP server manually to serve the metrics endpoint:
```python
import threading

import tritonserver
import uvicorn
from fastapi import FastAPI
from starlette.responses import Response

# Initialize and start the Triton server
triton_server = tritonserver.Server(
    model_repository=["/mount/data/models"],
    model_control_mode=tritonserver.ModelControlMode.EXPLICIT,
)
triton_server.start(wait_until_ready=True)
triton_server.load("clip")
model = triton_server.model("clip")

# Set up a FastAPI application to serve metrics
app = FastAPI()

@app.get("/metrics")
def get_metrics():
    output = triton_server.metrics()
    return Response(output, media_type="text/plain")

# Run the FastAPI app in a separate thread so it does not block
def run():
    uvicorn.run(app, host="0.0.0.0", port=8002)

server_thread = threading.Thread(target=run, daemon=True)
server_thread.start()
```
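If pulling in FastAPI and uvicorn just for one endpoint feels heavy, the same workaround can be sketched with only the standard library. This is a minimal sketch, assuming `triton_server.metrics()` returns the Prometheus text exposition as a string; `serve_metrics` is a helper name we introduce here, not part of the tritonserver API:

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def serve_metrics(triton_server, port=8002):
    """Serve triton_server.metrics() at /metrics on a background thread."""

    class MetricsHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/metrics":
                self.send_error(404)
                return
            # Assumption: metrics() returns the Prometheus text format.
            body = triton_server.metrics().encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    httpd = HTTPServer(("0.0.0.0", port), MetricsHandler)
    threading.Thread(target=httpd.serve_forever, daemon=True).start()
    return httpd
```

The daemon thread means the metrics server will not keep the process alive on shutdown; call `httpd.shutdown()` for an orderly stop.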
We would prefer to use the built-in functionality for serving metrics and avoid maintaining this workaround. Any suggestions or solutions would be greatly appreciated.