BentoML is an open-source framework for high-performance ML model serving.
- Prepare the model
$ cd yolov5
$ python bento_ml.py
- Run the BentoML service
$ bentoml serve service.py:svc
- Send a request to the service
$ curl -X POST -H "Content-Type: text/plain" --data 'SAMPLE IMG URI' http://localhost:3000/predict
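The same request can be sent from Python. This is a minimal sketch using only the standard library; "SAMPLE IMG URI" is the same placeholder as in the curl command, and the host/port match the defaults above.

```python
# Minimal client for the BentoML /predict endpoint, mirroring the curl
# command above. Uses only the Python standard library.
import urllib.request


def build_predict_request(image_uri: str,
                          host: str = "http://localhost:3000") -> urllib.request.Request:
    """Build the same POST request that the curl command sends."""
    return urllib.request.Request(
        url=f"{host}/predict",
        data=image_uri.encode("utf-8"),          # raw body, like curl --data
        headers={"Content-Type": "text/plain"},  # matches the -H flag above
        method="POST",
    )


req = build_predict_request("SAMPLE IMG URI")  # placeholder image URI
# urllib.request.urlopen(req)  # uncomment once the service is running
```

The request is only constructed here; uncomment the last line to actually send it to a running service.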
Before building the Docker image, make sure you have the pretrained model in the correct directory, the bentoml CLI installed, and the bentofile.yaml in the same directory as the service.py file.
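The exact bentofile.yaml depends on this repo, but a representative sketch looks like the following (the service entry must match the service.py:svc target used above; the package list is an assumption for a PyTorch YOLOv5 service):

```yaml
# bentofile.yaml (sketch; adjust includes and packages to the actual repo)
service: "service.py:svc"   # must match the target passed to `bentoml serve`
include:
  - "*.py"                  # source files to bundle into the Bento
python:
  packages:                 # assumed dependencies for a YOLOv5 service
    - torch
    - torchvision
```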
$ bentoml build
Once the build succeeds, you will see output like the following:
Successfully built Bento(tag="pytorch_yolo_demo:dczcz4fppglvaiva").
Then, containerize the Bento and run the resulting Docker image with the following commands:
$ bentoml containerize pytorch_yolo_demo:dczcz4fppglvaiva
# > Successfully built docker image "pytorch_yolo_demo:dczcz4fppglvaiva"
$ docker run --gpus all -p 3000:3000 pytorch_yolo_demo:dczcz4fppglvaiva