update serving runtime table
alexagriffith committed Jan 10, 2023
1 parent 9b8996b commit 47b7399
Showing 2 changed files with 27 additions and 14 deletions.
7 changes: 4 additions & 3 deletions docs/modelserving/data_plane/v2_protocol.md
@@ -1,10 +1,11 @@
## Open Inference Protocol (V2 Inference Protocol)

**For an inference server to be compliant with this protocol the server must implement the health, metadata, and inference V2 APIs**. Optional features that are explicitly noted are not required. A compliant inference server may choose to implement the [HTTP/REST API](#httprest) and/or the [GRPC API](#grpc).
**For an inference server to be compliant with this protocol the server must implement the health, metadata, and inference V2 APIs**.
Optional features that are explicitly noted are not required. A compliant inference server may choose to implement the [HTTP/REST API](#httprest) and/or the [GRPC API](#grpc).

The V2 protocol supports an extension mechanism as a required part of the API, but this document does not propose any specific extensions. Any specific extensions will be proposed separately.
Check the [model serving runtime table](../v1beta1/serving_runtime.md) or the `protocolVersion` field in the [runtime YAML](https://github.com/kserve/kserve/tree/master/config/runtimes) to ensure that the V2 protocol is supported for the model serving runtime that you are using.
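As a quick orientation, the sketch below shows where the protocol declaration lives in a runtime definition. Field names follow the KServe `ServingRuntime` spec; the runtime name and versions shown are illustrative, not a definitive configuration.

```yaml
apiVersion: serving.kserve.io/v1alpha1
kind: ClusterServingRuntime
metadata:
  name: kserve-mlserver        # illustrative runtime name
spec:
  protocolVersions:            # prediction protocols this runtime can serve
    - v2
  supportedModelFormats:
    - name: sklearn
      version: "1"             # major framework version supported
      autoSelect: true
```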

Note: For all API descriptions on this page, all strings in all contexts are case-sensitive.
Note: For all API descriptions on this page, all strings in all contexts are case-sensitive. The V2 protocol supports an extension mechanism as a required part of the API, but this document does not propose any specific extensions. Any specific extensions will be proposed separately.

### Note on changes between V1 & V2

34 changes: 23 additions & 11 deletions docs/modelserving/v1beta1/serving_runtime.md
@@ -21,18 +21,30 @@ After models are deployed with InferenceService, you get all the following serve
- Out-of-the-box metrics
- Ingress/Egress control

| Model Serving Runtime | Exported model | Prediction Protocol | HTTP | gRPC | Versions | Examples |

---

The table below lists the model serving runtimes supported by KServe. The HTTP and gRPC columns indicate the prediction protocol version that the serving runtime supports. The KServe prediction protocol is noted as either "v1" or "v2". Some serving runtimes also support their own prediction protocol; these are noted with an `*`. The default serving runtime version column defines the source and version of the serving runtime: MLServer, KServe, or the runtime's own project. These versions can also be found in the [runtime kustomization YAML](https://github.com/alexagriffith/kserve/blob/master/config/runtimes/kustomization.yaml). All KServe native model serving runtimes use the current KServe release version (v0.10). The supported framework version column lists the **major** version of the model framework that is supported. These can also be found in the respective [runtime YAML](https://github.com/alexagriffith/kserve/tree/master/config/runtimes) under the `supportedModelFormats` field. For model frameworks using the KServe serving runtime, the exact default version can be found in [kserve/python](https://github.com/alexagriffith/kserve/tree/master/python): in a given serving runtime directory, the `setup.py` file pins the model framework version. For example, in [kserve/python/lgbserver](https://github.com/alexagriffith/kserve/tree/master/python/lgbserver) the [setup.py](https://github.com/alexagriffith/kserve/blob/master/python/lgbserver/setup.py) file sets the model framework version to 3.3.2 with `lightgbm == 3.3.2`.

| Model Serving Runtime | Exported model | HTTP | gRPC | Default Serving Runtime Version | Supported Framework (Major) Version(s) | Examples |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |--------------------------------------|
| [Triton Inference Server](https://github.com/triton-inference-server/server) | [TensorFlow,TorchScript,ONNX](https://github.com/triton-inference-server/server/blob/r21.09/docs/model_repository.md)| v2 | :heavy_check_mark: | :heavy_check_mark: | [Compatibility Matrix](https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html)| [Torchscript cifar](triton/torchscript) |
| [TFServing](https://www.tensorflow.org/tfx/guide/serving) | [TensorFlow SavedModel](https://www.tensorflow.org/guide/saved_model) | v1 | :heavy_check_mark: | :heavy_check_mark: | [TFServing Versions](https://github.com/tensorflow/serving/releases) | [TensorFlow flower](./tensorflow) |
| [TorchServe](https://pytorch.org/serve/server.html) | [Eager Model/TorchScript](https://pytorch.org/docs/master/generated/torch.save.html) | v1/v2 REST | :heavy_check_mark: | :heavy_check_mark: | 0.5.3 | [TorchServe mnist](./torchserve) |
| [SKLearn MLServer](https://github.com/SeldonIO/MLServer) | [Pickled Model](https://scikit-learn.org/stable/modules/model_persistence.html) | v2 | :heavy_check_mark: | :heavy_check_mark: | 1.0.1 | [SKLearn Iris V2](./sklearn/v2) |
| [XGBoost MLServer](https://github.com/SeldonIO/MLServer) | [Saved Model](https://xgboost.readthedocs.io/en/latest/tutorials/saving_model.html) | v2 | :heavy_check_mark: | :heavy_check_mark: | 1.5.0 | [XGBoost Iris V2](./xgboost) |
| [SKLearn ModelServer](https://github.com/kserve/kserve/tree/master/python/sklearnserver) | [Pickled Model](https://scikit-learn.org/stable/modules/model_persistence.html) | v1 | :heavy_check_mark: | -- | 1.0.1 | [SKLearn Iris](./sklearn/v2) |
| [XGBoost ModelServer](https://github.com/kserve/kserve/tree/master/python/xgbserver) | [Saved Model](https://xgboost.readthedocs.io/en/latest/tutorials/saving_model.html) | v1 | :heavy_check_mark: | -- | 1.5.0 | [XGBoost Iris](./xgboost) |
| [PMML ModelServer](https://github.com/kserve/kserve/tree/master/python/pmmlserver) | [PMML](http://dmg.org/pmml/v4-4-1/GeneralStructure.html) | v1 | :heavy_check_mark: | -- | [PMML4.4.1](https://github.com/autodeployai/pypmml) | [SKLearn PMML](./pmml) |
| [LightGBM ModelServer](https://github.com/kserve/kserve/tree/master/python/lightgbm) | [Saved LightGBM Model](https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.Booster.html#lightgbm.Booster.save_model) | v1 | :heavy_check_mark: | -- | 3.2.0 | [LightGBM Iris](./lightgbm) |
| [Custom ModelServer](https://github.com/kserve/kserve/tree/master/python/kserve/kserve) | -- | v1 | :heavy_check_mark: | -- | -- | [Custom Model](custom/custom_model) |
| [Custom ModelServer](https://github.com/kserve/kserve/tree/master/python/kserve/kserve) | -- | v1, v2 | v2 | -- | -- | [Custom Model](custom/custom_model) |
| [LightGBM MLServer](https://mlserver.readthedocs.io/en/latest/runtimes/lightgbm.html) | [Saved LightGBM Model](https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.Booster.html#lightgbm.Booster.save_model) | v2 | v2 | v1.0.0 (MLServer) | 3 | [LightGBM Iris V2](./lightgbm) |
| [LightGBM ModelServer](https://github.com/kserve/kserve/tree/master/python/lgbserver) | [Saved LightGBM Model](https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.Booster.html#lightgbm.Booster.save_model) | v1 | -- | v0.10.0 (KServe) | 3 | [LightGBM Iris](./lightgbm) |
| [PMML ModelServer](https://github.com/kserve/kserve/tree/master/python/pmmlserver) | [PMML](http://dmg.org/pmml/v4-4-1/GeneralStructure.html) | v1 | -- | v0.10.0 (KServe) | 3, 4 ([PMML4.4.1](https://github.com/autodeployai/pypmml)) | [SKLearn PMML](./pmml) |
| [SKLearn MLServer](https://github.com/SeldonIO/MLServer) | [Pickled Model](https://scikit-learn.org/stable/modules/model_persistence.html) | v2 | v2 | v1.0.0 (MLServer) | 1 | [SKLearn Iris V2](./sklearn/v2) |
| [SKLearn ModelServer](https://github.com/kserve/kserve/tree/master/python/sklearnserver) | [Pickled Model](https://scikit-learn.org/stable/modules/model_persistence.html) | v1 | -- | v0.10.0 (KServe) | 1 | [SKLearn Iris](./sklearn/v2) |
| [TFServing](https://www.tensorflow.org/tfx/guide/serving) | [TensorFlow SavedModel](https://www.tensorflow.org/guide/saved_model) | v1 | *tensorflow | 2.6.2 ([TFServing Versions](https://github.com/tensorflow/serving/releases)) | 2 | [TensorFlow flower](./tensorflow) |
| [TorchServe](https://pytorch.org/serve/server.html) | [Eager Model/TorchScript](https://pytorch.org/docs/master/generated/torch.save.html) | v1, v2, *torchserve | *torchserve | 0.7.0 (TorchServe) | 1 | [TorchServe mnist](./torchserve) |
| [Triton Inference Server](https://github.com/triton-inference-server/server) | [TensorFlow,TorchScript,ONNX](https://github.com/triton-inference-server/server/blob/r21.09/docs/model_repository.md)| v2 | v2 | 21.09-py3 (Triton) | 8 (TensorRT), 1, 2 (TensorFlow), 1 (PyTorch), 2 (Triton) [Compatibility Matrix](https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html)| [Torchscript cifar](triton/torchscript) |
| [XGBoost MLServer](https://github.com/SeldonIO/MLServer) | [Saved Model](https://xgboost.readthedocs.io/en/latest/tutorials/saving_model.html) | v2 | v2 | v1.0.0 (MLServer) | 1 | [XGBoost Iris V2](./xgboost) |
| [XGBoost ModelServer](https://github.com/kserve/kserve/tree/master/python/xgbserver) | [Saved Model](https://xgboost.readthedocs.io/en/latest/tutorials/saving_model.html) | v1 | -- | v0.10.0 (KServe) | 1 | [XGBoost Iris](./xgboost) |
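To deploy against one of the runtimes above with a particular prediction protocol, the protocol can be requested on the `InferenceService` itself. A minimal sketch, in which the service name and `storageUri` are illustrative:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris           # illustrative name
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      protocolVersion: v2      # request the V2 (Open Inference) protocol
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
```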



*tensorflow - TensorFlow implements its own prediction protocol in addition to KServe's. See the [TensorFlow Serving Prediction API](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/prediction_service.proto) documentation.

*torchserve - TorchServe implements its own prediction protocol in addition to KServe's. See the [TorchServe gRPC API](https://pytorch.org/serve/grpc_api.html#) documentation.

!!! Note
    The model serving runtime version can be overwritten with the `runtimeVersion` field on the InferenceService YAML, and we highly recommend
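The note above refers to the `runtimeVersion` field; a minimal sketch of such an override follows. The service name, version, and `storageUri` are illustrative, and pinning a version this way opts out of the default shipped with the KServe release.

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: torchserve-mnist       # illustrative name
spec:
  predictor:
    model:
      modelFormat:
        name: pytorch
      runtimeVersion: 0.7.0    # overrides the serving runtime's default version
      storageUri: gs://kfserving-examples/models/torchserve/image_classifier/v1
```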
