Automatic tuning for ML model deployment on Kubernetes

lwangbm/morphling

 
 


Morphling

Morphling is an auto-configuration framework for machine learning model serving (inference) on Kubernetes. Check the website for details.

The Morphling paper was accepted at ACM SoCC 2021:
Morphling: Fast, Near-Optimal Auto-Configuration for Cloud-Native Model Serving

Overview

Morphling finds near-optimal configurations for your ML/DL model serving deployments. It searches for the best container-level configurations (e.g., resource allocations and runtime parameters) through empirical trials, in which a small number of configurations are sampled and evaluated for performance.

Features

Key benefits include:

  • Automated tuning workflows hidden behind simple APIs.
  • Out-of-the-box stress-test clients for ML model serving.
  • Cloud-agnostic, and tested on AWS, Alicloud, etc.
  • ML-framework-agnostic, with support for popular frameworks including TensorFlow, PyTorch, etc.
  • Equipped with a variety of customizable hyper-parameter tuning algorithms.

Getting started

Install using YAML files

Install CRDs

From the git root directory, run:

kubectl apply -k config/crd/bases

Install Morphling Components

kubectl create namespace morphling-system

kubectl apply -k manifests/configmap
kubectl apply -k manifests/controllers
kubectl apply -k manifests/pv
kubectl apply -k manifests/mysql-db
kubectl apply -k manifests/db-manager
kubectl apply -k manifests/ui
kubectl apply -k manifests/algorithm

By default, Morphling is installed in the morphling-system namespace.

The official Morphling component images are hosted on Docker Hub.

Check if all components are running successfully:

kubectl get deployment -n morphling-system

Expected output:

NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
morphling-algorithm-server   1/1     1            1           34s
morphling-controller         1/1     1            1           9m23s
morphling-db-manager         1/1     1            1           9m11s
morphling-mysql              1/1     1            1           9m15s
morphling-ui                 1/1     1            1           4m53s
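Instead of re-running the check by hand, the wait can be scripted. The following is a minimal sketch using `kubectl wait`; the function name `wait_for_morphling` and the 300s default timeout are illustrative assumptions, not part of Morphling.

```shell
# Block until every Deployment in the namespace reports the Available condition.
# The function name and the default timeout are hypothetical choices.
wait_for_morphling() {
  namespace="${1:-morphling-system}"
  timeout="${2:-300s}"
  kubectl -n "$namespace" wait deployment --all \
    --for=condition=Available --timeout="$timeout"
}
```

Calling `wait_for_morphling` with no arguments waits on the default morphling-system namespace and returns a non-zero status if the timeout expires first.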

Uninstall Morphling controller

bash script/undeploy.sh

Delete CRDs

kubectl get crd | grep morphling.kubedl.io | cut -d ' ' -f 1 | xargs kubectl delete crd
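A variant of the one-liner above that does not depend on column parsing is sketched below. It relies on `kubectl get crd -o name`, which prints one resource name per line; the function name `delete_morphling_crds` is a hypothetical helper, and the sketch assumes every Morphling CRD name contains morphling.kubedl.io, as the command above implies.

```shell
# Delete every CRD whose name contains morphling.kubedl.io.
# delete_morphling_crds is a hypothetical helper, not a Morphling command.
delete_morphling_crds() {
  # -o name prints entries like customresourcedefinition.apiextensions.k8s.io/<name>
  crds=$(kubectl get crd -o name | grep 'morphling\.kubedl\.io' || true)
  if [ -z "$crds" ]; then
    echo "no Morphling CRDs found"
    return 0
  fi
  # Intentionally unquoted: each CRD name becomes its own argument.
  kubectl delete $crds
}
```

When no matching CRD exists, the function prints a message and exits cleanly rather than invoking `kubectl delete` with no arguments.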

Install using Helm chart

Install Helm

Helm is a package manager for Kubernetes. For example, to install it on macOS:

brew install helm

Check the helm website for more details.

Install Morphling

From the root directory, run

helm install morphling ./helm/morphling --create-namespace -n morphling-system

You can override the default values defined in values.yaml with the --set flag. For example, to set custom CPU and memory resource requests:

helm install morphling ./helm/morphling --create-namespace -n morphling-system  --set resources.requests.cpu=1024m --set resources.requests.memory=2Gi
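When several values are overridden, a values file passed with Helm's `-f` flag can be easier to maintain than repeated --set flags. The sketch below writes the same two overrides into a file; the helper name `write_morphling_overrides` and the file name morphling-overrides.yaml are hypothetical, and the keys simply mirror the --set example above.

```shell
# Write the overrides from the --set example into a values file.
# The helper name and the default file name are hypothetical choices.
write_morphling_overrides() {
  cat > "${1:-morphling-overrides.yaml}" <<'EOF'
resources:
  requests:
    cpu: 1024m
    memory: 2Gi
EOF
}
```

After running `write_morphling_overrides`, the file can be supplied at install time with `helm install morphling ./helm/morphling --create-namespace -n morphling-system -f morphling-overrides.yaml`.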

Helm will install the CRDs and other Morphling components in the morphling-system namespace.

Uninstall Morphling

helm uninstall morphling -n morphling-system

Delete all Morphling CRDs

kubectl get crd | grep morphling.kubedl.io | cut -d ' ' -f 1 | xargs kubectl delete crd

Morphling UI

Morphling UI is built upon Ant Design.

If you are installing Morphling with YAML files, from the root directory, run:

kubectl apply -k manifests/ui

If you are installing Morphling with the Helm chart, the Morphling UI is deployed automatically.

Check if the Morphling UI is running successfully:

kubectl -n morphling-system get svc morphling-ui

Expected output:

NAME           TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
morphling-ui   NodePort   10.96.63.162   <none>        9091:30680/TCP   44m

If you are using minikube, you can access the UI with port-forward:

kubectl -n morphling-system port-forward --address 0.0.0.0 svc/morphling-ui 30263:9091

Then you can access the UI at http://localhost:30263/.

For a detailed guide to deploying and developing the UI, see UI.md.

Running Examples

This example demonstrates how to tune the configuration for a MobileNet model deployed with TensorFlow Serving under Morphling.

For demonstration, we tune two configurations: the number of CPU cores (a resource allocation) and the maximum serving batch size (a runtime parameter). We use grid search for configuration sampling.
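To illustrate what grid search means here, the sketch below enumerates every combination of two candidate lists, one per tuned configuration. It is a toy illustration of the sampling idea, not Morphling's implementation; the function name `grid_points` and the candidate values in the usage note are hypothetical, and the experiment YAML defines the actual search space.

```shell
# Print every (cpu, BATCH_SIZE) pair from two space-separated candidate lists.
# grid_points is a hypothetical helper used only to illustrate grid search.
grid_points() {
  cpus="$1"
  batches="$2"
  for cpu in $cpus; do
    for batch in $batches; do
      echo "cpu=$cpu BATCH_SIZE=$batch"
    done
  done
}
```

For example, `grid_points "1 2 4" "1 8 32"` prints nine combinations, one per line. The number of trials in a grid search grows as the product of the list sizes.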

Submit the configuration tuning experiment

kubectl -n morphling-system apply -f https://raw.githubusercontent.com/alibaba/morphling/main/examples/experiment/experiment-mobilenet-grid.yaml

Monitor the status of the configuration tuning experiment

kubectl get -n morphling-system pe
kubectl describe -n morphling-system pe

Monitor sampling trials (performance test)

kubectl -n morphling-system get trial

Get the searched optimal configuration

kubectl -n morphling-system get pe

Expected output:

NAME                        STATE       AGE   OBJECT NAME   OPTIMAL OBJECT VALUE   OPTIMAL PARAMETERS
mobilenet-experiment-grid   Succeeded   12m   qps           32                     [map[category:resource name:cpu value:4] map[category:env name:BATCH_SIZE value:32]]
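To wait for the result in a script, the STATE column of this output can be polled. The following is a rough sketch that reads the second column of `kubectl get pe --no-headers`, matching the layout of the expected output above. The function name `wait_for_experiment`, the polling defaults, and the Failed state name are assumptions; only Succeeded appears in this document.

```shell
# Poll an experiment until its STATE column reads Succeeded.
# wait_for_experiment and its defaults are hypothetical; "Failed" is an assumed state name.
wait_for_experiment() {
  name="${1:-mobilenet-experiment-grid}"
  interval="${2:-10}"   # seconds between polls
  attempts="${3:-60}"   # give up after this many polls
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # STATE is the second whitespace-separated column of the table output.
    state=$(kubectl -n morphling-system get pe "$name" --no-headers | awk '{print $2}')
    echo "state: ${state:-unknown}"
    case "$state" in
      Succeeded) return 0 ;;
      Failed) return 1 ;;
    esac
    i=$((i + 1))
    sleep "$interval"
  done
  echo "timed out waiting for $name" >&2
  return 1
}
```

Parsing table columns is brittle if the output layout changes, so treat this as a convenience for the demo rather than a stable interface.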

Delete the tuning experiment

kubectl -n morphling-system delete pe --all

Workflow

See Morphling Workflow for how Morphling tunes ML serving configurations automatically in a Kubernetes-native way.

Developer Guide

Build the controller manager binary

make manager

Run the tests

make test

Generate manifests, e.g., CRD, RBAC YAML files, etc.

make manifests

Build the component docker images, e.g., Morphling controller, DB-Manager

make docker-build

Push the component docker images

make docker-push

To develop or debug the Morphling controller manager locally, see the debug guide.

Community

If you have any questions or want to contribute, GitHub issues and pull requests are welcome.
