MVC (Model Version Controller) ☁️ 🧠 0️⃣

An open-source end-to-end ML ops controller and collaboration interface built on Google Cloud.

version control

When it comes to version controlling datasets and machine learning models, Amazon and Google don't fully support it, tools like DVC are complex, and tools like Neptune are great, but f#%$ paying for ML infrastructure beyond compute. MVC is open-source and simple, and aims to cover everything you need to deploy robust, monitored machine learning models on the latest Google hardware.

Features:

  1. Version control and share datasets on Google Cloud Storage.
  2. Version control and share models on Vertex AI.
  3. Create training runs on Vertex AI.
  4. Call prediction endpoints from Vertex AI.
  5. Monitor model performance during training and prediction runs.
  6. Share notebooks on Colab and Workbench with ease.

Values:

  1. Open-source
  2. Simple
  3. Pythonic
  4. ❤️

To collaborate:

  1. Request features in the discussions.
  2. To extend MVC, email me at [email protected] (please prefer improving it for everyone over forking).

Usage

install package

pip install git+https://github.com/alexlatif/mvc.git

configure environment variables

  1. Set environment variables for credentials to the GCP instance.
  2. Configure SERVICES_CONFIGED to specify which ML services MVC should be aware of.
import os

os.environ["PROJECT_ID"] = "proj_name_on_gcp"
os.environ["REGION"] = "us-east1"
os.environ["DEPLOY_COMPUTE"] = "n1-standard-2"
# pre-configured model container
os.environ["MODEL_PREDICT_CONTAINER_URI"]  = "us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-11:latest"
# services you want the model controller to be aware of
# NOTE: environment variables must be strings, so the list is joined here; mvc parses it back into a list
os.environ["SERVICES_CONFIGED"] = ",".join(["service_1", "service_2"])
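
Because environment variables are always strings, the service list has to round-trip through a delimiter. A minimal sketch of checking the configuration before creating the controller is shown here. `REQUIRED_VARS`, `missing_config`, and `parse_services` are hypothetical helpers, not part of MVC; treating all five variables as required and splitting on commas are assumptions about how MVC reads its configuration.

```python
import os

# Variable names taken from the configuration above.
# Assumption: MVC needs all of them to be set.
REQUIRED_VARS = (
    "PROJECT_ID",
    "REGION",
    "DEPLOY_COMPUTE",
    "MODEL_PREDICT_CONTAINER_URI",
    "SERVICES_CONFIGED",
)


def missing_config(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def parse_services(raw):
    """Split a comma-separated string into service names (assumed format)."""
    return [part.strip() for part in raw.split(",") if part.strip()]
```

Calling `missing_config()` just before initializing the controller, and raising if the returned list is not empty, surfaces a misspelled variable name early.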

initialize model version controller

import mvc as model_version_controller
mvc = model_version_controller.ModelVersionController()

version control datasets

The datasets are stored in Google Cloud Storage and are version controlled by MVC.

# list datasets available for a service
mvc.list_datasets(service_name=service_name)
# create a new dataset passing a pandas dataframe
mvc.create_dataset(df=df, dataset_name=dataset_name, service_name=service_name)
# get a dataset as a pandas dataframe
df = mvc.get_dataset(service_name=service_name, dataset_name=dataset_name)
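
The three dataset calls above can be combined into a small guard that avoids re-creating a dataset that is already listed. This is only a sketch: `create_dataset_if_missing` is a hypothetical helper, and it assumes `list_datasets` returns an iterable of dataset names (or `None` when there are none), which the README does not confirm.

```python
def create_dataset_if_missing(controller, df, service_name, dataset_name):
    """Create the dataset only when its name is not already listed.

    Returns True if a dataset was created, False if it already existed.
    """
    # Assumption: list_datasets returns an iterable of names, or None.
    listed = controller.list_datasets(service_name=service_name)
    existing = set(listed) if listed is not None else set()
    if dataset_name in existing:
        return False
    controller.create_dataset(
        df=df, dataset_name=dataset_name, service_name=service_name
    )
    return True
```

With the controller from the initialization step, this would be called as `create_dataset_if_missing(mvc, df, service_name, dataset_name)`.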

version control models

# save a TensorFlow model to a service
mvc.save_model(service_name=service_name, model_file_name=model_file_name, model_object=model)
# load a TensorFlow model from a service
model_out = mvc.load_model(service_name=service_name, model_file_name=model_file_name)

endpoint predictions

Before calling an endpoint, use the Vertex AI GUI to set the default model version for each service and to choose which model to deploy to an endpoint. Doing this step by hand is safer and easier.

res = mvc.predict_endpoint(service_name=service_name, model_name=model_file_name, x_instance=holdout_x)
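
For a large holdout set, it can help to send the instances in smaller pieces. The sketch below does client-side chunking around `predict_endpoint`; it is not Vertex AI batch prediction. `predict_in_chunks` is a hypothetical helper, and it assumes `x_instance` accepts a sliceable sequence of instances. The raw response of each call is returned untouched, since the response format is not documented here.

```python
def predict_in_chunks(controller, service_name, model_name, instances,
                      chunk_size=32):
    """Call predict_endpoint once per chunk and return the raw responses."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    responses = []
    for start in range(0, len(instances), chunk_size):
        # Slicing works for lists, NumPy arrays, and similar sequences.
        chunk = instances[start:start + chunk_size]
        responses.append(
            controller.predict_endpoint(
                service_name=service_name,
                model_name=model_name,
                x_instance=chunk,
            )
        )
    return responses
```

With the controller from the initialization step, this would be called as `predict_in_chunks(mvc, service_name, model_file_name, holdout_x)`.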

TODOs

  • train online with custom image containers
  • update model version after training
  • predict batches
  • log and save training and prediction runs to GCS
  • implement tensorboards saved to GCS
  • create model summary for monitoring
  • tests ... 😅
