
# Hints for Challenge 1

## Setup part

In the Azure Portal, create a new Machine Learning service workspace resource:


- Workspace name: `azure-ml-bootcamp`
- Resource Group: `azure-ml-bootcamp`
- Location: `East US` (cheaper than West Europe and sufficient for our bootcamp)


Let's have a look at our Resource Group:


- Application Insights - used for monitoring our models in production (will be used later)
- Storage account - this will store our logs, model outputs, training/testing data, etc.
- Key vault - stores our secrets (will be used later)
- Machine Learning service workspace - the center point for Machine Learning on Azure

## Creating a Compute VM

Inside our Machine Learning service workspace, we'll create a new Compute VM:


Hit `+ New`, keep the VM size as `STANDARD_D3_V2`, and give it a unique name:


It'll take a few minutes (~3) until the VM has been provisioned and is ready to use. The primary purpose of this VM is to give everybody the same Jupyter environment. In this exercise, we'll use it to train a simple Machine Learning model. In a real-world setup, we might consider using a GPU-enabled instance in case we need to perform Deep Learning, or just rely on Azure Machine Learning Compute (challenge 2).

Once it is running, the UI will already give us links to Jupyter, JupyterLab and RStudio. To keep things simple, we'll use Jupyter throughout this bootcamp, but if you feel adventurous, use JupyterLab, or try RStudio and solve the challenges in R.


You'll be using your AAD (Azure Active Directory) user to log into Jupyter. From an enterprise security point of view, this is a big plus: no extra credentials needed! 🙌

## Initial Azure Machine Learning Setup

Inside the newly created Compute VM, first create a new folder via the `New` button on the top right of Jupyter. Everything we do in this workshop should happen inside this folder. The reason: Azure Machine Learning persists the whole contents of an experiment's folder, and this would exceed the size limit if you ran your Jupyter Notebooks directly in the root folder.


Note: The next block is not needed anymore (as of May 2019), but you'd need it if you wanted to connect to your Azure Machine Learning Workspace from, e.g., your local machine. Since the Compute VM runs inside the workspace, it automatically connects to the workspace it lives in.

~~Next, create a text file called `config.json` (also via the `New` button) and replace the values with your own (you'll find your Subscription ID in the Azure Portal at the top of your Resource Group):~~

```
# Ignore this block, unless you run Jupyter directly on e.g., your laptop
{
    "subscription_id": "xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx",
    "resource_group": "azure-ml-bootcamp",
    "workspace_name": "azure-ml-bootcamp"
}
```


The `config.json` is used by the Azure Machine Learning SDK to connect to your Azure Machine Learning workspace running in Azure.
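For reference, and only relevant when you run Jupyter outside the Compute VM (see the note above), here's a minimal sketch of how the SDK would pick this file up:

```python
# Minimal sketch - only needed when running outside the Compute VM (e.g., on a laptop)
from azureml.core import Workspace

# from_config() searches the current directory and its parents for config.json
ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.location, sep='\t')
```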

Finally, we can click the `New` button and create a new Notebook of type `Python 3.6 - AzureML`. A new browser tab should open up, and we can click the name `Untitled` and rename it to `challenge01.ipynb`.


## Training a basic Machine Learning model

Inside your `challenge01.ipynb` notebook, create a new cell:

```python
from azureml.core import Workspace, Experiment, Run

ws = Workspace.from_config()
```

You can run or re-run any cell by hitting `Run` or pressing `Shift+Enter` or `Ctrl+Enter`. Code cells have brackets to their left. If the brackets are empty (`[ ]`), the code has not been run. While the code is running, you will see an asterisk (`[*]`). After the code completes, a number (e.g., `[1]`) appears, telling you in which order the cells ran. You can always re-run arbitrary cells in case something didn't work on the first try.

This first cell imports the relevant classes from the Azure Machine Learning SDK, reads our `config.json`, and connects the notebook to our Machine Learning Workspace in Azure. You will need to authenticate to your Azure subscription:


Have a look at the following note if you experience subscription ID errors (this should no longer happen when using an Azure Compute VM):

If you are using multiple subscriptions or tenants, you might need to tell the Jupyter Notebook which one to use. In that case, create a new cell and adapt the following code to use your subscription ID (the one you used in `config.json`):

```
!az account set -s "xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx"
```

Once you have run the cell, restart the Notebook kernel (`Kernel` --> `Restart & Clear Output`) and wait a few seconds until it has restarted.
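Alternatively, if you prefer to stay inside the SDK, here's a minimal sketch that pins the AAD tenant explicitly (the tenant ID below is a placeholder):

```python
# Hypothetical sketch - pin the AAD tenant via the SDK instead of the Azure CLI
from azureml.core import Workspace
from azureml.core.authentication import InteractiveLoginAuthentication

auth = InteractiveLoginAuthentication(tenant_id="xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx")
ws = Workspace.from_config(auth=auth)
```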

Next, let's create a new experiment (it will show up in our Workspace once the first run has been logged). This is where all our experiment runs will be logged to:

```python
experiment = Experiment(workspace=ws, name="scikit-learn-mnist")
```
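As a quick optional sanity check, you can print the experiment object right away (it only becomes visible in the portal after its first run):

```python
# Optional sanity check - confirm the experiment is attached to the right workspace
print(experiment.name, experiment.workspace.name)
```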

Let's load some test data into our Compute VM (we'll do something more scalable in the next challenge):

```python
import os
import urllib.request

os.makedirs('./data', exist_ok=True)

urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz', filename='./data/train-images.gz')
urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz', filename='./data/train-labels.gz')
urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz', filename='./data/test-images.gz')
urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz', filename='./data/test-labels.gz')
```
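If you want to verify the downloads before moving on, a small optional check:

```python
# Optional - list the downloaded files and their sizes in bytes
import os

for f in sorted(os.listdir('./data')):
    print(f, os.path.getsize(os.path.join('./data', f)))
```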

Let's create a fourth cell for training our model:

```python
import numpy as np
import gzip
import struct
from sklearn.linear_model import LogisticRegression
from sklearn.externals import joblib

# load compressed MNIST gz files we just downloaded and return numpy arrays
def load_data(filename, label=False):
    with gzip.open(filename) as gz:
        struct.unpack('I', gz.read(4))
        n_items = struct.unpack('>I', gz.read(4))
        if not label:
            n_rows = struct.unpack('>I', gz.read(4))[0]
            n_cols = struct.unpack('>I', gz.read(4))[0]
            res = np.frombuffer(gz.read(n_items[0] * n_rows * n_cols), dtype=np.uint8)
            res = res.reshape(n_items[0], n_rows * n_cols)
        else:
            res = np.frombuffer(gz.read(n_items[0]), dtype=np.uint8)
            res = res.reshape(n_items[0], 1)
    return res

# We need to scale our data to values between 0 and 1
X_train = load_data('./data/train-images.gz', False) / 255.0
y_train = load_data('./data/train-labels.gz', True).reshape(-1)
X_test = load_data('./data/test-images.gz', False) / 255.0
y_test = load_data('./data/test-labels.gz', True).reshape(-1)

# Tell our Azure ML Workspace that a new run is starting
run = experiment.start_logging()

# Create a Logistic Regression classifier and train it
clf = LogisticRegression(multi_class='auto')
clf.fit(X_train, y_train)

# Predict classes of our testing dataset
y_pred = clf.predict(X_test)

# Calculate accuracy
acc = np.average(y_pred == y_test)
print('Accuracy is', acc)

# Log accuracy to our Azure ML Workspace
run.log('accuracy', acc)

# Tell our Azure ML Workspace that the run has completed
run.complete()
```

On our `STANDARD_D3_V2` instance, the code should take around ~1 minute to run (any warnings you get can be ignored).

In summary, the code does the following things:

1. Imports `sklearn` (scikit-learn) as the Machine Learning framework
2. Creates a helper function for loading our data (`load_data(...)`)
3. Loads our MNIST train and test data, and scales all values to `[0, 1]`
4. Tells our Azure ML Experiment to start logging a training run
5. Creates a `LogisticRegression`-based classifier and trains it using the training data
6. Uses the classifier to predict the numbers in the test dataset
7. Compares the predictions to the ground truth and calculates the accuracy score
8. Logs the accuracy to our run and finishes the run

As we can see, our model achieves ~92% accuracy, which is actually pretty low for the MNIST dataset - we'll get back to this in the next challenge!
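If you're curious where those errors come from, a short optional sketch using plain scikit-learn (nothing Azure-specific):

```python
# Optional - see which digits get confused with which (rows: true, columns: predicted)
from sklearn.metrics import confusion_matrix

print(confusion_matrix(y_test, y_pred))
```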

In the Azure ML Workspace, we can see that our experiment is finally showing up:


Inside our experiment, we can see our first run:


If we click the run number, we can see its details:


We can track more values or even time series, which would directly show up as diagrams. However, as we want to keep the code short, we'll skip this part for now (more on that in challenge 2).
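Just to illustrate the idea, a minimal sketch of what time-series logging could look like (the loss values below are made up):

```python
# Hypothetical sketch - logging the same metric name repeatedly creates a line chart
demo_run = experiment.start_logging()
for loss in [0.9, 0.5, 0.3, 0.2]:
    demo_run.log('loss', loss)  # shows up as a diagram in the run details
demo_run.complete()
```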

Finally, we can export our model and upload it to the `outputs` directory of our run in the Azure ML Workspace:

```python
from sklearn.externals import joblib

# Write model to disk
joblib.dump(value=clf, filename='scikit-learn-mnist.pkl')

# Upload our model to our experiment
run.upload_file(name='outputs/scikit-learn-mnist.pkl', path_or_stream='./scikit-learn-mnist.pkl')
```

In the portal, we can now see the output of our run:


We can also query our tracked metrics and outputs for our current run:

print("Run metrics:", run.get_metrics())
print("Run model files", run.get_file_names())

As a last step, we can register (version, tag, and store) our model in our workspace:

```python
model = run.register_model(model_name='scikit-learn-mnist-model', model_path='outputs/scikit-learn-mnist.pkl')
print(model.name, model.id, model.version, sep='\t')
```

We probably would not do this for every model we train, but for those that we want to promote to the next stage and potentially consider for deployment.
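To sketch why this matters: once registered, the model can be pulled back by name from any environment that can reach the workspace, e.g., a later deployment notebook:

```python
# Sketch - retrieve the registered model later by name
from azureml.core.model import Model

registered = Model(ws, name='scikit-learn-mnist-model')  # latest version by default
print(registered.name, registered.version)
# registered.download(target_dir='.')  # optionally fetch the .pkl locally
```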

Under the Models tab, we can now see that our model has been registered:


Our model has been stored in the Storage Account that was initially created for us:


At this point:

- We've trained a Machine Learning model using scikit-learn inside a Compute VM running Jupyter
- We achieved ~92% accuracy (not very good for this dataset)
- Azure ML knows about our experiment, our initial run, and the tracked metrics
- Azure ML saved our model file (`scikit-learn-mnist.pkl`) in Blob storage
- We have registered our initial model as an Azure ML Model in our Workspace

## (Bonus) Compute VM Details

If we have another look into our resource group `azure-ml-bootcamp`, we can see that the Compute VM actually sits inside this group. It is just a regular Azure Virtual Machine:


Furthermore, we can go into our Workspace and also see it listed under Compute:


In the next challenge, we'll build a more powerful model and use Azure Machine Learning Compute to train it on a remote cluster.