Lecture 26 - Deploying Projects to the Cloud¶
26.1 Data Science using Cloud Computing¶
Cloud Computing, also referred to as the Cloud, delivers services hosted over a network, which can include data analytics, storage, databases, networking, and other services. The service is often in the form of a public cloud offered to the public over the Internet by cloud service providers, or it can be a private cloud that is owned by an organization that maintains the services on a private network.
Public Cloud Computing services include Amazon Web Services, Google Cloud Platform, Microsoft Azure, IBM Cloud, and others.
In general, Cloud Computing services can be categorized as:
Infrastructure as a Service (IaaS): access to infrastructure, consisting of servers, virtual machines (VMs), storage, networks, or databases.
Platform as a Service (PaaS): access to a platform for developing, testing, delivering, and managing software applications, using infrastructure managed by the provider.
Software as a Service (SaaS): access to software applications that are developed and managed by the provider using the provider’s infrastructure.
The advantages of using Cloud Computing include convenient access to the latest computational resources (without the need to purchase hardware or software), access to structured environments (with preinstalled libraries) for running tasks, paying only for the resources that are used, the ability to quickly scale projects, and improved efficiency by relying on infrastructure hosted and managed by the cloud provider.
Cloud Computing is especially important for managing Data Science projects, which often require access to GPUs and large compute resources, storing large amounts of data, access to databases, deploying solutions for access by end-users, and similar. In addition, most Cloud providers have developed some form of AutoML tools that enable organizations without Data Science expertise to implement data analytics workflows into their projects.
This lecture is primarily based on the course Data Science for Beginners by Microsoft. The course has several lectures on deploying Data Science projects to the Cloud, as well as other lectures on Data Science in general.
26.2 Introduction to Azure Machine Learning¶
Microsoft’s Azure Machine Learning is a cloud platform that provides a large number of products and services designed for handling various phases of Data Science projects. This includes capabilities for preparing and preprocessing data, training models, deploying models, and monitoring models in production. These capabilities can help increase the efficiency of data scientists by automating many tasks and project pipelines. In addition, the availability of cloud computing resources makes it easier to scale projects and to efficiently handle challenges related to processing big data and serving a large number of customers.
Important tools and services provided by Azure ML include:
Azure Machine Learning Studio: framework for data engineering, model training, and deployment.
Azure Machine Learning Designer: low-code ML framework that allows users to drag and drop modules for building data science pipelines.
Azure Machine Learning SDK: code-based environment for data science projects.
Data Labeling: tools for automatic data labeling.
Machine Learning CLI (Command-Line Interface): allows managing Azure ML resources from the command line.
Automated Machine Learning (AutoML) User Interface: tools to automate tasks in data science projects.
MLflow: framework for tracking the performance of deployed models, and logging metrics and relevant indicators.
Azure ML allows using Jupyter Notebooks and has built-in integration with popular ML libraries like Scikit-Learn, TensorFlow, PyTorch, and others.
In this lecture, we will explore the different levels of functionality of Azure ML, ranging from the no-code AutoML, to full-code SDK, and working with our own custom models.
26.2.1 Azure Free Trial¶
Microsoft Azure offers a 30-day free trial, which can also come with a $200 Azure credit that can be used within the 30 days.
Also, Azure offers a $100 yearly Azure credit to students.
In addition, the other Cloud providers typically offer some amount of credit to new users and students.
Follow the link to the Microsoft Azure webpage and select the Start free button. This will prompt you to create an Azure account, and if you wish, you can use your University of Idaho account to get access to Azure.
Once you create an account and get the subscription with $200 Azure credits, the home page should look similar to the following.
26.3 No-Code Azure ML¶
26.3.1 Creating a Workspace Resource¶
From the home page, we will need to first create a new Resource that will indicate what type of tools and services we will be using.
Select `+ Create a resource`.
Azure will next display many popular services and resources.
In the search box, type Azure Machine Learning and select it.
This will load the web page of Azure Machine Learning.
Select Create.
This will open a new page for the Azure ML Workspace resource. The Workspace provides a place to work with machine learning models, and allows access to tools for training and deploying models. For instance, the Workspace stores information about training runs, such as logs of various metrics, and provides access to the data and scripts. Note also that when we are done using Azure resources such as workspaces, we need to delete them; otherwise, costs can still be incurred (e.g., even if we don’t use the workspace to run a model, Azure may charge a fee for storing the data).
When we create a new Workspace, we need to fill in the information shown in the screenshot below.
Subscription: Azure subscription, e.g., the $200 Azure credits obtained with the free trial.
Resource group: Assign a name for the resource group, or click on Create new to create a resource group.
Workspace name: Assign a name for the workspace (e.g., perhaps a name that is related to the project).
Region: Select the geographical region.
Storage account: A new storage account will be created for the workspace for storing the data.
Key vault: A new key vault will be created for the workspace for storing sensitive information.
Application insights: A new application insights resource will be created for the workspace to store information about deployed models.
Container registry: Leave it as None (it will be created automatically the first time the model is deployed).
After the information in all fields is entered, select Review & create.
The next page will show the information that we entered and we will need to confirm that everything is correct.
Select Create.
It may take a few minutes for the Workspace to be created. Once it is ready, the page will show that it is completed.
26.3.2 Using Azure ML Studio¶
As we mentioned in the introductory section, Azure Machine Learning Studio is a framework for data engineering, model training, and deployment.
Click on the following link to navigate to Azure ML Studio.
The interface of Azure ML Studio is shown below. At the top of the page, our workspace should be listed. In this case, the workspace that we just created and named My_workspace_1 is shown.
The various modules that are available in Azure ML Studio are listed in the left-side menu. To see the names of the modules, click on the three horizontal lines in the upper left corner. A brief description of the modules is shown in the next figure. The modules make it convenient to apply tools for managing different phases of data science projects from a single place.
26.3.3 Loading the Dataset¶
For demonstration purposes, we will use the Heart Failure Dataset, which we used earlier in this course. It contains 13 columns with information about 299 patients who may or may not be at risk of heart failure. The .csv file with the records is available in the data folder with the other files for this lecture.
Click on the Data module in the left-side menu in Azure ML Studio (see figure below).
The Data section provides various tools for data management. It allows us to upload files or folders with data from a local machine, provide links to web files (e.g., data from GitHub or Google Drive), load data from a list of open datasets collected by Microsoft (check Azure Open Datasets), or use files from a datastore. Datastores allow organizations that have many data files in different locations in Azure to link them together and organize them in a single view.
Select `+ Create` to load the dataset.
Azure ML Studio will guide us through several steps for creating the dataset.
On the first page, we will need to enter a name for the dataset, a brief description, and also indicate whether the data is in tabular or another format. After the information is entered, select Next.
Afterward, we will indicate that the dataset is saved on our computer and we will upload the dataset from local files.
Select Next to skip Step 3 and get to Step 4, where from the dropdown menu we select Upload - Upload files and navigate to the directory where we saved the .csv file containing the heart failure records.
The next page will show the columns in the dataset. Select Next to go to the next page.
On the Schema page, we will change the data type to Boolean for the columns anaemia, diabetes, high_blood_pressure, sex, smoking, and DEATH_EVENT.
Afterward, click Next and select Create to complete the creation of the dataset.
Now we can see that the dataset heart-failure-dataset is listed under the Data assets in our workspace.
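The Boolean conversion made on the Schema page can also be reproduced locally, for example to inspect the data before uploading it. The following is a minimal sketch, assuming the .csv file stores the binary columns as 0/1 integers and uses the column names that appear in the scoring script in Section 26.3.7. The function name `read_records` and the header line are illustrative assumptions; the sample row is the last record of the dataset as listed in Section 26.3.7.

```python
import csv
import io

# Columns that the lecture converts to Boolean on the Schema page
BOOLEAN_COLUMNS = ["anaemia", "diabetes", "high_blood_pressure",
                   "sex", "smoking", "DEATH_EVENT"]


def read_records(csv_text, boolean_columns=BOOLEAN_COLUMNS):
    """Parse CSV text and convert 0/1 flags in the given columns to bool."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column in boolean_columns:
            if column in row:
                row[column] = row[column].strip() == "1"
        records.append(row)
    return records


# Last record of the dataset (the header line is an assumption)
sample = (
    "age,anaemia,creatinine_phosphokinase,diabetes,ejection_fraction,"
    "high_blood_pressure,platelets,serum_creatinine,serum_sodium,sex,"
    "smoking,time,DEATH_EVENT\n"
    "50,0,196,0,45,0,395000,1.6,136,1,1,285,0\n"
)
records = read_records(sample)
print(records[0]["smoking"], records[0]["DEATH_EVENT"])
```

The remaining columns are left as strings here; in Azure ML Studio their types are inferred automatically on the Schema page.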
26.3.4 Creating a Compute Resource¶
We also need to use Compute Resources for our project to perform data preparation and processing, and to run the models. To create a compute resource, we will select Compute from the left-side menu.
We can see that the compute resources are categorized into four tabs:
Compute Instances: workstations for data and models; they involve creating a Virtual Machine (VM) and launching a notebook instance (e.g., compute resources to train a model are requested from the notebook).
Compute Clusters: VMs for on-demand code processing (e.g., training a model using AutoML).
Kubernetes Clusters: VMs that are orchestrated by Kubernetes.
Attached Compute: links to existing Azure compute resources, such as Virtual Machines or Azure Databricks clusters.
For this task, we can use either a compute instance or a compute cluster, so let’s select the Compute instances tab.
Click on the `+ New` button to create a new compute resource.
Let’s assign the name heart-failure-compute for the resource (see the figure below).
Selecting adequate compute resources for a project depends on several factors, which impose trade-offs between speed and cost.
CPU versus GPU: CPUs are less expensive, but also less powerful especially for training deep learning models. GPUs are more expensive, but they provide efficient parallel computing, and are often necessary for training deep learning models.
Cluster size: larger clusters are more expensive, but faster in completing tasks. For smaller tasks that don’t take too long, it may be better to select a small compute cluster.
VM size: similar to the cluster size, increasing the amount of RAM, number of cores, and processing speed of the VMs will reduce the computational time, but it will be more expensive.
Dedicated versus low-priority resources: dedicated resources are non-interruptible, while low-priority instances can be assigned by Azure to other tasks and interrupt the job.
For this project, we will select a VM with a CPU, and from the listed VMs we can select the one with optimized memory, which costs $0.32 per hour.
There are several optional steps that we can skip and click on Review + Create to create the compute instance.
It can take a few minutes for the compute resource to be created.
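To make the speed versus cost trade-off concrete, the arithmetic can be sketched in a few lines. This is a rough estimate only, assuming the $0.32 per hour VM price and the $200 free-trial credit quoted in this lecture; the function names are hypothetical, and the sketch ignores storage and any other charges that Azure may add.

```python
def estimate_cost(hourly_rate, hours, nodes=1):
    """Return the approximate compute cost in dollars."""
    return hourly_rate * hours * nodes


def hours_covered(credit, hourly_rate, nodes=1):
    """Return how many hours of compute a given credit can pay for."""
    return credit / (hourly_rate * nodes)


rate = 0.32      # dollars per hour, the VM price quoted above
credit = 200.0   # free-trial credit in dollars

print(f"8 hours on one VM: ${estimate_cost(rate, 8):.2f}")
print(f"Hours covered by the credit: {hours_covered(credit, rate):.0f}")
```

Under these assumptions, leaving a single instance running continuously would use up the trial credit in roughly 26 days, which is one reason to stop or delete compute resources when they are not in use.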
26.3.5 Training a Model with AutoML¶
AutoML in Azure ML Studio allows us to build and deploy ML models without writing code.
Select Automated ML from the modules in the left-side panel.
Select `+ New Automated ML job`.
The first step requires assigning a name for the experiment. Let’s simply name it Heart_failure_experiment.
Afterward, we need to indicate the task, that is, whether the goal is to perform classification, regression, time-series forecasting, etc. In this case, we select classification.
For the dataset, we will select the heart-failure-dataset that we uploaded.
Next, select a target column in the data: in this case it is DEATH_EVENT.
We can also specify the validation type, e.g., whether we would like to use k-fold cross-validation, or whether to split the training data into train and validation sets, etc. Also, the test data asset field allows us to upload a test dataset or specify how to evaluate the model. We can leave these fields at their defaults.
Finally, use the drop-down menu to assign a compute resource, e.g., select the heart-failure-compute that we created.
Review the AutoML job and submit it.
Now that the setup is complete, the experiment will begin running. This means that Azure ML will train many different models and explore different hyperparameters for the models.
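For readers unfamiliar with the k-fold cross-validation option mentioned in the job settings, the splitting idea can be sketched as follows. This is a generic illustration in plain Python, not the procedure Azure ML uses internally; the function name is hypothetical, and no shuffling or stratification is applied.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Distribute the remainder so that fold sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for train, validation in folds:
    print("train:", train, "validation:", validation)
```

Each sample is used for validation exactly once, and a model is trained k times, once per fold, with the scores averaged at the end.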
On the home page, under the Jobs section, we will see a summary of the entered information about the experiment, and we will also see that the status of the experiment is Running. It took about 1 hour to complete this experiment.
When the experiment is completed, in the Best model summary we can see that the highest performance was obtained by a Voting Ensemble model, which achieved an AUC of 91.526%.
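The lecture does not describe how Azure ML builds its Voting Ensemble, but the general idea of soft voting can be sketched briefly: the predicted probabilities of several base models are averaged, optionally with weights, and the averaged probability is thresholded. The probabilities, the weights, and the function name below are hypothetical values chosen only for illustration.

```python
def soft_vote(probabilities, weights=None):
    """Combine per-model positive-class probabilities by weighted average."""
    if weights is None:
        weights = [1.0] * len(probabilities)
    total = sum(weights)
    return sum(p * w for p, w in zip(probabilities, weights)) / total


# Hypothetical probabilities of DEATH_EVENT=True from three base models
model_probabilities = [0.80, 0.40, 0.65]
combined = soft_vote(model_probabilities, weights=[2.0, 1.0, 1.0])
prediction = combined >= 0.5
print(round(combined, 4), prediction)
```

An ensemble of this kind can outperform each individual model when the base models make different kinds of errors.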
Also, let’s select the Models tab to get more information about the training. We can see that over 50 models were trained in total, including LightGBM, XGBoost, Random Forest, and Gradient Boosting, and that running most of the models took under 1 minute. We can also see that different scaling methods were used with different algorithms (MinMaxScaler, RobustScaler, StandardScaler).
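As a reminder of what two of these scaling methods do, the following sketch reimplements min-max scaling and standard scaling in plain Python. It is an illustration of the formulas rather than the scikit-learn implementation, and the input values are hypothetical.

```python
import statistics


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def standard_scale(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Hypothetical values of a single numeric feature
feature = [20, 30, 40, 60]
print(min_max_scale(feature))
print([round(v, 3) for v in standard_scale(feature)])
```

Min-max scaling is sensitive to extreme values because the range is set by the minimum and maximum, which is one reason AutoML may try several scalers with each algorithm.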
26.3.6 Deploying the Model¶
To deploy a trained model as a web service, we will select the Voting Ensemble as the best model, and from the drop-down menu under the Deploy tab select Web service.
In the newly opened form, we need to assign a name for the deployed model, a brief description, and the compute type for the deployed model. In this case, we selected Azure Container Instance, which is suitable for low-scale CPU-based workloads, such as the model for this project. Deploying models that require large computational resources can require another compute type with GPUs, more RAM, or a larger number of cores.
Next, let’s click on Deploy to initiate this step. It took about 15 minutes for this project to be deployed. When it is completed, the Deploy Status on the dashboard will change from Running to Succeeded.
26.3.7 Consuming the Model¶
After the model is deployed, we can find the summarized information in the Endpoints module in the left-hand menu.
Select the Consume tab to access the script for consuming the model.
The Consume page will provide the REST endpoint for users’ consumption, the primary and secondary API keys for authentication, and a script for consuming the model from a local machine. The script is available in Python, C#, and R.
The Python script is shown below. The data section represents a dictionary where the users enter information for the input features. In this script, all values are set to either 0 or False. Then, the url in the code below is the address for the REST endpoint from the above figure, and api_key is the primary authentication key that is listed in the above figure as well. The last code section requests a prediction for DEATH_EVENT, and the result is displayed.
# Note: the codes in this lecture are not required for quizzes or assignments
import urllib.request
import urllib.error
import json
import os
import ssl

def allowSelfSignedHttps(allowed):
    # bypass the server certificate verification on client side
    if allowed and not os.environ.get('PYTHONHTTPSVERIFY', '') and getattr(ssl, '_create_unverified_context', None):
        ssl._create_default_https_context = ssl._create_unverified_context

allowSelfSignedHttps(True) # this line is needed if you use self-signed certificate in your scoring service.

# Request data goes here
# The example below assumes JSON formatting which may be updated
# depending on the format your endpoint expects.
# More information can be found here:
# https://docs.microsoft.com/azure/machine-learning/how-to-deploy-advanced-entry-script
data = {
    "Inputs": {
        "data": [
            {
                "age": 0.0,
                "anaemia": False,
                "creatinine_phosphokinase": 0,
                "diabetes": False,
                "ejection_fraction": 0,
                "high_blood_pressure": False,
                "platelets": 0.0,
                "serum_creatinine": 0.0,
                "serum_sodium": 0,
                "sex": False,
                "smoking": False,
                "time": 0
            }
        ]
    },
    "GlobalParameters": {
        "method": "predict"
    }
}

body = str.encode(json.dumps(data))

url = 'http://a836b469-9573-4a63-bd31-90e8205ae13c.westus2.azurecontainer.io/score'
api_key = 'QYEGDiZFpL5OECR2aZEhhNfSfYhPvgVn' # Replace this with the API key for the web service

# The azureml-model-deployment header will force the request to go to a specific deployment.
# Remove this header to have the request observe the endpoint traffic rules
headers = {'Content-Type':'application/json', 'Authorization':('Bearer '+ api_key)}

req = urllib.request.Request(url, body, headers)

try:
    response = urllib.request.urlopen(req)
    result = response.read()
    print(result)
except urllib.error.HTTPError as error:
    print("The request failed with status code: " + str(error.code))
    # Print the headers - they include the request ID and the timestamp, which are useful for debugging the failure
    print(error.info())
    print(error.read().decode("utf8", 'ignore'))
To consume the model, we just need to save the script to our local machine and execute it. The output is shown below. For this set of input parameters, the result for DEATH_EVENT is True.
Let’s check the model prediction for the last record in the dataset. The input features are shown below.
"age": 50.0,
"anaemia": False,
"creatinine_phosphokinase": 196,
"diabetes": False,
"ejection_fraction": 45,
"high_blood_pressure": False,
"platelets": 395000.0,
"serum_creatinine": 1.6,
"serum_sodium": 136,
"sex": True,
"smoking": True,
"time": 285
The prediction by the model is False, as expected.
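When scoring several records, it can help to build the request body with a small helper that checks the feature names before anything is sent to the endpoint. The sketch below assumes the payload layout used by the generated script in this section (an "Inputs" dictionary with a "data" list, plus "GlobalParameters"); the helper name `build_request_body` is hypothetical, and no network request is made.

```python
import json

# Feature names expected by the scoring script in this section
FEATURE_NAMES = [
    "age", "anaemia", "creatinine_phosphokinase", "diabetes",
    "ejection_fraction", "high_blood_pressure", "platelets",
    "serum_creatinine", "serum_sodium", "sex", "smoking", "time",
]


def build_request_body(record):
    """Validate a feature dictionary and encode it as a JSON request body."""
    missing = [name for name in FEATURE_NAMES if name not in record]
    unexpected = [name for name in record if name not in FEATURE_NAMES]
    if missing or unexpected:
        raise ValueError(f"missing: {missing}, unexpected: {unexpected}")
    payload = {
        "Inputs": {"data": [record]},
        "GlobalParameters": {"method": "predict"},
    }
    return json.dumps(payload).encode("utf-8")


# The last record in the dataset, as listed above
last_record = {
    "age": 50.0, "anaemia": False, "creatinine_phosphokinase": 196,
    "diabetes": False, "ejection_fraction": 45,
    "high_blood_pressure": False, "platelets": 395000.0,
    "serum_creatinine": 1.6, "serum_sodium": 136, "sex": True,
    "smoking": True, "time": 285,
}
body = build_request_body(last_record)
print(json.loads(body)["Inputs"]["data"][0]["time"])
```

The returned bytes can be passed as the body argument of urllib.request.Request in place of the hand-written dictionary.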
26.4 Code-based Azure ML¶
In this section, we will use Azure ML Studio to manage Data Science projects in a Python environment, which can include Jupyter Lab, Jupyter Notebooks, or VS Code. Unlike the previous section, which focused on the No-Code environment in Azure ML Studio, this section focuses on the Code-based environment.
We will learn how to use Azure ML to train our own custom model. For this purpose, we will define a deep learning model for classification of the MNIST dataset, and we will train and evaluate the model using Azure ML resources. Afterward, we will deploy the model, and test the deployment.
If we didn’t have access to GPUs from other sources, we could use the GPUs provided by Azure ML to train our models.
26.4.1 Creating Workspace and Compute Resource¶
Let’s log in to the Microsoft Azure webpage and create a new workspace named Workspace_2, by following the steps listed in Section 26.3.
Afterward, we will navigate to the Azure ML Studio webpage, and in the newly created Workspace_2, we will create a new compute resource from the Compute module in the left-side menu. Similar to the previous section, we will select the Compute instances tab and click on + New. For this task, we can use a CPU VM since MNIST is a relatively small dataset. If we were to work with larger datasets and models, we would need to select a GPU VM.
26.4.2 Using Jupyter Notebooks in Azure ML¶
The created compute instance will be listed on our homepage. Note that in the Applications tab, the listed applications include Jupyter Lab, Jupyter, and VS Code. We can use these applications to work with Jupyter Notebook files in the same way as we do outside of Azure ML Studio.
Let’s select Jupyter Lab from the Applications tab for the newly created compute instance.
26.4.3 Loading the Data and Defining the Model¶
In the opened Jupyter Lab environment, let’s create a new notebook for training the model using the Python 3.8 - Pytorch and Tensorflow kernel.
Let’s rename the notebook to mnist-demo.
The code in the next cells is familiar, and it simply imports libraries and loads the MNIST dataset.
# Import libraries
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense, Dropout, Flatten, Conv2D, MaxPooling2D
import numpy as np
import matplotlib.pyplot as plt
# Load the data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
print('Data shape:', X_train.shape)
We will use the TensorFlow-Keras library to define a simple Convolutional Neural Network for MNIST classification.
# Define the layers in the model
inputs = Input(shape=(28, 28, 1))
conv1a = Conv2D(filters=32, kernel_size=3, padding='same')(inputs)
conv1b = Conv2D(filters=64, kernel_size=3, padding='same')(conv1a)
pool1 = MaxPooling2D()(conv1b)
flat = Flatten()(pool1)
dense1 = Dense(1024, activation='relu')(flat)
dropout1 = Dropout(0.5)(dense1)
outputs = Dense(10, activation='softmax')(dropout1)
# Define the model with inputs and outputs
model = Model(inputs, outputs)
# Compile the model
model.compile(optimizer="adam", loss='sparse_categorical_crossentropy', metrics=["accuracy"])
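To get a sense of the size of this network, the number of trainable parameters can be worked out by hand. The sketch below assumes the layer settings shown above together with the Keras default 2x2 pool size for MaxPooling2D; the helper names are hypothetical, and the total can be compared against the output of model.summary().

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights plus one bias per filter for a square 2D convolution."""
    return kernel_size * kernel_size * in_channels * filters + filters


def dense_params(in_features, units):
    """Weights plus one bias per unit for a fully connected layer."""
    return in_features * units + units


height = width = 28
params_conv1a = conv2d_params(3, 1, 32)
params_conv1b = conv2d_params(3, 32, 64)
# padding='same' keeps the 28x28 size; 2x2 max pooling halves each side
flat_features = (height // 2) * (width // 2) * 64
params_dense1 = dense_params(flat_features, 1024)
params_outputs = dense_params(1024, 10)

total = params_conv1a + params_conv1b + params_dense1 + params_outputs
print(flat_features, total)
```

Almost all of the parameters belong to the first Dense layer, because the flattened feature map has 12,544 values feeding 1,024 units.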
26.4.4 Preparing Azure ML Experiment and Training the Model¶
Next, we will create an Azure ML experiment, and we will associate it with the current workspace and the subscription information. Hence, we will set the workspace name to the current Workspace_2 and provide the information about our Subscription and Resource group.
Afterward, we will instantiate a new experiment named demo-mnist-training that will utilize the created Azure ML workspace and resources to train and deploy the model.
from azureml.core import Workspace, Experiment
SUBSCRIPTION="...enter your 32-digit subscription number here..."
GROUP="My_resource_group_1"
WORKSPACE="Workspace_2"
# Create an instance of the Workspace class using the subscription information
ws = Workspace(
    subscription_id=SUBSCRIPTION,
    resource_group=GROUP,
    workspace_name=WORKSPACE,
)
# Create an Azure ML experiment within the workspace "ws"
experiment = Experiment(ws, "demo-mnist-training")
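Instead of pasting the subscription number into the notebook, it can be read from an environment variable and checked for the expected format before the Workspace is created. This is a minimal sketch under two assumptions: the variable name AZURE_SUBSCRIPTION_ID and the function name are hypothetical choices, and the all-zero value is a placeholder used only so that the example runs.

```python
import os
import uuid


def read_subscription_id(variable="AZURE_SUBSCRIPTION_ID"):
    """Read the subscription ID from the environment and validate its format."""
    value = os.environ.get(variable, "")
    try:
        # Subscription IDs have the form of a GUID (32 hexadecimal digits)
        return str(uuid.UUID(value))
    except ValueError:
        raise ValueError(
            f"{variable} is not set to a valid subscription ID") from None


# Placeholder value for demonstration; a real ID would be set outside the notebook
os.environ.setdefault("AZURE_SUBSCRIPTION_ID",
                      "00000000-0000-0000-0000-000000000000")
print(read_subscription_id())
```

The returned string could then be passed as the subscription_id argument of Workspace in place of the hard-coded value.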
Azure ML allows integration of the MLflow framework for managing and tracking ML experiments.
In the next cell, we import MLflow and use its autolog() method to automatically log the loss, accuracy, and other parameters of the training progress for the TensorFlow model.
import mlflow, mlflow.tensorflow
# Track the experiment and log the training progress
mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())
mlflow.start_run(experiment_id=experiment.id)
mlflow.tensorflow.autolog()
Next, we train the model for 5 epochs, and we can see that it achieved close to 99% train accuracy.
Afterward, we terminate the MLflow run, and save the model in the current directory.
# Train the model
model.fit(X_train, y_train, epochs=5)
# End the mlflow run
mlflow.end_run()
# Save the model
model.save('mnist-tf-model.h5')
26.4.5 Consuming the Model¶
To consume the model, we will first register the model with Azure ML so that it can be used for inference in the future. This will save the model under the name mnist-tf-model and register it to our workspace. The registered model can be accessed from the Models section under Assets in the left-side panel in Azure ML Studio.
# Register the model
from azureml.core.model import Model
registered_model = Model.register(
workspace=ws,
model_name='mnist-tf-model',
model_path='mnist-tf-model.h5',
model_framework=Model.Framework.TENSORFLOW,
model_framework_version=tf.__version__)
Afterward, we will load the registered model and use it to predict the classes for several images.
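# Load the registered model
aml_model = Model(workspace=ws, name='mnist-tf-model', version=registered_model.version)
downloaded_model_filename = aml_model.download(exist_ok=True)
print(downloaded_model_filename)
downloaded_model = tf.keras.models.load_model(downloaded_model_filename)
# Evaluate the model
downloaded_model.evaluate(X_test, y_test, verbose=0)
# Helper for displaying images with their predicted labels
# (show_images is not defined elsewhere in this lecture; this is a minimal assumed version)
def show_images(images, labels):
    fig, axes = plt.subplots(1, len(images), figsize=(15, 2))
    for ax, image, label in zip(axes, images, labels):
        ax.imshow(image, cmap='gray')
        ax.set_title(label)
        ax.axis('off')
    plt.show()
# Predict the labels for several images
preds = downloaded_model.predict(X_test).argmax(axis=1)
show_images(X_test[:10], preds[:10])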
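The prediction step above relies on the fact that the softmax output layer returns ten probabilities per image, and argmax picks the index of the largest one as the predicted digit. A plain-Python sketch of that step, using hypothetical probability vectors and labels rather than real model outputs, is shown below.

```python
def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(predicted, actual):
    """Return the fraction of predictions that match the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical softmax outputs for three images (ten classes each)
probabilities = [
    [0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.90, 0.02, 0.01],
    [0.02, 0.03, 0.85, 0.02, 0.02, 0.01, 0.02, 0.01, 0.01, 0.01],
    [0.05, 0.60, 0.05, 0.05, 0.05, 0.05, 0.05, 0.04, 0.03, 0.03],
]
true_labels = [7, 2, 7]  # hypothetical labels; the last one disagrees

predicted_labels = [argmax(p) for p in probabilities]
print(predicted_labels, accuracy(predicted_labels, true_labels))
```

The NumPy call argmax(axis=1) in the cell above performs the same selection for all test images at once.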
As we mentioned earlier, always remember to release the used compute resources after training or predicting with a model. One option is to Stop the current compute resource if we would like to reuse it later, or Delete the resources if they are not needed for future use.
References¶
Microsoft course - Data Science for Beginners, available at https://github.com/microsoft/Data-Science-For-Beginners.
From No-Code to Code in Azure Machine Learning, by William VanBuskirk, available at https://levelup.gitconnected.com/from-no-code-to-code-in-azure-machine-learning-38ee6b556de2.
Creating a TensorFlow Model with Python and Azure ML Studio, by Jarek Szczegielniak, available at https://www.codeproject.com/Articles/5321728/Python-Machine-Learning-on-Azure-Part-3-Creating-a.