Machine Learning with Azure

Barbara Fusinska
@BasiaFusinska

About me
Programmer
Machine Learning
Data Scientist
@BasiaFusinska
https://github.com/BasiaFusinska/AzureMLWorkshop

Agenda
• What’s Machine Learning?
• Azure ML Experiments
• Classification
• Regression
• Publishing the Web Service
• Azure Data Sources
• Resampling methods
• Machine Learning Tuning
• Exploratory Data Analysis
• Clustering
• Cortana Intelligence Gallery
• Jupyter Notebooks
• Retraining the model

What’s the reason you’re here?
What are you hoping to find out?
When and how are you going to use this knowledge?

My goals - Teaching
• What’s Machine Learning?
• How to use Azure ML Studio?
• Show how to start and where to
go next

Setup
• Clone or download https://github.com/BasiaFusinska/AzureMLWorkshop
• Sign up for Azure Machine Learning Studio: https://studio.azureml.net
• Sign in to Azure Machine Learning Studio
• Other tools: Visual Studio, RStudio, Python

Movies Genres
Title            # Kisses  # Kicks  Genre
Taken            3         47       Action
Love story       24        2        Romance
P.S. I love you  17        3        Romance
Rush hours       5         51       Action
Bad boys         7         42       Action
Question: What is the genre of Gone with the wind?

Data-based classification
Id  Feature 1  Feature 2  Class
1.  3          47         A
2.  24         2          B
3.  17         3          B
4.  5          51         A
5.  7          42         A
Question: What is the class of the entry with the following features: F1: 31, F2: 4?

Data Visualization
[Scatter plot of the data with a line separating the two classes; regions labelled A and B]
Rule 1: If on the left side of the line then Class = A
Rule 2: If on the right side of the line then Class = B
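The two rules can be sketched in a few lines of Python. The slide does not give the equation of the separating line, so this sketch assumes the diagonal Feature 2 = Feature 1 purely for illustration; the rows are the five entries from the table above and the query is the entry from the question.

```python
# Rows from the table: (Feature 1, Feature 2, Class)
ROWS = [(3, 47, "A"), (24, 2, "B"), (17, 3, "B"), (5, 51, "A"), (7, 42, "A")]


def classify(f1, f2):
    """Apply the two rules with an ASSUMED line: Feature 2 = Feature 1."""
    # Points where Feature 2 exceeds Feature 1 lie on the A side of the line.
    return "A" if f2 > f1 else "B"


# The assumed line reproduces every label in the table.
predictions = [classify(f1, f2) for f1, f2, _ in ROWS]

# The entry from the question: F1 = 31, F2 = 4.
query_class = classify(31, 4)
print(predictions, query_class)
```

Any line that leaves the three A rows on one side and the two B rows on the other would do equally well here; choosing a good line from the data is what the training phase is for.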

Supervised learning
• Classification, regression
• Label, target value
• Training & Validation phases

Unsupervised learning
• Clustering, feature selection
• Finding structure of data
• Statistical values describing the data

Supervised Machine Learning workflow
Preprocess data -> Clean data -> Data split -> Training data, Test data
Training data -> Machine Learning algorithm -> Trained model
Trained model + Test data -> Score

Publishing the model
Training data -> Model Training -> Machine Learning Model
Machine Learning Model -> Publish model -> Published Machine Learning Model
Test stream -> Published Machine Learning Model -> Prediction -> Scores

Data -> Predictive model -> Operational web API in minutes
ML Studio -> API

Classification problem
Model training
Data & Labels

Classification data
Source      #Links  #Characters  ...  Fake
TopNews     10      2750         …    T
Twitter     2       120          …    F
TopNews     235     502          …    F
Channel X   1530    3024         …    T
Twitter     24      70           …    F
StoryLeaks  722     1408         …    T
Facebook    98      230          …    T
…           …       …            …    ...
Features: Source, #Links, #Characters, ...
Labels: Fake

Iris Dataset
• Features:
• Sepal length
• Sepal width
• Petal length
• Petal width
• Species:
• Setosa
• Versicolor
• Virginica
http://archive.ics.uci.edu/ml/datasets/Iris

Data
classification:
Two-class Iris
Demo

Evaluation methods for classification
Confusion Matrix
                     Reference Positive  Reference Negative
Prediction Positive  TP                  FP
Prediction Negative  FN                  TN
Receiver Operating Characteristic (ROC) curve
Area under the curve (AUC)
$$\mathrm{Accuracy} = \frac{\#\mathrm{correct}}{\#\mathrm{predictions}} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
$$\mathrm{Recall} = \mathrm{Sensitivity} = \frac{TP}{TP + FN}$$
How good it is at detecting positives
$$\mathrm{Specificity} = \frac{TN}{TN + FP}$$
How good it is at avoiding false alarms
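As a minimal sketch, the four measures can be computed directly from the cells of a confusion matrix. The counts below are made-up illustrative values, not results from any experiment in this workshop. Specificity is computed with the standard definition TN / (TN + FP).

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute the slide's metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,      # share of correct predictions
        "precision": tp / (tp + fp),        # predicted positives that are real
        "recall": tp / (tp + fn),           # real positives that were detected
        "specificity": tn / (tn + fp),      # real negatives that were detected
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```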

https://azure.microsoft.com/en-gb/pricing/details/machine-learning/

K-Nearest Neighbours Algorithm
• Object is classified by a majority vote of its k nearest neighbours
• k – algorithm parameter
• Distance metrics: Euclidean (continuous variables), Hamming (text)
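A minimal sketch of the algorithm, assuming Euclidean distance and k = 3 (both are choices made for this illustration). The training rows are the five entries from the data-based classification table, and the query is the entry F1: 31, F2: 4.

```python
from collections import Counter
import math

# (features, class) pairs taken from the data-based classification table
TRAIN = [
    ((3, 47), "A"),
    ((24, 2), "B"),
    ((17, 3), "B"),
    ((5, 51), "A"),
    ((7, 42), "A"),
]


def knn_predict(train, query, k=3):
    """Return the majority class among the k training rows closest to query."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


predicted = knn_predict(TRAIN, (31, 4), k=3)
print(predicted)  # -> B (two of the three nearest rows are class B)
```

With only five rows, k = 5 would always return the overall majority class, which is why k is treated as a parameter to tune.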

Naïve Bayes classifier
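$$p(C_k \mid \mathbf{x}) = \frac{p(C_k)\, p(\mathbf{x} \mid C_k)}{p(\mathbf{x})}$$
$$\mathbf{x} = (x_1, \dots, x_k)$$
posterior: $p(C_k \mid x_1, \dots, x_k)$
prior: $p(C_k)$
likelihood: $p(\mathbf{x} \mid C_k)$
evidence: $p(\mathbf{x})$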

Naïve Bayes example
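Training data:
Sex     Height  Weight  Foot size
Male    6       190     11
Male    6.2     170     10
Female  5       130     6
…       …       …       …
Entry to classify:
Sex  Height  Weight  Foot size
?    5.9     140     8
$$p(\mathrm{male} \mid \mathbf{x}) = \frac{p(\mathrm{male})\, p(5.9 \mid \mathrm{male})\, p(140 \mid \mathrm{male})\, p(8 \mid \mathrm{male})}{\mathrm{evidence}}$$
$$p(\mathrm{female} \mid \mathbf{x}) = \frac{p(\mathrm{female})\, p(5.9 \mid \mathrm{female})\, p(140 \mid \mathrm{female})\, p(8 \mid \mathrm{female})}{\mathrm{evidence}}$$
$$\mathrm{evidence} = p(\mathrm{male})\, p(5.9 \mid \mathrm{male})\, p(140 \mid \mathrm{male})\, p(8 \mid \mathrm{male}) + p(\mathrm{female})\, p(5.9 \mid \mathrm{female})\, p(140 \mid \mathrm{female})\, p(8 \mid \mathrm{female})$$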
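The calculation can be sketched as follows. Two things here are assumptions: each per-feature likelihood is modelled as a Gaussian (the slides do not say which distribution is used), and the class means and variances in `STATS` are hypothetical placeholders, not values estimated from the table, which is only partially shown.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution, used as p(feature | class)."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def posteriors(priors, stats, x):
    """Return p(class | x) for every class, normalised by the evidence."""
    joint = {}
    for cls, prior in priors.items():
        likelihood = 1.0
        for value, (mean, var) in zip(x, stats[cls]):
            likelihood *= gaussian_pdf(value, mean, var)
        joint[cls] = prior * likelihood
    evidence = sum(joint.values())  # the shared denominator
    return {cls: value / evidence for cls, value in joint.items()}


# HYPOTHETICAL (mean, variance) per feature: height, weight, foot size.
STATS = {
    "male": [(5.9, 0.04), (176.0, 120.0), (11.0, 1.0)],
    "female": [(5.4, 0.1), (132.0, 550.0), (7.5, 1.7)],
}
PRIORS = {"male": 0.5, "female": 0.5}

result = posteriors(PRIORS, STATS, (5.9, 140, 8))
print(result)
```

Because both posteriors share the same evidence term, comparing the numerators alone is enough to pick the class; the division only matters when calibrated probabilities are needed.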

Logistic regression
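$$z = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k$$
$$y = \begin{cases} 1 & \text{for } z > 0 \\ 0 & \text{for } z < 0 \end{cases}$$
$$y = \begin{cases} 1 & \text{for } \phi(z) > 0.5 \\ 0 & \text{for } \phi(z) < 0.5 \end{cases}$$
Logistic function: $\phi(z) = \frac{1}{1 + e^{-z}}$
Coefficients: $\beta_0, \dots, \beta_k$
Best fit of β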
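A short sketch of the decision rule, showing that thresholding z at 0 and thresholding the logistic function at 0.5 give the same answer. The coefficients in `BETAS` are hypothetical; in practice they come from fitting the model to training data.

```python
import math


def logistic(z):
    """Logistic (sigmoid) function, mapping any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(betas, xs):
    """Return (class, z) for feature values xs; betas[0] is the intercept."""
    z = betas[0] + sum(b * x for b, x in zip(betas[1:], xs))
    label = 1 if logistic(z) > 0.5 else 0
    return label, z


# HYPOTHETICAL coefficients: intercept, then one weight per feature.
BETAS = [-1.0, 0.5, 0.25]

label_a, z_a = predict(BETAS, [4, 2])   # z = -1 + 2 + 0.5 = 1.5
label_b, z_b = predict(BETAS, [1, 0])   # z = -1 + 0.5 = -0.5
print(label_a, z_a, label_b, z_b)
```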

Decision trees
• Use the information gain and
entropy
• Finding the feature that best
splits the dataset
• Build the tree
• Prune the tree
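Entropy and information gain can be sketched as below, reusing the five rows from the data-based classification table. The split threshold (Feature 1 <= 10) is a hypothetical choice for illustration; a tree learner would evaluate many candidate splits and keep the one with the highest gain.

```python
from collections import Counter
import math

# (Feature 1, Feature 2, Class) rows from the data-based classification table
ROWS = [(3, 47, "A"), (24, 2, "B"), (17, 3, "B"), (5, 51, "A"), (7, 42, "A")]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, feature, threshold):
    """Entropy reduction from splitting rows on rows[feature] <= threshold."""
    parent = [row[-1] for row in rows]
    left = [row[-1] for row in rows if row[feature] <= threshold]
    right = [row[-1] for row in rows if row[feature] > threshold]
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(rows)
    return entropy(parent) - weighted


parent_entropy = entropy([row[-1] for row in ROWS])
gain = information_gain(ROWS, feature=0, threshold=10)
print(round(parent_entropy, 3), round(gain, 3))
```

Here the split separates the classes perfectly, so the gain equals the entropy of the unsplit data.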

Task: Adult Census Income Prediction
• Built-in dataset sample
• Data exploration
• Classification statement
• Data split
• Training
• Performance evaluation
• Results visualisation
https://archive.ics.uci.edu/ml/datasets/census+income

Task: Data
preparation
• Data exploration
• Missing data
• Feature selection

Publishing the
experiment
Demo
API

Task: Publishing
income prediction
• Set up predictive experiment
• Set up the Web Service
• Deploy the Web Service
• Additionally:
• Remove income from the request
• Only return Scores

Azure ML data sources
• Built-in datasets
• Uploaded data
• Import Data module:
• Web URL via HTTP
• Hive Query
• SQL Database (Azure SQL or Azure VM)
• Azure Table
• Azure Blob Storage
• Data Feed Provider (OData)
• Azure CosmosDB

Task: Upload
dataset
• Download the Prestige.csv file
• Add dataset to Azure ML Studio
• Upload the downloaded file

Regression problem
• Dependent value
• Predicting the real value
• Fitting the coefficients
• Analytical solutions
• Gradient descent
$$f(\mathbf{x}) = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k$$

Ordinary linear regression
Residual sum of squares (RSS)
$$S(w) = \sum_{i=1}^{n} (y_i - x_i^T w)^2 = (y - Xw)^T (y - Xw)$$
$$\hat{w} = \arg\min_{w} S(w)$$
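For a single feature plus an intercept, the minimiser of the residual sum of squares has a simple closed form, sketched below in plain Python. The data points are made up for illustration; the general multi-feature case needs a linear algebra library and is not shown.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for one feature (analytical solution)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def rss(xs, ys, intercept, slope):
    """Residual sum of squares S(w) for the line y = intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# HYPOTHETICAL data, roughly following y = 1 + 2x.
XS = [1, 2, 3, 4]
YS = [3.1, 4.9, 7.2, 8.8]

intercept, slope = fit_line(XS, YS)
best_rss = rss(XS, YS, intercept, slope)
print(intercept, slope, best_rss)
```

Moving either coefficient away from the fitted value can only increase the RSS, which is what the arg min expresses.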

Evaluation methods for regression
• Errors
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (f_i - y_i)^2}{n}}$$
$$R^2 = 1 - \frac{\sum_{i} (f_i - y_i)^2}{\sum_{i} (\bar{y} - y_i)^2}$$
• Statistics (t, ANOVA)
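Both error measures are a few lines of Python. In this sketch `predicted` plays the role of the fitted values f and `actual` the observed values y; the numbers are made up for illustration.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between fitted and observed values."""
    squared = sum((f - y) ** 2 for f, y in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def r_squared(predicted, actual):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((f - y) ** 2 for f, y in zip(predicted, actual))
    ss_tot = sum((mean_y - y) ** 2 for y in actual)
    return 1 - ss_res / ss_tot


# HYPOTHETICAL observed and fitted values.
actual = [3, 5, 7, 9]
predicted = [2, 5, 8, 9]

error = rmse(predicted, actual)        # sqrt(2 / 4)
fit = r_squared(predicted, actual)     # 1 - 2 / 20
print(error, fit)
```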

Residuals vs Fitted
• Check if residuals have non-linear patterns
• Check if the model captures the non-linear relationship
• Should show equally spread residuals around the horizontal line

Normal Q-Q
• Shows whether the residuals are normally distributed
• Points should lie along the straight dashed line
• Check that residuals do not deviate severely from it

Scale-Location
• Shows whether residuals are spread equally along the range of predictors
• Tests the assumption of equal variance (homoscedasticity)
• Should show a horizontal line with equally (randomly) spread points

Residuals vs Leverage
• Helps to find influential cases
• Cases outside the Cook’s distance lines are influential
• With no influential cases the Cook’s distance lines should be barely visible

Task: Prestige EDA
• Descriptive statistics (dimensions,
rows, columns, data types,
correlation)
• Distributions, correlations, outliers
• Handle missing data
• Features significance

Categorical data for regression
• Categories: A, B, C are coded as dummy variables
• In general, if the variable has k categories it will be encoded into k-1 dummy variables
Category  V1  V2
A         0   0
B         1   0
C         0   1
$$f(\mathbf{x}) = \beta_0 + \beta_1 x_1 + \dots + \beta_j x_j + \beta_{j+1} v_1 + \dots + \beta_{j+k-1} v_{k-1}$$
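A minimal sketch of the coding scheme: the first category acts as the baseline (all zeros) and each remaining category gets its own 0/1 column, so k categories yield k-1 dummy variables. The function name `dummy_encode` is made up for this example.

```python
def dummy_encode(value, categories):
    """Encode value as k-1 dummy variables; categories[0] is the baseline."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    # One indicator per non-baseline category.
    return [1 if value == category else 0 for category in categories[1:]]


CATEGORIES = ["A", "B", "C"]
encoded = {c: dummy_encode(c, CATEGORIES) for c in CATEGORIES}
print(encoded)  # -> {'A': [0, 0], 'B': [1, 0], 'C': [0, 1]}
```

Dropping one column avoids redundancy: with an intercept in the model, a full set of k indicators would always sum to 1 and could not be told apart from the intercept.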

Categorical data for regression
$$f(x) = \beta_0 + \beta_1 x + \beta_2 v_1 + \dots + \beta_k v_{k-1} + \beta_{k+1} v_1 x + \dots + \beta_{2k-1} v_{k-1} x$$
y ~ x + cat + x:cat

Task: Prestige
Regression
• Numeric and categorical features
• Linear regression training
• Algorithm evaluation
• Set Up the Web Service

Task: Cross-validation
• Use income prediction classification
• Replace the train/test data split with cross-validation
• Algorithm evaluation
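Outside the Studio, the idea behind k-fold cross-validation can be sketched with plain row indices: the data is cut into k folds, and each fold serves once as the test set while the rest is used for training. This sketch assigns rows to folds in a fixed round-robin order and does no shuffling or stratification, which a real setup would usually add.

```python
def k_fold_indices(n_rows, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    # Row i goes to fold i % k, so fold sizes differ by at most one.
    folds = [list(range(i, n_rows, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = sorted(
            index for j, fold in enumerate(folds) if j != i for index in fold
        )
        yield train, test


splits = list(k_fold_indices(n_rows=10, k=5))
for train, test in splits:
    print("train:", train, "test:", test)
```

The model is trained and scored once per split, and the k scores are averaged to give a more stable estimate than a single train/test split.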

Machine Learning Tuning
• Data preparation
• Data cleansing
• Normalisation
• Removing/Adding duplicates
• Algorithms
• Comparing different methods
• Adjusting algorithm to the
problem
• Hyperparameters

Task: Tuning
• Tune the Income Classification
problem
• Use Decision Tree classification
algorithm
• Tune the parameters using range
of values
• Performance evaluation
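Tuning over a range of values amounts to a grid search: try every combination of candidate settings and keep the one with the best score. In this sketch the parameter names (`max_depth`, `min_samples_leaf`) and the scoring function `toy_score` are hypothetical stand-ins; they are not the settings of any Azure ML Studio module, and a real score would come from training and evaluating a model.

```python
from itertools import product


def grid_search(score_fn, grid):
    """Evaluate score_fn on every parameter combination; return the best."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(max_depth, min_samples_leaf):
    """HYPOTHETICAL stand-in for 'train a model and measure its accuracy'."""
    return 1.0 - 0.01 * (max_depth - 6) ** 2 - 0.02 * abs(min_samples_leaf - 5)


GRID = {"max_depth": [2, 4, 6, 8], "min_samples_leaf": [1, 5, 10]}
best_params, best_score = grid_search(toy_score, GRID)
print(best_params, best_score)
```

The number of combinations grows multiplicatively with each added parameter, so ranges are usually kept short or sampled at random.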

Task: Compare different algorithms
• Use Income prediction experiment
• Use four different classification algorithms
• Compare the algorithms’ performance

Exploratory Data Analysis
• Descriptive statistics
(dimensions, rows, columns,
data types, correlation)
• Data visualization (distributions,
outliers)
• Missing data
• Duplicate data
• Data transformations
• Features significance

Task: Flight delays EDA
• Dataset EDA
• Built-in datasets
• Join Airport codes & Airport names
• Join Weather dataset
• Set up categorical data
• Clean missing data
• Check for duplicates

Task: Flight delays predictions
• Remove target-leaking features
• Classification problem
• Define the target value
• Train the model
• Regression problem
• Define the target value
• Use linear regression

Customising the process
• Programming languages: R &
Python
• R Scripts
• R Models
• Python Scripts

R Script
# Map 1-based optional input ports to variables
dataset1 <- maml.mapInputPort(1) # class: data.frame
dataset2 <- maml.mapInputPort(2) # class: data.frame
# Contents of optional Zip port are in ./src/
# source("src/yourfile.R");
# load("src/yourData.rdata");
# Sample operation
data.set = rbind(dataset1, dataset2);
# You'll see this output in the R Device port.
# It'll have your stdout, stderr and PNG graphics device(s).
plot(data.set);
# Select data.frame to be sent to the output Dataset port
maml.mapOutputPort("data.set");

Python Script
# The script MUST contain a function named azureml_main
# which is the entry point for this module.
# imports up here can be used to
import pandas as pd
# The entry point function can contain up to two input arguments:
#   Param<dataframe1>: a pandas.DataFrame
#   Param<dataframe2>: a pandas.DataFrame
def azureml_main(dataframe1 = None, dataframe2 = None):
    # Execution logic goes here
    print('Input pandas.DataFrame #1:\r\n\r\n{0}'.format(dataframe1))
    # If a zip file is connected to the third input port,
    # it is unzipped under ".\Script Bundle". This directory is added
    # to sys.path. Therefore, if your zip file contains a Python file
    # mymodule.py you can import it using:
    # import mymodule
    # Return value must be a sequence of pandas.DataFrame
    return dataframe1,

R model: Trainer
# Input: dataset
# Output: model
# The code below is an example which can be replaced with your own code.
# See the help page of "Create R Model" module for the list of predefined
# functions and constants.
library(e1071)
features <- get.feature.columns(dataset)
labels <- as.factor(get.label.column(dataset))
train.data <- data.frame(features, labels)
feature.names <- get.feature.column.names(dataset)
names(train.data) <- c(feature.names, "Class")
model <- naiveBayes(Class ~ ., train.data)

R model: Scorer
# Input: model, dataset
# Output: scores
# The code below is an example which can be replaced with your own code.
# See the help page of "Create R Model" module for the list of predefined
# functions and constants.
library(e1071)
probabilities <- predict(model, dataset, type="raw")[,2]
classes <- as.factor(as.numeric(probabilities >= 0.5))
scores <- data.frame(classes, probabilities)

Hierarchical clustering
• Decision of where the cluster should be split
• Metric: distance between pairs of observations
• Linkage criterion: dissimilarity of sets
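The roles of the metric and the linkage criterion can be sketched with a small bottom-up (agglomerative) implementation, which repeatedly merges the two closest clusters rather than splitting from the top. Euclidean distance and single linkage are assumptions made for this example; the points are the feature pairs from the data-based classification table.

```python
import math


def single_linkage(cluster_a, cluster_b):
    """Linkage criterion: distance between the two closest members."""
    return min(math.dist(p, q) for p in cluster_a for q in cluster_b)


def agglomerate(points, n_clusters):
    """Merge the closest pair of clusters until n_clusters remain."""
    clusters = [[p] for p in points]  # start with one cluster per point
    while len(clusters) > n_clusters:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(
            pairs, key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]])
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


POINTS = [(3, 47), (24, 2), (17, 3), (5, 51), (7, 42)]
clusters = agglomerate(POINTS, n_clusters=2)
print(clusters)
```

On this data the two clusters coincide with classes A and B from the table, although the algorithm never sees the labels.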

Evaluation methods for clustering
• Sum of squares
• Class based measures
• Underlying true
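Two of these measures can be sketched briefly. The within-cluster sum of squares needs only the points, while purity is one example of a class-based measure and needs the true class of each point. Purity is chosen here as an illustration; the slides do not name a specific class-based measure, and the inputs below are made up.

```python
from collections import Counter


def within_cluster_ss(clusters):
    """Sum of squared distances from each point to its cluster centroid."""
    total = 0.0
    for cluster in clusters:
        dims = range(len(cluster[0]))
        centroid = [sum(p[d] for p in cluster) / len(cluster) for d in dims]
        total += sum((p[d] - centroid[d]) ** 2 for p in cluster for d in dims)
    return total


def purity(cluster_labels):
    """Share of points that belong to the majority class of their cluster."""
    total = sum(len(labels) for labels in cluster_labels)
    majority = sum(Counter(labels).most_common(1)[0][1] for labels in cluster_labels)
    return majority / total


# HYPOTHETICAL clusters of 2-D points and true classes per cluster.
wcss = within_cluster_ss([[(0, 0), (2, 0)], [(5, 5)]])
score = purity([["A", "A", "B"], ["B", "B"]])
print(wcss, score)
```

A lower sum of squares means tighter clusters, and a purity of 1.0 means every cluster contains a single class.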

Task: Income
Clustering
• Use Adult Census Income dataset
• Clustering using k-means
algorithm
• Compare clusters with the original
classes assignments
• Visualise the findings

Cortana Intelligence Gallery
https://gallery.cortanaintelligence.com/

Task: Twitter
sentiment
• Find Twitter sentiment Experiment
• Open the experiment in Azure ML
Studio
• Run the experiment and visualise
the results

Jupyter Notebooks
• Running cells
• Markdown documentation
• Different kernels
• Visualisation

Azure
Notebooks
Demo
https://notebooks.azure.com/

Retraining the model
• Set up Retraining Web Service
• Output node connected with the
saved model
• New training dataset
• Batch execution

Keep in touch
BarbaraFusinska.com
Barbara@Fusinska.com
@BasiaFusinska
