DATA SCIENCE
Convolutional Neural Network
Field of Artificial Intelligence
Field of Machine Learning
Bangladesh University of Professionals(BUP)
Deep Learning
Machine Learning: Input -> Feature Extraction -> Classification -> Output (Car / Not Car)
Deep Learning: Input -> Feature Extraction + Classification -> Output (Car / Not Car)
Neural Network
Modeled loosely on the network of neurons in the human brain
Neural Network
Artificial Neural Network (ANN): Regression and Classification
Convolutional Neural Network (CNN): Computer Vision
Recurrent Neural Network (RNN): Time Series Analysis
[Figure: ANN, CNN and RNN]
Applications of CNN
Convolutional Neural Networks (CNNs) have the potential to transform many industries and improve daily life. They have a wide range of applications, such as:
• Object detection.
• Face recognition.
• Medical imaging.
• Analyzing documents.
• Understanding climate.
• Grey areas.
• Advertising.
• Self-driving cars.
• Security systems.
Applications
• Image description (example output): snow-covered ground, a blue and a white jacket, a black and a gray jacket, two people standing on skis, woman wearing a blue jacket, a blue helmet, a blue helmet on the head, blue jeans on the girl, a man and a snow-covered mountain, a ski lift, white snow on the ground, trees covered in snow, a tree with no leaves.
• Optical Character Recognition (OCR) is designed to convert handwriting into text.
Introduction
Convolutional Neural Networks (CNNs) learn multilevel features and the classifier jointly, and perform much better than traditional approaches on many image classification and segmentation problems.
CNN – What do they learn?
CNN - Components
There are four main components in a CNN:
1. Convolution
2. Non-Linearity
3. Pooling or Sub-Sampling
4. Classification (Fully Connected Layer)
Input
An image is a matrix of pixel values.
In an RGB image, each pixel holds three values, one each for R, G and B.
In a grayscale image, each pixel holds a single value ranging from 0 to 255.
[Figure: what we see vs. what the computer sees]
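As a minimal sketch of the description above, the snippet below stores a tiny grayscale image as a matrix of values between 0 and 255, and a tiny RGB image as a matrix of (R, G, B) triples. The pixel values and the helper names `shape` and `channel` are made up for illustration.

```python
# A 3x3 grayscale image: one intensity per pixel, 0 (black) to 255 (white).
gray = [
    [0, 128, 255],
    [64, 192, 32],
    [255, 0, 16],
]

# A 2x2 RGB image: each pixel holds three values, one per colour channel.
rgb = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]


def shape(image):
    """Return (height, width) of an image stored as a list of rows."""
    return len(image), len(image[0])


def channel(image, index):
    """Extract one colour channel (0=R, 1=G, 2=B) as its own matrix."""
    return [[pixel[index] for pixel in row] for row in image]


print(shape(gray))      # (3, 3)
print(channel(rgb, 0))  # [[255, 0], [0, 255]]
```

This is what "what the computer sees" amounts to: nothing but numbers arranged in rows and columns.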
Convolution
The primary purpose of convolution in a CNN is to extract features from the input image.
[Figure: a 3x3 Filter / Kernel / Feature Detector slides over the binary Image to produce the Convolved Feature / Activation Map / Feature Map. For the highlighted window, the element-wise products 1x1, 1x1, 1x1, 1x0, 1x0, 1x1, 0x0, 0x0 and 0x1 sum to 4.]
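The sliding-window operation described on this slide can be sketched in plain Python as follows. The 5x5 binary image and 3x3 filter are assumed, illustrative values chosen so that the first output value is 4, and the function name `convolve2d` is hypothetical. The sketch uses stride 1 and no padding.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map.

    As in most CNN code, the kernel is not flipped, so strictly speaking
    this is a cross-correlation.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    feature_map = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            # Multiply the window by the kernel element-wise and sum the products.
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map


# Assumed example values (not recoverable with certainty from the slide).
image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)
# [4, 3, 4]
# [2, 4, 3]
# [2, 3, 4]
```

Each entry of the feature map measures how strongly the pattern encoded by the filter is present at that position of the image.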
Convolution (continued)
• The size of the output volume is controlled by three parameters that we need to decide before the convolution step is performed:
• Depth: the number of filters used in the convolution operation. Each filter produces one feature map.
• Stride: the number of pixels by which the filter matrix slides over the input matrix at each step.
• Zero-padding: it is sometimes convenient to pad the input matrix with zeros around the border, so that the filter can be applied to the bordering elements of the input image matrix.
  • With zero-padding: wide convolution
  • Without zero-padding: narrow convolution
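How these three parameters determine the output volume can be sketched with the usual size formula, (input + 2 * padding - filter) / stride + 1 per spatial dimension, with the depth equal to the number of filters. The function names below are hypothetical, and rounding down when the stride does not divide evenly is an assumption of this sketch.

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Side length of the feature map along one spatial dimension."""
    span = input_size + 2 * padding - filter_size
    if span < 0:
        raise ValueError("filter is larger than the padded input")
    # Floor division: positions where the filter would overhang are skipped
    # (an assumption of this sketch).
    return span // stride + 1


def output_volume(input_size, filter_size, num_filters, stride=1, padding=0):
    """(height, width, depth) of the output for a square input and filter."""
    side = conv_output_size(input_size, filter_size, stride, padding)
    return side, side, num_filters


print(conv_output_size(5, 3))                # 3: no padding, output shrinks
print(conv_output_size(5, 3, padding=1))     # 5: zero-padding preserves the size
print(conv_output_size(5, 3, stride=2))      # 2: larger stride, smaller output
print(output_volume(32, 5, num_filters=6))   # (28, 28, 6)
```

The last call uses an assumed depth of 6 filters purely to show that depth sets the third dimension of the output.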
Non-Linearity (ReLU)
• Replaces all negative pixel values in the feature map with zero.
• The purpose of ReLU is to introduce non-linearity into the CNN, since most real-world data is non-linear, whereas convolution is a linear operation.
• Other non-linear functions such as tanh (output range -1 to 1) or sigmoid (output range 0 to 1) can also be used instead of ReLU (output = max(0, input)).
[Figure: the ReLU transfer function, with its kink at (0,0), applied to a 16-value feature map. The negative values -10, -110, -15 and -10 become 0; all other values are unchanged.]
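A minimal sketch of the three activation functions named above, applied element-wise to a small feature map. The values are borrowed from the slide's figure, but their arrangement into rows is assumed, and the helper name `apply_activation` is hypothetical.

```python
import math


def relu(x):
    """Keep positive values, replace negative values with zero."""
    return max(0, x)


def sigmoid(x):
    """Squash any real value into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))


def apply_activation(fn, feature_map):
    """Apply an activation function to every element of a feature map."""
    return [[fn(value) for value in row] for row in feature_map]


feature_map = [
    [15, 20, -10, 35],
    [18, -110, 25, 100],
]

print(apply_activation(relu, feature_map))
# [[15, 20, 0, 35], [18, 0, 25, 100]]

tanh_map = apply_activation(math.tanh, feature_map)     # values in -1 to 1
sigmoid_map = apply_activation(sigmoid, feature_map)    # values in 0 to 1
```

Unlike tanh and sigmoid, ReLU leaves positive values untouched, so large activations are not squashed.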
Pooling
Pooling reduces the dimensionality of each feature map but retains the most important information. Pooling can be of different types: Max, Average, Sum, etc.
[Figure: pooling over 2x2 regions of a feature map]
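The pooling types listed above differ only in how each 2x2 region is reduced to a single number. The sketch below makes that explicit. The feature-map values, the function name `pool2d` and the choice of non-overlapping windows (stride equal to the window size) are assumptions for illustration.

```python
def pool2d(feature_map, size=2, stride=2, reducer=max):
    """Reduce each size x size window of the feature map to a single value."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        out_row = []
        for c in range(0, cols - size + 1, stride):
            window = [
                feature_map[r + i][c + j]
                for i in range(size)
                for j in range(size)
            ]
            out_row.append(reducer(window))
        pooled.append(out_row)
    return pooled


def mean(values):
    return sum(values) / len(values)


# Assumed example values.
feature_map = [
    [1, 1, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]

print(pool2d(feature_map, reducer=max))   # [[6, 8], [3, 4]]
print(pool2d(feature_map, reducer=sum))   # [[13, 21], [8, 8]]
print(pool2d(feature_map, reducer=mean))  # [[3.25, 5.25], [2.0, 2.0]]
```

In every case a 4x4 map shrinks to 2x2, which is the dimensionality reduction the slide refers to.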
Story so far…
• Together, the convolution, ReLU and pooling layers extract the useful features from the images.
• The output from the convolutional and pooling layers represents high-level features of the input image.
Feature extraction: input 32x32 -> 5x5 Convolution -> C1 feature maps 28x28 -> 2x2 Subsampling -> S1 feature maps 14x14 -> 5x5 Convolution -> C2 feature maps 10x10 -> 2x2 Subsampling -> High-level features
Fully Connected Layer
• A traditional Multi-Layer Perceptron.
• The term "Fully Connected" implies that every neuron in the previous layer is connected to every neuron in the next layer.
• Their activations can hence be computed with a matrix multiplication followed by a bias offset.
• The purpose of the fully connected layer is to use the high-level features to classify the input image into various classes based on the training dataset.
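The "matrix multiplication followed by a bias offset" can be sketched directly. The feature values, weights, biases and the function name `fully_connected` below are made up for illustration. In a trained network the weights and biases would be learned.

```python
def fully_connected(inputs, weights, biases):
    """Compute outputs[j] = sum_i inputs[i] * weights[i][j] + biases[j].

    Every input is connected to every output, hence "fully connected".
    """
    return [
        sum(x * w_row[j] for x, w_row in zip(inputs, weights)) + biases[j]
        for j in range(len(biases))
    ]


# Three high-level features feeding two output neurons (assumed values).
features = [1.0, 2.0, 3.0]
weights = [
    [0.5, -1.0],  # weights leaving feature 0
    [0.0, 1.0],   # weights leaving feature 1
    [1.0, 0.5],   # weights leaving feature 2
]
biases = [0.5, -0.5]

print(fully_connected(features, weights, biases))  # [4.0, 2.0]
```

The neuron with the larger output would correspond to the more likely class for these features.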
Overall CNN Architecture
Feature extraction: input 32x32 -> 5x5 Convolution -> C1 feature maps 28x28 -> 2x2 Subsampling -> S1 feature maps 14x14 -> 5x5 Convolution -> C2 feature maps 10x10 -> 2x2 Subsampling -> S2 feature maps 5x5
Classification: Fully Connected layers n1 -> n2 -> output (classes 0, 1, ..., 8, 9)
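The feature-map sizes in this architecture can be checked with a short trace, assuming that every convolution uses stride 1 without padding and every subsampling step uses non-overlapping windows. The function name `trace_shapes` and the count of 16 S2 feature maps are assumptions, since the slide does not state how many maps each layer has.

```python
def trace_shapes(input_size, layers):
    """Return the feature-map side length after each layer.

    `layers` is a list of (kind, window) pairs, where kind is "conv"
    (stride 1, no padding) or "pool" (non-overlapping windows).
    """
    sizes = [input_size]
    for kind, window in layers:
        current = sizes[-1]
        if kind == "conv":
            sizes.append(current - window + 1)
        elif kind == "pool":
            sizes.append(current // window)
        else:
            raise ValueError(f"unknown layer kind: {kind}")
    return sizes


layers = [("conv", 5), ("pool", 2), ("conv", 5), ("pool", 2)]
sizes = trace_shapes(32, layers)
print(sizes)  # [32, 28, 14, 10, 5]

# Number of values handed to the fully connected layers,
# assuming 16 feature maps at S2 (hypothetical count).
num_s2_maps = 16
print(num_s2_maps * sizes[-1] ** 2)  # 400
```

The trace reproduces the 32x32, 28x28, 14x14, 10x10 and 5x5 sizes labelled C1, S1, C2 and S2 above.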
Putting it all together – Training using Backpropagation
Step-1: We initialize all filters and parameters / weights with random values.
Step-2: The network takes a training image as input, goes through the forward propagation step (convolution, ReLU and pooling operations along with forward propagation in the fully connected layer) and finds the output probabilities for each class.
• Let's say the output probabilities for the boat image are [0.2, 0.4, 0.1, 0.3].
• Since the weights are randomly assigned for the first training example, the output probabilities are also random.
Step-3: Calculate the total error at the output layer (summation over all 4 classes).
Step-4: Use backpropagation to calculate the gradients of the error with respect to all weights in the network, and use gradient descent to update all filter values / weights and parameter values to minimize the output error.
• The weights are adjusted in proportion to their contribution to the total error.
• When the same image is input again, the output probabilities might now be [0.1, 0.1, 0.7, 0.1], which is closer to the target vector [0, 0, 1, 0].
• This means that the network has learnt to classify this particular image correctly by adjusting its weights / filters such that the output error is reduced.
• Hyperparameters such as the number of filters, the filter sizes and the architecture of the network are all fixed before Step 1 and do not change during training. Only the values of the filter matrices and connection weights get updated.
Step-5: Repeat steps 2-4 with all images in the training set.
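Steps 3 and 4 can be illustrated with a toy sketch. It assumes the total error is half the summed squared difference between target and output probabilities (the slides do not give the formula), and it replaces the whole network by four adjustable scores passed through a softmax. These scores stand in for the real filters and weights, so this is not backpropagation through a full CNN. All function names are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def total_error(target, output):
    """Step 3: half the squared difference, summed over all classes (assumed form)."""
    return 0.5 * sum((t - o) ** 2 for t, o in zip(target, output))


def error_gradient(target, probs):
    """Step 4: gradient of total_error with respect to the scores (chain rule)."""
    weighted = sum((p - t) * p for p, t in zip(probs, target))
    return [p * ((p - t) - weighted) for p, t in zip(probs, target)]


target = [0.0, 0.0, 1.0, 0.0]
print(total_error(target, [0.2, 0.4, 0.1, 0.3]))  # about 0.55
print(total_error(target, [0.1, 0.1, 0.7, 0.1]))  # about 0.06

# Start from scores that reproduce the slide's initial probabilities.
scores = [math.log(p) for p in [0.2, 0.4, 0.1, 0.3]]
learning_rate = 0.5
errors = []
for step in range(300):
    probs = softmax(scores)
    errors.append(total_error(target, probs))
    grads = error_gradient(target, probs)
    # Gradient descent: move each score against its gradient.
    scores = [s - learning_rate * g for s, g in zip(scores, grads)]

probs = softmax(scores)
print(errors[0] > errors[-1])    # True: the error has been reduced
print(probs.index(max(probs)))   # 2: the target class is now the most likely
```

The two error values show why [0.1, 0.1, 0.7, 0.1] counts as closer to the target than [0.2, 0.4, 0.1, 0.3].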
Types of CNN Architectures
Year  CNN Architecture  Developed By
1998  LeNet             Yann LeCun et al.
2012  AlexNet           Alex Krizhevsky, Geoffrey Hinton and Ilya Sutskever
2013  ZFNet             Matthew Zeiler and Rob Fergus
2014  GoogLeNet         Google
2014  VGGNet            Simonyan and Zisserman
2015  ResNet            Kaiming He et al.
2017  DenseNet          Gao Huang, Zhuang Liu, Laurens van der Maaten and Kilian Q. Weinberger
GoogLeNet
Limitation
CNNs also have some drawbacks that limit their performance and
applicability. One of the main disadvantages of CNNs is that they require
a large amount of labeled data to train effectively, which can be costly and
time-consuming to obtain and annotate.
Further limitations shown on the slide:
• Coordinate frame.
• Classification of images with different positions.
[Figure: illustrates the dismantled components of a face]