Convolution Neural Network Lecture Slides

CS295: Modern Systems:
Application Case Study
Neural Network Accelerator
Sang-Woo Jun
Spring 2019
Many slides adapted from
Hyoukjun Kwon's Gatech “Designing CNN Accelerators”

Usefulness of Deep Neural Networks
 No need to further emphasize the obvious

Convolutional Neural Network for
Image/Video Recognition

ImageNet Top-5 Classification Accuracy
Over the Years
image-net.org “ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2017,” 2017
AlexNet, The Beginning
ImageNet holds over 15 million labeled images; the ImageNet challenge uses a 1000-class subset
“The first* fast**
GPU-accelerated Deep Convolutional Neural Network
to win an image recognition contest”

Convolutional Neural Networks Overview
[Figure: a stack of Convolution Layers (“Convolution”) followed by a stack of Fully Connected Layers (“Neural Network”), producing class probabilities]
Example output: goldfish: 0.002%, shark: 0.08%, magpie: 0.02%, …, Palace: 89%, …, Paper towel: 1.4%, Spatula: 0.001%, …

Training vs. Inference
 Training: Tuning parameters using training data
o Backpropagation using stochastic gradient descent is the most popular algorithm
o Training in data centers and distributing the trained model is a common approach*
o Because training algorithms change rapidly, GPU clusters are the most popular
hardware (low demand for application-specific accelerators)
 Inference: Determining the class of new input data
o Using a trained model, determine the class of a new input
o Inference usually occurs close to clients
o Low latency and power efficiency are required
(high demand for application-specific accelerators)

Deep Neural Networks (“Fully Connected”*)
Chris Edwards, “Deep Learning Hunts for Signals Among the Noise,” Communications of the ACM, June 2018
 Each layer may have a different number of neurons
goldfish: 0.002%
Palace: 89%
Paper towel: 1.4%
Spatula: 0.001%

An Artificial Neuron
 Effectively a weight vector multiplied by an input vector to obtain a scalar
 May apply an activation function to the output
o Adds non-linearity
[Figure: activation functions: Sigmoid and Rectified Linear Unit (ReLU)]
Jed Fox, “Neural Networks 101,” 2017
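The neuron described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the function names, the optional bias term (defaulting to zero), and the example weights and inputs are all assumptions chosen for readability.

```python
import math

def sigmoid(z):
    """Sigmoid activation: squashes any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    """Rectified Linear Unit: passes positive values, clamps negatives to 0."""
    return max(0.0, z)

def neuron(weights, inputs, activation=None, bias=0.0):
    """Dot product of a weight vector and an input vector, giving a scalar.

    The bias term is an assumption (the slide only mentions weights);
    it defaults to zero so the result is the plain dot product.
    """
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z) if activation else z

# Hypothetical values, chosen only to make the arithmetic easy to follow.
weights = [0.5, -1.0, 2.0]
inputs = [1.0, 2.0, 0.5]

pre_activation = neuron(weights, inputs)        # 0.5 - 2.0 + 1.0 = -0.5
with_relu = neuron(weights, inputs, relu)       # negative input is clamped to 0.0
with_sigmoid = neuron(weights, inputs, sigmoid) # a value strictly between 0 and 1
```

Without the activation function, stacking such neurons would only compose linear maps, which is why the non-linearity matters.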

Convolution Layer
[Figure: a convolution layer producing a 4x4 feature map, followed by an optional pooling layer that reduces it to 2x2]

Convolution Example
Input map × Convolution Filter = Output map
Convolution filter:
1 2 3
-2 0 -1
5 -2 4
Top-left 3x3 window of the input map:
0 1 0
2 4 3
5 2 7
Channel partial sum[0][0] =
1 x 0 + 2 x 1 + 3 x 0
+ (-2) x 2 + 0 x 4 + (-1) x 3
+ 5 x 5 + (-2) x 2 + 4 x 7
= 44
The output map begins 44, -1, …
Typically, zero padding is added to the source matrix so that the output map keeps the input's dimensions.
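The partial sum in this example can be checked with a short Python sketch. The filter and the 3x3 input window are taken from the slide; the function name `conv2d`, the `pad` parameter, and the choice to treat the window as a complete input map are assumptions made for illustration. As in the slide, the filter is applied without flipping.

```python
def conv2d(image, kernel, pad=0):
    """Slide a square kernel over a 2D map (stride 1), multiply and accumulate.

    `pad` rings of zeros are added around the image first; with
    pad = (kernel size - 1) / 2 the output keeps the input's dimensions.
    """
    r = len(kernel)
    h, w = len(image), len(image[0])
    # Build the zero-padded copy of the input.
    padded = [[0] * (w + 2 * pad) for _ in range(h + 2 * pad)]
    for y in range(h):
        for x in range(w):
            padded[y + pad][x + pad] = image[y][x]
    out_h = h + 2 * pad - r + 1
    out_w = w + 2 * pad - r + 1
    out = [[0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            for j in range(r):       # kernel row
                for i in range(r):   # kernel column
                    out[y][x] += kernel[j][i] * padded[y + j][x + i]
    return out

kernel = [[1, 2, 3],
          [-2, 0, -1],
          [5, -2, 4]]
window = [[0, 1, 0],
          [2, 4, 3],
          [5, 2, 7]]

no_padding = conv2d(window, kernel)        # a single value: the slide's partial sum
same_size = conv2d(window, kernel, pad=1)  # 3x3 output; the centre is that same sum
```

Without padding, a 3x3 input and a 3x3 filter leave exactly one valid position, so the output collapses to one value. With one ring of zeros the output is 3x3 again, which is the dimension-preserving behavior the slide refers to.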

Multidimensional Convolution
 A “feature map” usually has multiple layers
o An image has R, G, B layers, or “channels”
 One convolution layer has many convolution filters, which together create a multichannel
output map
[Figure: Input feature map × 3x3x3 filter (one 3x3 slice shown: 1 2 3 / -2 0 -1 / 5 -2 4) = Output feature map]

Multiple Convolutions
[Figure: the input feature map convolved with Filter 0 gives Output feature map 0; convolved with Filter 1, it gives Output feature map 1]
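Since each filter yields one output channel, the shape of a layer's output follows from the filter count and the usual sliding-window size formula. The sketch below is illustrative only: the helper names, the 32x32 RGB input, and the stride and padding parameters are assumptions, not values from the slides. The last line reuses the "96 11x11x3 kernels" figure quoted for AlexNet's first layer.

```python
def conv_output_shape(in_shape, num_filters, filter_size, pad=0, stride=1):
    """Return (channels, height, width) of a convolution layer's output.

    in_shape is (channels, height, width). Every filter spans all input
    channels and contributes exactly one output channel.
    """
    _, h, w = in_shape
    out_h = (h + 2 * pad - filter_size) // stride + 1
    out_w = (w + 2 * pad - filter_size) // stride + 1
    return (num_filters, out_h, out_w)

def conv_weight_count(in_channels, num_filters, filter_size):
    """Number of weights in a layer of square filters (biases not counted)."""
    return num_filters * in_channels * filter_size * filter_size

# Hypothetical 32x32 RGB input with two 3x3x3 filters and one ring of padding.
two_filter_shape = conv_output_shape((3, 32, 32), num_filters=2, filter_size=3, pad=1)
two_filter_weights = conv_weight_count(in_channels=3, num_filters=2, filter_size=3)

# 96 kernels of size 11x11x3.
first_layer_weights = conv_weight_count(in_channels=3, num_filters=96, filter_size=11)
```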

Example Learned Convolution Filters
Alex Krizhevsky et al., “ImageNet Classification with Deep Convolutional Neural Networks,” NIPS, 2012

Multidimensional Convolution
Image found online. Original source unknown

Computation in the Convolution Layer
// O must be zero-initialized. I is assumed zero-padded to (H+R-1) x (H+R-1)
// so that the H x H output keeps the dimensions of the unpadded input.
for (n = 0; n < N; n++) {              // Input feature maps (IFMaps)
  for (m = 0; m < M; m++) {            // Weight filters
    for (c = 0; c < C; c++) {          // IFMap/weight channels
      for (y = 0; y < H; y++) {        // Feature map row
        for (x = 0; x < H; x++) {      // Feature map column
          for (j = 0; j < R; j++) {    // Weight filter row
            for (i = 0; i < R; i++) {  // Weight filter column
              O[n][m][y][x] += W[m][c][j][i] * I[n][c][y+j][x+i];
}}}}}}}
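A runnable Python rendering of this loop nest may help when experimenting with small inputs. It is a sketch under stated assumptions: the function name `conv_layer` and the nested-list layout are illustrative, stride is 1, and the input is used as given without padding, so the output shrinks to E = H - R + 1 per side. Pad the input beforehand to keep the original size.

```python
def conv_layer(ifmaps, weights):
    """Direct 7-loop convolution.

    ifmaps:  [N][C][H][H] input feature maps
    weights: [M][C][R][R] filters
    returns: [N][M][E][E] output feature maps, with E = H - R + 1
    """
    N, C, H = len(ifmaps), len(ifmaps[0]), len(ifmaps[0][0])
    M, R = len(weights), len(weights[0][0])
    E = H - R + 1
    out = [[[[0] * E for _ in range(E)] for _ in range(M)] for _ in range(N)]
    for n in range(N):                          # input feature maps
        for m in range(M):                      # weight filters
            for c in range(C):                  # channels
                for y in range(E):              # output row
                    for x in range(E):          # output column
                        for j in range(R):      # filter row
                            for i in range(R):  # filter column
                                out[n][m][y][x] += (
                                    weights[m][c][j][i] * ifmaps[n][c][y + j][x + i]
                                )
    return out

# Hypothetical data: one input with two 3x3 channels, two 2x2 filters.
ifmaps = [[[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
           [[1, 0, 1], [0, 1, 0], [1, 0, 1]]]]
weights = [[[[1, 0], [0, 1]], [[1, 1], [1, 1]]],
           [[[0, 1], [1, 0]], [[0, 0], [0, 0]]]]
ofmaps = conv_layer(ifmaps, weights)
```

Every output value is a sum over all channels and all filter positions, so the total work is N x M x C x E x E x R x R multiply-accumulates. That count is what accelerator designs try to parallelize and reorder.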

Pooling Layer
 Reduces size of the feature map
o Max pooling, Average pooling, …
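Max pooling example: each 2x2 window of the 4x4 input is replaced by its maximum
Input feature map:
31  7 44 33
65 35 40 46
46 29 32 30
24 49  8 64
Output feature map:
65 46
49 64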
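Pooling can be sketched as a window reduction. The function names, the 2x2 window with stride 2, and the small input below are assumptions for illustration; the slides do not fix a window size.

```python
def pool(fmap, reduce_fn, size=2, stride=2):
    """Apply reduce_fn to each size x size window of a 2D feature map."""
    out_h = (len(fmap) - size) // stride + 1
    out_w = (len(fmap[0]) - size) // stride + 1
    return [
        [
            reduce_fn([fmap[y * stride + j][x * stride + i]
                       for j in range(size) for i in range(size)])
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]

def max_pool(fmap, size=2, stride=2):
    """Keep the largest value in each window."""
    return pool(fmap, max, size, stride)

def avg_pool(fmap, size=2, stride=2):
    """Keep the mean of each window."""
    return pool(fmap, lambda window: sum(window) / len(window), size, stride)

# Hypothetical 4x4 feature map.
fmap = [[1, 3, 2, 4],
        [5, 6, 1, 2],
        [7, 2, 9, 0],
        [1, 8, 3, 4]]

pooled_max = max_pool(fmap)  # one value per 2x2 window
pooled_avg = avg_pool(fmap)
```

With a 2x2 window and stride 2, each dimension is halved, so the next layer processes a quarter as many values.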

Real Convolutional Neural Network
-- AlexNet
Alex Krizhevsky et al., “ImageNet Classification with Deep Convolutional Neural Networks,” NIPS, 2012
Layer 1: 96 kernels of 11x11x3; layer 2: 256 kernels of 5x5x48; layer 3: 384 kernels of 3x3x256; …
Simplified intuition: higher-order information in later layers

Real Convolutional Neural Network
-- VGG 16
Heuritech blog (https://blog.heuritech.com/2016/02/29/a-brief-report-of-the-heuritech-deep-learning-meetup-5/)
Contains 138 million weights and requires
15.5G MACs to process one 224 × 224 input image
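Both figures can be reproduced with a short calculation. The layer table below is not on the slide: it is the standard VGG-16 configuration (thirteen 3x3 convolutions with padding 1, five 2x2 max-pooling steps, three fully connected layers), entered here as an assumption, and the function name is illustrative. Biases are included in the parameter count; pooling and activation operations are not counted as MACs.

```python
# Assumed VGG-16 layout: integers are output channels of 3x3 convolutions
# (padding 1, stride 1); "M" marks a 2x2 max pooling that halves each side.
VGG16_CONV = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"]
VGG16_FC = [4096, 4096, 1000]

def count_vgg16(input_size=224, in_channels=3):
    """Return (parameters, multiply-accumulates) for one input image."""
    params = 0
    macs = 0
    size, channels = input_size, in_channels
    for layer in VGG16_CONV:
        if layer == "M":
            size //= 2  # pooling has no weights
            continue
        weights = layer * channels * 3 * 3
        params += weights + layer      # weights plus one bias per filter
        macs += weights * size * size  # every filter is applied at every position
        channels = layer
    features = channels * size * size  # flattened input to the first FC layer
    for units in VGG16_FC:
        params += features * units + units
        macs += features * units
        features = units
    return params, macs

vgg_params, vgg_macs = count_vgg16()
params_millions = vgg_params / 1e6
macs_giga = vgg_macs / 1e9
```

The split is lopsided: the fully connected layers hold most of the weights, while the convolution layers account for nearly all of the MACs. That is one reason the two layer types stress memory and compute differently.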

There are Many, Many Neural Networks
 GoogLeNet, ResNet, YOLO, …
o Share common building blocks, but look drastically different
GoogLeNet (ImageNet 2014 winner)
ResNet (ImageNet 2015 winner)
