Feedforward neural networks (multilayer perceptrons)
YONG Sopheaktra
M1, Yoshikawa-Ma Laboratory
Kyoto University
2015/07/26
Content
• Artificial Neural Network
• Perceptron Algorithm
• Multi-layer perceptron (MLP)
• Overfitting & Regularization
Artificial Neural Network
• An Artificial Neural Network (ANN) is a system based on the biological neural network (the brain).
▫ The brain has approximately 100 billion neurons, which communicate through electro-chemical signals
▫ Each neuron receives thousands of connections (signals)
▫ If the resulting sum of signals surpasses a certain threshold, a response is sent
• The ANN attempts to recreate a computational mirror of the biological neural network …
What is a Perceptron?
• A perceptron models a neuron
• It receives n inputs (a feature vector)
• It computes a weighted sum of those inputs and then outputs the result of an activation function
• It is used for linear (binary) classification
Perceptron
• The perceptron consists of weights, the summation processor, and an
activation function
• A perceptron takes a weighted sum of inputs and outputs:
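y = f( Σᵢ wᵢ·xᵢ + b )

where the xᵢ are the inputs, the wᵢ the weights, b the bias, and f the activation function.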
Weight & Bias
• Bias can also be treated as another input (a constant input of 1 with its own weight)
▫ The bias allows the decision boundary to shift away from the origin
• The weights determine the slope of the decision boundary
Transfer or Activation Functions
• The transfer function translates the input signals into output signals
• It uses a threshold to produce an output
• Some examples (sketched in the code below) are:
▫ Unit step (threshold)
▫ Sigmoid (logistic regression)
▫ Piecewise linear
▫ Gaussian
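A minimal Java sketch of these four functions; the threshold theta, the clamping bounds, and the Gaussian center/width are illustrative parameters, not values taken from the slides:

// Common activation functions for a single neuron.
public final class Activations {
    // Unit step: outputs 1 if the net input reaches the threshold theta.
    static double unitStep(double net, double theta) {
        return net >= theta ? 1.0 : 0.0;
    }

    // Sigmoid (logistic): squashes any real input into (0, 1).
    static double sigmoid(double net) {
        return 1.0 / (1.0 + Math.exp(-net));
    }

    // Piecewise linear: proportional to the input, clamped to [0, 1].
    static double piecewiseLinear(double net) {
        return Math.max(0.0, Math.min(1.0, net));
    }

    // Gaussian: bell-shaped, peaks when the input equals the center c.
    static double gaussian(double net, double c, double sigma) {
        double d = net - c;
        return Math.exp(-(d * d) / (2.0 * sigma * sigma));
    }
}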
Unit Step (Threshold)
• The output is set to one of two values depending on whether the total input is greater or less than some threshold value
Piecewise Linear
• The output is proportional to the total weighted input, saturating at its minimum and maximum values outside that range
Sigmoid Function
• It is used when the output is expected to be a positive number
▫ It generates outputs between 0 and 1: σ(net) = 1 / (1 + e^(−net))
Gaussian
• Gaussian functions are continuous, bell-shaped curves
• They are used in radial basis function ANNs (RBF kernel – Chapter 14)
▫ The output is a real value
The Learning Rate
• The weights and bias are updated to reduce the error
• The learning rate helps us control how much we change the weights and bias at each update
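In the standard perceptron learning rule (shown in full on the next slide), the learning rate η scales every update:

wᵢ ← wᵢ + η · (actual − predicted) · xᵢ
b ← b + η · (actual − predicted)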
How does the algorithm work?
• Initialize the weights (to zero or small random values)
• Pick a learning rate (between 0 and 1)
• For each training example:
▫ Compute the activation output
▫ Adjust:
 Error = difference between predicted and actual output
 Update the bias and weights
• Repeat until the error is very small or zero
• If the data are linearly separable, the algorithm will find a solution (a minimal sketch follows)
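A minimal Java sketch of this loop, assuming a unit-step activation; the method name and signature are illustrative and not taken from the linked Perceptron.java:

// A minimal perceptron trainer with a unit-step activation.
// X: training inputs, y: target labels (0 or 1), eta: learning rate in (0, 1].
// Returns the learned weights, with the bias stored in the last slot.
static double[] trainPerceptron(double[][] X, int[] y, double eta, int maxEpochs) {
    int n = X[0].length;
    double[] w = new double[n + 1]; // w[0..n-1]: weights, w[n]: bias (all zero)
    for (int epoch = 0; epoch < maxEpochs; epoch++) {
        int mistakes = 0;
        for (int i = 0; i < X.length; i++) {
            double net = w[n]; // start from the bias
            for (int j = 0; j < n; j++) net += w[j] * X[i][j];
            int predicted = net >= 0 ? 1 : 0; // unit-step activation
            int error = y[i] - predicted;     // actual minus predicted
            if (error != 0) {
                mistakes++;
                for (int j = 0; j < n; j++) w[j] += eta * error * X[i][j];
                w[n] += eta * error; // bias updated like a weight on a constant input of 1
            }
        }
        if (mistakes == 0) break; // every example classified correctly: solution found
    }
    return w;
}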
Demo: Perceptron.zip/Perceptron.java (https://github.com/nsadawi/perceptron)
What if the data is non-linearly separable?
• A single-layer perceptron (SLP) is a linear classifier, so if the data are not linearly separable the learning process will never find a solution
• For example: the XOR problem (truth table below)
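The XOR truth table shows why no single line can separate the two classes:

x₁  x₂ | XOR
 0   0 |  0
 0   1 |  1
 1   0 |  1
 1   1 |  0

The points (0,0) and (1,1) fall in one class and (0,1) and (1,0) in the other, so any separating boundary must be non-linear.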
Demo: Perceptron.zip/Perc.java
XOR Classification (Xor_classification.zip)
Multi-layer perceptron (MLP)
• A series of logistic regression models stacked on top of each other, with the final layer being either another logistic regression model or a linear regression model, depending on whether we are solving a classification or a regression problem
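A minimal Java sketch of a forward pass through one hidden layer, assuming sigmoid units in both layers (a classification setting); all parameter names are illustrative:

// Forward pass through an MLP with one hidden layer of sigmoid units.
// W1[h][j]: input-to-hidden weights, b1[h]: hidden biases.
// w2[h]: hidden-to-output weights, b2: output bias.
static double forward(double[] x, double[][] W1, double[] b1,
                      double[] w2, double b2) {
    int hidden = W1.length;
    double[] h = new double[hidden];
    // Hidden layer: each unit is a small logistic regression over the inputs.
    for (int i = 0; i < hidden; i++) {
        double net = b1[i];
        for (int j = 0; j < x.length; j++) net += W1[i][j] * x[j];
        h[i] = 1.0 / (1.0 + Math.exp(-net)); // sigmoid
    }
    // Output layer: a logistic regression over the hidden activations.
    double net = b2;
    for (int i = 0; i < hidden; i++) net += w2[i] * h[i];
    return 1.0 / (1.0 + Math.exp(-net));
}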
A closer look
The Back-Propagation Algorithm
• Use the output error to adjust the weights of the inputs at the output layer
• Calculate the error at the previous layer and use it to adjust that layer's weights
• Repeat this process, back-propagating the errors through any number of layers
• The mathematical details of minimizing the neural network's cost function can be found in Section 16.5.4, The backpropagation algorithm
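For the one-hidden-layer sigmoid network sketched earlier, with squared error E = ½(ŷ − y)², the standard updates work out to (a worked summary under those assumptions, not taken from the slides):

δ_out = (ŷ − y) · ŷ · (1 − ŷ)            (error at the output unit)
δ_h = h_h · (1 − h_h) · w2_h · δ_out      (error at hidden unit h)
w2_h ← w2_h − η · δ_out · h_h             (output-layer weight update)
W1_hj ← W1_hj − η · δ_h · x_j             (hidden-layer weight update)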
Convolutional neural networks
http://yann.lecun.com/exdb/lenet/index.html
• Designed to recognize visual patterns directly from pixel images with minimal preprocessing
• Multiple hidden units are used to learn non-linear combinations of the original inputs (feature extraction)
▫ Each individual pixel in an image is not very informative
▫ But the combination of pixels will tell
Multiple-Classifier
Demo: Machine-learning-ex3.zip
Overfitting Problem
Cross-validation error
How to address it?
• Simplify the parameters/features
▫ Remove some unnecessary features
• Regularization (see the sketch below)
▫ Adjust (penalize) the weights
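A common form of regularization adds a penalty on the weight magnitudes to the cost function; a standard L2 (weight-decay) version, with λ an assumed penalty strength:

J_reg(w) = J(w) + λ · Σᵢ wᵢ²

Larger λ pushes the weights toward zero, which smooths the decision boundary and reduces overfitting.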
Regularization
• The MLP can overfit, especially if the number of nodes is large
• A simple way to prevent this is early stopping (sketched after this list)
▫ Stop the training procedure when the error on the validation set first starts to increase
• Other techniques are:
▫ Consistent Gaussian prior
▫ Weight pruning: make the parameter values smaller
▫ Soft weight sharing: groups of parameters are encouraged to take similar values
▫ Semi-supervised embedding: used with deep-learning NNs
▫ Bayesian inference
 Determines the number of hidden units – faster than cross-validation
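A minimal Java sketch of early stopping; trainOneEpoch and validationError are hypothetical hooks supplied by the caller, not part of any library named in the slides:

// Early stopping: train until the validation error starts to rise.
static int earlyStopping(Runnable trainOneEpoch,
                         java.util.function.DoubleSupplier validationError,
                         int maxEpochs) {
    double best = Double.POSITIVE_INFINITY;
    int bestEpoch = 0;
    for (int epoch = 1; epoch <= maxEpochs; epoch++) {
        trainOneEpoch.run();                      // one pass over the training set
        double err = validationError.getAsDouble();
        if (err < best) {                         // validation error still improving
            best = err;
            bestEpoch = epoch;
        } else {
            break;                                // error first starts to increase: stop
        }
    }
    return bestEpoch; // the epoch whose weights should be kept
}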
Thank You

References
• https://www.coursera.org/learn/machine-learning
• https://www.youtube.com/playlist?list=PLea0WJq13cnCS4LLMeUuZmTxqsqlhwUoe
• http://yann.lecun.com/exdb/lenet/index.html

Editor's Notes
• #5: Dendrite = input; Axon = output