On Implementation of a Neural Network (Back-propagation)
Yu Liu
National Institute of Informatics
Nov 9, 2010
Yu Liu MapReduce For Machine Learning
Outline
1 Motivations
2 Brief introduction of background
The Neural Network
The Back-propagation Algorithm
The Problems of Back-propagation
3 Implementation using C++ STL, Sketo Lib and Intel TBB
and boost Mapreduce(next week)
Main Flow of data processing
Analysis of Parallelism
Optimization
4 The Benchmark Results
5 Remaining Problems
Motivation
Do more practice of parallel programming.
Using and comparing different parallel programming Libraries.
Studying the principle of designing a good parallel
programming Library.
MapReduce Programming model
What is MapReduce
The Computation of the MapReduce Framework
Input: a set of key/value pairs.
Output: a set of key/value pairs.
The user provides two functions: Map and Reduce.
Main Concepts of the MapReduce Programming Paradigm
SPLIT: Splitting the input data and iterating over it;
MAP: Computing key/value pairs on each split;
SHUFFLE and SORT: Grouping intermediate values by key;
REDUCE: Iterating over the resulting groups and reducing
each group.
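As a hedged illustration of the four phases above, here is a sequential word-count sketch; the function names (`map_fn`, `reduce_fn`, `word_count`) are mine, not from any particular framework:

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// MAP: emit a (word, 1) pair for every word in one split.
std::vector<std::pair<std::string, int>> map_fn(const std::vector<std::string>& split) {
    std::vector<std::pair<std::string, int>> out;
    for (const auto& w : split) out.push_back({w, 1});
    return out;
}

// REDUCE: fold one group of intermediate values into a single count.
int reduce_fn(const std::vector<int>& group) {
    int sum = 0;
    for (int v : group) sum += v;
    return sum;
}

std::map<std::string, int> word_count(const std::vector<std::vector<std::string>>& splits) {
    // SHUFFLE and SORT: group intermediate values by key (std::map keeps keys sorted).
    std::map<std::string, std::vector<int>> groups;
    for (const auto& split : splits)            // SPLIT: iterate over the splits
        for (const auto& kv : map_fn(split))    // MAP on each split
            groups[kv.first].push_back(kv.second);
    std::map<std::string, int> result;
    for (const auto& g : groups)                // REDUCE each group
        result[g.first] = reduce_fn(g.second);
    return result;
}
```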
MapReduce Programming model
An example of applying MapReduce to machine learning
The paper Map-Reduce for Machine Learning on Multicore gives
us a programming framework model that uses the MapReduce
paradigm for parallel data processing:
The Artificial Neural Network
An Artificial Neural Network (ANN) is an information processing
paradigm that is inspired by the way biological nervous systems,
such as the brain, process information. It is composed of a large
number of highly interconnected processing elements (neurones)
working in unison to solve specific problems.
R.Rojas: Neural Networks. Springer-Verlag, Berlin, 1996
The Artificial Neural Network
A simple MapReduce example
The neural network can be trained to recognise some patterns:
The Back-Propagation Algorithm
Training the NN
In order to train a neural network, we must adjust the weights of
each unit so that the error between the desired output and the
actual output is reduced.
The back-propagation algorithm:
http://galaxy.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html
The Back-Propagation Algorithm
Back-Propagation concepts
1 Propagates inputs forward in the usual way, i.e.
All outputs are computed using a sigmoid threshold of the inner
product of the corresponding weight and input vectors.
All outputs at stage n are connected to all the inputs at stage
n+1.
2 Propagates the errors backwards by apportioning them to
each unit according to the amount of the error the unit is
responsible for.
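The forward step described above can be written as a minimal sketch; the function names here are mine, for illustration only:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Standard logistic sigmoid used as the threshold function.
double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// One unit's output: the sigmoid of the inner product of its
// weight vector and the input vector, as described on the slide.
double unit_output(const std::vector<double>& w, const std::vector<double>& in) {
    double net = 0.0;
    for (std::size_t i = 0; i < w.size(); ++i) net += w[i] * in[i];
    return sigmoid(net);
}
```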
The Back-Propagation Algorithm
Back-Propagation process:
3 versions of implementations
Sequential, Sketo, and TBB
A neural network (NN) C++ class was implemented:
An instance of the neural network can be created by giving
arguments for the number of inputs, layers, number of neurons ...
Given an input pattern, it produces a (set of) output signal(s).
It has a B-P method to update all neurons' weights.
It has other methods for all kinds of operations, e.g. put
weights, get weights.
All the operations of the neural network are sequential.
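The interface described above might look like the following sketch; all member and class names here are my guesses for illustration, not taken from the actual project code:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical interface of the sequential NN class described above.
class NeuralNetwork {
public:
    NeuralNetwork(int n_inputs, int n_layers, int n_neurons_per_layer)
        : inputs_(n_inputs), layers_(n_layers), neurons_(n_neurons_per_layer),
          weights_(static_cast<std::size_t>(n_layers) * n_neurons_per_layer, 0.0) {}

    // Feed one input pattern forward and return the output signals.
    std::vector<double> run(const std::vector<double>& pattern) const;

    // One back-propagation step against the desired output.
    void backpropagate(const std::vector<double>& pattern,
                       const std::vector<double>& desired);

    // Accessors the map/reduce driver would use to exchange weights.
    std::vector<double> get_weights() const { return weights_; }
    void put_weights(const std::vector<double>& w) { weights_ = w; }

private:
    int inputs_, layers_, neurons_;
    std::vector<double> weights_;
};
```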
3 versions of implementations
Sequential, Sketo, and TBB
All three versions are implemented with the same architecture.
The Training stage:
the training data and an instance of the NN (BP algorithm)
are the inputs of a map function;
the output of the map function is a set of new weights;
the input of the reduce function is the output of the map function;
the output of the reduce function is the average of these new
weights (here I simply average all the weights).
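The reduce step just described, averaging the per-split weight vectors, can be sketched as follows; the names `Weights` and `average_weights` are hypothetical:

```cpp
#include <cstddef>
#include <vector>

using Weights = std::vector<double>;

// REDUCE: element-wise average of the weight vectors produced by
// the map function on each data split (all vectors have equal length).
Weights average_weights(const std::vector<Weights>& per_split) {
    Weights avg(per_split.front().size(), 0.0);
    for (const auto& w : per_split)
        for (std::size_t i = 0; i < w.size(); ++i) avg[i] += w[i];
    for (double& v : avg) v /= per_split.size();
    return avg;
}
```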
3 versions of implementations
Sequential, Sketo, and TBB
When training has finished, this Neural Network can be used to
recognize the unknown data:
the inputs of the map function are unknown patterns and the
Neural Network algorithm;
the output of the map function is a set of signals which denote
what the input data are;
no reduce processing is needed.
Sequential implementation(STL)
Training stage
The MAP and REDUCE functions:
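The code listing for this slide did not survive extraction. As a hedged sketch, the sequential MAP and REDUCE can be expressed with `std::transform` and `std::accumulate`; the function names below are my own, not the project's:

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// MAP: apply a per-element function to every split's value.
std::vector<double> seq_map(const std::vector<double>& splits,
                            double (*f)(double)) {
    std::vector<double> out(splits.size());
    std::transform(splits.begin(), splits.end(), out.begin(), f);
    return out;
}

// REDUCE: fold the mapped values into their average.
double seq_reduce(const std::vector<double>& mapped) {
    return std::accumulate(mapped.begin(), mapped.end(), 0.0) / mapped.size();
}
```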
Sequential implementation(STL)
Training stage
The function object used by MAP:
Implementation using Sketo Lib
Training stage
Forward and backward propagation: basic ideas
1 Each computing node has an instance of the Neural Network
with the same initial weights;
2 this Neural Network is an extended Neural Network (each
layer has a "1" input);
3 let each computing node calculate the same amount of training
samples;
4 run a fixed number of training steps;
5 sum up each node's weights and take the average;
6 update each NN with these averaged weights, and calculate the
total error;
7 repeat 4-6 until the total error is less than a given value.
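Steps 4-7 above form the outer training loop. A minimal structural sketch, where `train_fixed_steps` is a hypothetical stub standing in for the real train-average-measure round:

```cpp
// Stub for steps 4-6: run fixed training steps, average weights across
// nodes, and return the new total error (here it just halves the error
// so the loop structure is visible).
double train_fixed_steps(double error) { return error * 0.5; }

// Step 7: repeat the round until the total error drops below a threshold.
int train_until_converged(double error, double threshold) {
    int rounds = 0;
    while (error >= threshold) {
        error = train_fixed_steps(error);
        ++rounds;
    }
    return rounds;
}
```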
Implementation using STL, Sketo, TBB
Well-Trained stage
Then the system can be used to analyse the data's patterns:
1 Map/split the input data to each computing node/core;
2 Reduce is not needed.
This parallelizes very well: the input data are all
independent.
For the Sketo and TBB implementations, the parallelism of this
stage is P if there are P processors (cores).
Implementation using STL, Sketo, TBB
TBB
I use tbb::parallel_for and tbb::parallel_reduce to implement the
MAP and REDUCE.
1 The TBB version looks a little more complex than Sketo, but
is also very easy to use.
2 TBB also provides a lot of useful tools, such as concurrent
containers, task management ... .
Implementation using STL, Sketo, TBB
The source code is on Google Code:
http://skyto-neurl-network.googlecode.com/svn/trunk/skyto_NN
Performance Test
On multi-core machine
Test the performance on an 8-core workstation
(weaker than “kraken”, but I have a GPU :D)
8-core Xeon E5620, 8 GB RAM, 2.4 GHz; Training stage:
1 million input patterns.
Neural network: 4 inputs, 4 hidden layers, each hidden layer
has 4 neurons, 4 outputs.
TBB version 3.0, GCC 4.4.5, Ubuntu 10.04 LTS 64-bit.
Performance Test
On multi-core machine
The STL version vs the Sketo version (1 core)
Training: 6.5992 s vs 7.1128 s
Recognizing data: 1.4782 s vs 1.4762 s
Performance Test
On multi-core machine
The test result of Sketo version
Performance Test
On multi-core machine
The test result of Sketo version - speedup
Performance Test
On multi-core machine
The comparison of Sketo version and TBB version (8 cores)
Training:
Sketo: 1.0110 s , TBB: 1.3924 s
Using to recognize data
Sketo: 0.1925 s , TBB: –
Remaining Problems
The implementation is not complete yet (Boost version). Some
details are not resolved:
some B-P algorithm problems, such as the "local minima
problem";
not tested with very big data;
the size of the neural network is hard to decide (I still lack
knowledge of NNs).
The boost Mapreduce library
The Boost.MapReduce library is a MapReduce implementation
across a plurality of CPU cores rather than machines. The library
is implemented as a set of C++ class templates, and is a
header-only library.
It provides map, reduce, combine, and lots of utilities;
it is based on Boost.FileSystem and Boost.Thread; like Hadoop,
it uses files as I/O media.
It is not yet part of the Boost Library and is still under
development and review.
