Artificial neural network
Genesis of ANN
Neural network (artificial neural network) - the common name for
mathematical structures, and their software or hardware models, that
perform computations or signal processing through arrays of elements,
called artificial neurons, each performing a basic operation on its
input. The original structure was inspired by the natural structure of
neurons and neural systems, particularly the brain.
The neural network is a type of computer system architecture in which
data are processed by neurons arranged in layers. Correct results
are obtained through the learning process, which involves modifying the
weights of those neurons that are responsible for the error.
The Definition of ANN
Where are neural networks being used?
• Signal processing: suppressing line noise, adaptive echo cancellation, blind source
separation
• Control: backing up a truck: cab position, rear position, and match with the dock get
converted to steering instructions. Manufacturing plants for controlling automated
machines.
• Siemens successfully uses neural networks for process automation in basic industries,
e.g., in rolling mill control more than 100 neural networks do their job, 24 hours a day
• Robotics - navigation, vision recognition
• Pattern recognition, e.g. recognizing handwritten characters; the current version of
Apple's Newton uses a neural net
• Medicine, storing medical records based on case information
• Speech production: reading text aloud (NETtalk)
• Vision: face recognition, edge detection, visual search engines
• Business, rules for mortgage decisions are extracted from past decisions made by
experienced evaluators, resulting in a network that has a high level of agreement with
human experts.
• Financial Applications: time series analysis, stock market prediction
• Data Compression: speech signal, image, e.g. faces
• Game Playing: chess, go, ...
The history of ANN
• 1943 - McCulloch and Pitts introduced the first neural network
computing model.
• 1950's - Rosenblatt's work resulted in a two-layer network, the
perceptron, which was capable of learning certain classifications by
adjusting connection weights. Although the perceptron was
successful in classifying certain patterns, it had a number of
limitations. The perceptron was not able to solve the classic XOR
(exclusive or) problem. Such limitations led to the decline of the
field of neural networks. However, the perceptron had laid
foundations for later work in neural computing.
• early 1980's – researchers showed renewed interest in neural
networks. Recent work includes Boltzmann machines, Hopfield
nets, competitive learning models, multilayer networks, and
adaptive resonance theory models.
Neural networks versus conventional computers
• Neural networks take a different approach to problem solving than that of
conventional computers. Conventional computers use an algorithmic approach
i.e. the computer follows a set of instructions in order to solve a problem.
• A conventional computer can solve only those problems for which the specific
steps it needs to follow are known.
• Neural networks process information in a similar way the human brain does.
Neural networks learn by example. They cannot be programmed to perform a
specific task. The examples must be selected carefully, otherwise useful time is
wasted or, even worse, the network might function incorrectly. The
disadvantage is that because the network finds out how to solve the problem
by itself, its operation can be unpredictable.
• Conventional computers use a cognitive approach to problem solving; the way
the problem is to be solved must be known, then converted into a high-level
language program and into machine code that the computer can understand.
These machines are totally predictable; if anything goes wrong, it is due to a
software or hardware fault.
• Neural networks do not perform miracles. But if used sensibly they can produce
some amazing results.
Neural networks in medicine
• Artificial Neural Networks (ANN) are currently a 'hot' research area in
medicine and it is believed that they will receive extensive application to
biomedical systems in the next few years. At the moment, the research is
mostly on modelling parts of the human body and recognising diseases from
various scans (e.g. cardiograms, CAT scans, ultrasonic scans, etc.).
• Neural networks are ideal in recognising diseases using scans since there is no
need to provide a specific algorithm on how to identify the disease. Neural
networks learn by example so the details of how to recognise the disease are
not needed. What is needed is a set of examples that are representative of all
the variations of the disease. The quantity of examples is not as important as
their quality. The examples need to be selected very carefully if the system is
to perform reliably and efficiently.
Biologically Inspired
• Electro-chemical signals
• Threshold output firing
(Figure: a biological neuron – dendrites, cell body, axon, and terminal branches of the axon)
The Perceptron
• Binary classifier functions
• Threshold activation function
(Figure: perceptron schematic – inputs x1, x2, x3, …, xn with weights w1, w2, w3, …, wn feeding a summation unit Σ)
The Perceptron: Threshold Activation
Function
• Binary classifier functions
• Threshold activation function
Step Threshold
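A minimal sketch of such a perceptron in Python (the weights and threshold below are illustrative values, not taken from the text):

```python
# Perceptron: a binary classifier with a step (threshold) activation function.

def perceptron(inputs, weights, threshold):
    """Fire (output 1) when the weighted sum of inputs reaches the threshold."""
    s = sum(w * x for w, x in zip(weights, inputs))
    return 1 if s >= threshold else 0

# Example: a 2-input perceptron implementing logical AND.
and_weights = [1.0, 1.0]
and_threshold = 1.5
print(perceptron([1, 1], and_weights, and_threshold))  # 1
print(perceptron([1, 0], and_weights, and_threshold))  # 0
```

Note that no single unit of this kind can compute XOR, which is exactly the limitation mentioned in the history section.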
Linear Activation functions
• Output is scaled sum of inputs
y = u = Σ_{n=1}^{N} w_n x_n    (linear)
Nonlinear Activation Functions
• Sigmoid Neuron unit function
y(u) = 1 / (1 + e^(−u))    (sigmoid)
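Both activation functions can be written directly in code; a short sketch:

```python
import math

def linear(u):
    # Linear activation: the output is just the (scaled) weighted sum itself.
    return u

def sigmoid(u):
    # Sigmoid activation: squashes any real net input into the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-u))

print(sigmoid(0.0))    # 0.5
print(sigmoid(10.0))   # close to 1
print(sigmoid(-10.0))  # close to 0
```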
• The ability to learn is a fundamental trait of intelligence.
• Although a precise definition of learning is difficult to formulate, a
learning process in the ANN context can be viewed as the problem
of updating network architecture and connection weights so that a
network can efficiently perform a specific task.
• The network usually must learn the connection weights from
available training patterns.
• Performance is improved over time by iteratively updating the
weights in the network.
• ANNs' ability to automatically learn from examples makes them
attractive and exciting.
• Instead of following a set of rules specified by human experts, ANNs
appear to learn underlying rules (like input-output relationships)
from the given collection of representative examples. This is one of
the major advantages of neural networks over traditional expert
systems.
Learning – what does it mean exactly?
• Learning is essential to most of neural network architectures.
• Choice of a learning algorithm is a central issue in network
development.
• What is really meant by saying that a processing element learns?
Learning implies that a processing unit is capable of changing its
input/output behavior as a result of changes in the environment.
Since the activation rule is usually fixed when the network is
constructed and since the input/output vector cannot be changed,
to change the input/output behavior the weights corresponding to
that input vector need to be adjusted. A method is thus needed by
which, at least during a training stage, weights can be modified in
response to the input/output process.
• In a neural network, learning can be supervised, in which the
network is provided with the correct answer for the output during
training, or unsupervised, in which no external teacher is present.
During the learning process…
• At each training step the network computes the direction in
which each bias and link value can be changed to calculate a
more correct output.
• The rate of improvement at that solution state is also known.
A learning rate is user-designated in order to determine how
much the link weights and node biases can be modified based
on the change direction and change rate.
• The higher the learning rate (max. 1.0), the faster the
network is trained.
• However, with a high rate the network is more likely to be
trained to a local minimum solution. A local minimum is a point at
which the network stabilizes on a solution that is not the
optimal global solution.
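The effect of the learning rate can be sketched on a one-dimensional error surface (an illustrative example, not from the text): each training step moves a weight against the gradient, scaled by the learning rate.

```python
# Gradient descent on the error surface E(w) = (w - 3)^2,
# whose single minimum is at w = 3.

def gradient(w):
    # Derivative dE/dw of E(w) = (w - 3)^2.
    return 2 * (w - 3)

def descend(w, learning_rate, steps):
    # Repeatedly move w against the gradient, scaled by the learning rate.
    for _ in range(steps):
        w = w - learning_rate * gradient(w)
    return w

print(round(descend(0.0, 0.1, 50), 4))  # 3.0 (converges to the minimum)
```

On surfaces with several minima the same procedure can stabilize in a local minimum, as described above.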
Learning rules
There are four basic types of learning rules:
• error correction,
• Boltzmann,
• Hebbian,
• and competitive learning.
Parameters for the quality of prediction
• Hidden layers: Both the number of hidden layers and the number of nodes in
each hidden layer can influence the quality of the results. For example, too
few layers and/or nodes may not be adequate to sufficiently learn and too
many may result in overtraining the network.
• Number of cycles: A cycle is one pass in which a training example is presented
and the weights are adjusted.
• The number of examples that get presented to the neural network during the
learning process can be set. The number of cycles should be set to ensure that
the neural network does not overtrain. The number of cycles is often referred
to as the number of epochs.
• Learning rate: Prior to building a neural network, the learning rate should be
set and this influences how fast the neural network learns.
Neural Network topologies
• In the previous section we discussed the properties of the basic processing unit
in an artificial neural network. This section focuses on the pattern of
connections between the units and the propagation of data. As for this pattern
of connections, the main distinction we can make is between:
• Feed-forward neural networks, where the data flow from input to output units
is strictly feedforward. The data processing can extend over multiple (layers of)
units, but no feedback connections are present, that is, connections extending
from outputs of units to inputs of units in the same layer or previous layers.
• Recurrent neural networks that do contain feedback connections. Contrary to
feed-forward networks, the dynamical properties of the network are important.
In some cases, the activation values of the units undergo a relaxation process
such that the neural network will evolve to a stable state in which these
activations do not change anymore. In other applications, the change of the
activation values of the output neurons are significant, such that the dynamical
behaviour constitutes the output of the neural network (Pearlmutter, 1990).
• Classical examples of feed-forward neural networks are the Perceptron and
Adaline. Examples of recurrent networks have been presented by Anderson
(Anderson, 1977), Kohonen (Kohonen, 1977), and Hopfield (Hopfield, 1982) .
• Volume: 1400 cm^3
• Area: 2000 cm^2
• Weight: 1.5 kg
• The hemispheres of the cerebral cortex contain about 10^10 neurons
• The number of connections between cells: 10^15
• The cells send and receive signals; speed of operation: 10^18 operations / sec
• The neural network is a simplified model of the brain!
• Fault-tolerant;
• Flexible – easily adapts to a changing environment;
• Learns – does not have to be programmed;
• Can deal with fuzzy, random, noisy, or inconsistent information;
• Highly parallel;
• Small, with very low power consumption.
Neurons and Synapses
The basic computational unit in the nervous system is the nerve cell, or
neuron. A neuron has:
1. Dendrites (inputs)
2. Cell body
3. Axon (output)
A neuron receives input from other neurons (typically many thousands).
Inputs sum (approximately). Once input exceeds a critical level, the
neuron discharges a spike - an electrical pulse that travels from the body,
down the axon, to the next neuron(s) (or other receptors). This spiking
event is also called depolarization, and is followed by a refractory period,
during which the neuron is unable to fire.
The axon endings (Output Zone) almost touch the dendrites or cell body of the next
neuron. Transmission of an electrical signal from one neuron to the next is effected by
neurotransmitters, chemicals which are released from the first neuron and which bind to
receptors in the second. This link is called a synapse. The extent to which the signal from
one neuron is passed on to the next depends on many factors, e.g. the amount of
neurotransmitter available, the number and arrangement of receptors, the amount of
neurotransmitter reabsorbed, etc.
A Simple Artificial Neuron
Basic computational element (model neuron) is often called a node or unit.
It receives input from some other units, or perhaps from an external
source.
Each input has an associated weight w, which can be modified so as to
model synaptic learning. The unit computes some function f of the
weighted sum of its inputs.
Its output, in turn, can serve as input to other units.
• The weighted sum is called the net input to unit i, often written neti.
Note that wij refers to the weight from unit j to unit i (not the other way
around).
• The function f is the unit's activation function.
• In the simplest case, f is the identity function, and the unit's output is
just its net input. This is called a linear unit.
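The unit described above can be sketched in a few lines (the names are illustrative):

```python
# Model neuron: the net input is the weighted sum of the inputs,
# and the activation function f maps the net input to the output.

def unit_output(inputs, weights, f=lambda net: net):
    net = sum(w * x for w, x in zip(weights, inputs))
    return f(net)

# With f as the identity function this is the linear unit from the text:
print(unit_output([1.0, 2.0], [0.5, 0.25]))  # 0.5*1.0 + 0.25*2.0 = 1.0
```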
Features of an intelligent system
The ability to learn from examples and to generalize the acquired knowledge to solve
problems posed in a new context:
• the ability to create rules (associations) binding together the separate elements
of the system (object);
• the ability to recognize objects (image features) on the basis of incomplete
information.
Data classification is one of the main tasks performed using neural networks.
What is it about?
The purpose of classification is to assign an object, based on its
characteristics, to a certain category.
Data classification
Where do we use the ANN?
• NO:
for calculations, multiplication tables, word processing, etc. – applications
where a well-known algorithm can easily be used.
• YES:
where an algorithmic procedure is very difficult to devise, where data are
incomplete or inaccurate, where the studied phenomena are non-linear, and
where there is a lot of data but the methods behind the results are not yet
known.
Artificial Neuron schema:
The inputs are fed signals from the neurons of the network's input layer or of
the previous layer. Each signal is multiplied by a corresponding
numerical value called a weight. The weight affects how the input
signal is perceived and what part it plays in forming the neuron's output.
A weight can be excitatory (positive) or inhibitory (negative);
if there is no connection between two neurons, the weight is zero. The summed
products of signals and weights form the argument of the neuron's
activation function.
A simplified model of a neuron, showing its similarity to the natural model
Formula that describes how the neuron works:

y = f(s), where s = Σ_{i=0}^{n} x_i w_i
The principal aim is to approximate a given function (in other words: to learn the desired
function by observing examples of its operation).
Function approximation
Number of hidden layers: zero, one, or more than one.
Prediction
Input: X1, X2, X3; Output: Y; Model: Y = f(X1, X2, X3)
A worked forward pass for inputs X1 = 1, X2 = −1, X3 = 2, with two hidden units (A, B)
and one output unit:
• Hidden unit A: net input 0.2 = 0.5 · 1 − 0.1 · (−1) − 0.2 · 2, so output f(0.2) = 0.55
• Hidden unit B: net input 0.9, so output f(0.9) = 0.71
• Output unit: net input −0.087 (the weighted sum of 0.55 and 0.71), so output
f(−0.087) = 0.478
Prediction: Y = 0.478
If the actual value is Y = 2, then the prediction error = (2 − 0.478) = 1.522
The activation function is f(x) = e^x / (1 + e^x), e.g. f(0.2) = e^0.2 / (1 + e^0.2) = 0.55
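The forward pass of this example can be reproduced in code. Only the weights into hidden unit A (0.5, −0.1, −0.2) are given explicitly; the weights into hidden unit B and into the output unit below are assumptions chosen to reproduce the net inputs 0.9 and −0.087 quoted in the example.

```python
import math

def f(x):
    # Activation from the example: f(x) = e^x / (1 + e^x), the logistic sigmoid.
    return math.exp(x) / (1 + math.exp(x))

x = [1, -1, 2]              # inputs X1, X2, X3
w_a = [0.5, -0.1, -0.2]     # weights into hidden unit A (from the example)
w_b = [0.6, -0.1, 0.1]      # assumed weights into hidden unit B (net input 0.9)
w_out = [0.1, -0.2]         # assumed weights into the output unit

net_a = sum(w * xi for w, xi in zip(w_a, x))      # 0.2
net_b = sum(w * xi for w, xi in zip(w_b, x))      # 0.9
h = [f(net_a), f(net_b)]                          # about [0.55, 0.71]
net_out = sum(w * hi for w, hi in zip(w_out, h))  # about -0.087
y = f(net_out)

print(round(y, 3))  # 0.478, matching the prediction in the example
```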
Backpropagation
• One of the most popular techniques in learning processes for ANN.
Learning process:
1. Randomly choose one of the observations.
2. Go through the appropriate procedures to determine the output value.
3. Compare the desired value with the one actually obtained from the network.
4. Adjust the weights by calculating the error.
How to calculate the prediction error?
Errorᵢ = Actualᵢ − Outputᵢ
where:
• Errorᵢ is the error for the i-th node,
• Outputᵢ is the value predicted by the network,
• Actualᵢ is the real value (which the network should learn).
Change the weights
L is the so-called network learning rate (its values are usually in
[0, 1]). The smaller the value of this coefficient, the slower the
learning process.
Often this ratio is set to its highest value initially and then
reduced as the network weights are re-adjusted.
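A sketch of a single weight change. The exact update formula is not recoverable from the text, so the common delta-rule form, new weight = old weight + L · Error · Input, is used here as an assumption:

```python
# Delta-rule weight update, scaled by the learning rate L (an assumed form).
def update_weight(old_weight, L, error, input_value):
    return old_weight + L * error * input_value

# E.g. with the prediction error 1.522 computed earlier and an input of 1.0:
w = update_weight(0.5, L=0.1, error=1.522, input_value=1.0)
print(round(w, 4))  # 0.6522
```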
Example
Example „step by step”
• 1 hidden layer: D, E
• Input layer: A, B, C
• Output layer: F
Randomly choosing one observation
etc.…
For better understanding…
the backpropagation learning algorithm can be divided into two phases:
propagation and weight update.
Phase 1: Propagation which involves the following steps:
• Forward propagation of a training pattern's input through the neural network
in order to generate the propagation's output activations.
• Backward propagation of the propagation's output activations through the
neural network using the training pattern's target in order to generate the
deltas of all output and hidden neurons.
Phase 2: Weight update
For each weight-synapse:
• Multiply its output delta and input activation to get the gradient of the weight.
• Move the weight in the opposite direction of the gradient by subtracting a fraction of
it from the weight.
• This fraction influences the speed and quality of learning; it is called the learning
rate. The sign of the gradient of a weight indicates the direction in which the error is
increasing, which is why the weight must be updated in the opposite direction.
• Repeat phases 1 and 2 until the performance of the network is good enough.
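The two phases above can be combined into a small, self-contained training sketch. The network shape (3 inputs, 2 sigmoid hidden units, 1 sigmoid output) and the training pattern are illustrative, not from the text.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

random.seed(0)
# Random initial weights for a 3-2-1 network.
w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
w_output = [random.uniform(-1, 1) for _ in range(2)]
rate = 0.5                         # learning rate
x, target = [1.0, -1.0, 2.0], 1.0  # one illustrative training pattern

def forward(x):
    h = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hidden]
    y = sigmoid(sum(w * hi for w, hi in zip(w_output, h)))
    return h, y

for _ in range(200):
    # Phase 1: forward pass, then deltas for the output and hidden units.
    h, y = forward(x)
    delta_out = (target - y) * y * (1 - y)
    delta_hid = [h[j] * (1 - h[j]) * w_output[j] * delta_out for j in range(2)]
    # Phase 2: move each weight along its gradient, scaled by the learning rate.
    for j in range(2):
        w_output[j] += rate * delta_out * h[j]
        for i in range(3):
            w_hidden[j][i] += rate * delta_hid[j] * x[i]

h, y = forward(x)
print(abs(target - y) < 0.2)  # the error on the training pattern has shrunk
```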
The size of an ANN
• Big NNs: a few thousand neurons, or even more.
• The number of neurons should depend on the type of task the
network performs.
• The power of the network depends on the number of neurons,
the density of the connections between them, and on properly
chosen weight values.
How many hidden layers should there be?
• The number of hidden layers is usually not higher than 2. The
hidden layers are where the network's signals are combined.
• The input layer is usually responsible only for the initial
preparation of input data.
• The output layer is responsible for aggregating the final outputs
of the hidden-layer neurons and presenting the final result of the
network at the outputs of its neurons, which are at the same time
the outputs of the whole network.
Advantages of ANN
1. They can work fine in case of incomplete information
2. They do not require knowledge of the algorithm solving the
problem (automatic learning)
3. Process information in a highly parallel way
4. They can generalize (extend to unknown cases)
5. They are resistant to partial damage
6. They can perform associative memory (associative - like working
memory in humans) as opposed to addressable memory (typical
for classical computers)
• Advantages:
• A neural network can perform tasks that a linear program cannot.
• When an element of the neural network fails, it can continue
without any problem thanks to its parallel nature.
• A neural network learns and does not need to be reprogrammed.
• It can be implemented in any application.
• It can be implemented without any problem.
• Disadvantages:
• The neural network needs training to operate.
• The architecture of a neural network is different from the
architecture of microprocessors and therefore needs to be emulated.
• High processing time is required for large neural networks.
Advantages / disadvantages
Neural networks have a number of advantages:
• Linear and nonlinear models: Complex linear and nonlinear
relationships can be derived using neural networks.
• Flexible input/output: Neural networks can operate using one or
more descriptors and/or response variables. They can also be used
with categorical and continuous data.
• Noise: Neural networks are less sensitive to noise than statistical
regression models.
The major drawbacks with neural networks are:
• Black box: It is not possible to explain how the results were
calculated in any meaningful way.
• Optimizing parameters: There are many parameters to be set in a
neural network and optimizing the network can be challenging,
especially to avoid overtraining.
How many hidden layers do we need?
• The number of hidden layers is usually at most 2.
The user should decide how many hidden layers there
will be, and how many neurons in each of them.
• The size of the input layer usually equals the number of
input data (the number of conditional attributes in the
data set).
• The number of neurons in the output layer
depends on the type of classification problem
(regression, or classification into categories).
• The more neurons in the hidden layer, the more
memory the NN needs.
• More neurons can make the classification process
overtrained: too good for the training set but too bad
for new, unknown data.
• If you notice overtraining in your neural network,
you should consider decreasing the number of neurons.
• Regression type of classification: inputs such as area, garage, age, heating,
location, floor, …; the output is the estimated market price (e.g. 317 $).
• Classification into categories: inputs such as income, insurance, age, marital
status, employment, …; the output is a decision about granting the credit or
not (e.g. Yes).
Categorical data is a problem…unless…
• continent: {Asia, Europe, America}
3 neurons are necessary:
• Asia 1 0 0
• Europe 0 1 0
• America 0 0 1
• One variable, „continent”, creates 3 neurons!
• For such cases it is better to consider merging some values into a smaller
number of categories.
• Usually the number of weights should be 10 times smaller than the
number of cases in the training data set.
• The STATISTICA line of software provides a comprehensive
and integrated set of tools and solutions for:
• Data analysis and reporting, data mining and predictive
modeling, business intelligence, simple and multivariate
QC, process monitoring, analytic optimization, simulation,
and for applying a large number of statistical and other
analytic techniques to address routine and advanced data
analysis needs
• Data visualization, graphical data analysis, visual data
mining, visual querying, and simple and advanced scientific
and business graphing; in fact, STATISTICA has been
acknowledged as the “king of data visualization software”
(by the editors of " PC Graphics & Video")
Install Statistica 10 EN
• http://usnet.us.edu.pl/files/statsoft/STATISTIC
A_EN_10_0.zip
Neural networks in Statistica
• Classification analysis (creditRisk.sta)
• Regression analysis (cycling.sta)
Classification for creditRisk.sta
Custom neural network CNN
Increasing neurons from 11 to 20
Automated network search ANS
Regression for cycling.sta
Choosing the variables
5 different neural networks

NeuralNetwork Artificial Intellegence Material

  • 1.
  • 2.
    Genesis of ANN Neuralnetwork (artificial neural network) - the common name for mathematical structures and their software or hardware models, performing calculations or processing of signals through the rows of elements, called artificial neurons, performing a basic operation of your entrance. The original structure was inspired by the natural structure of neurons and neural systems, particularly the brain. The neural network is a type of computer system architecture. It consists of data processing by neurons arranged in layers. The corresponding results are obtained through the learning process, which involves modifying the weights of those neurons that are responsible for the error. The Definition of ANN
  • 3.
    Where are neuralnetworks being used? • Signal processing: suppress line noise, with adaptive echo canceling, blind source separation • Control: backing up a truck: cab position, rear position, and match with the dock get converted to steering instructions. Manufacturing plants for controlling automated machines. • Siemens successfully uses neural networks for process automation in basic industries, e.g., in rolling mill control more than 100 neural networks do their job, 24 hours a day • Robotics - navigation, vision recognition • Pattern recognition, i.e. recognizing handwritten characters, e.g. the current version of Apple's Newton uses a neural net • Medicine, storing medical records based on case information • Speech production: reading text aloud (NETtalk) • Vision: face recognition , edge detection, visual search engines • Business, rules for mortgage decisions are extracted from past decisions made by experienced evaluators, resulting in a network that has a high level of agreement with human experts. • Financial Applications: time series analysis, stock market prediction • Data Compression: speech signal, image, e.g. faces • Game Playing: chess, go, ...
  • 4.
    The history ofANN • 1943 - McCulloch and Pitts introduced the first neural network computing model. • 1950's - Rosenblatt's work resulted in a two-layer network, the perceptron, which was capable of learning certain classifications by adjusting connection weights. Although the perceptron was successful in classifying certain patterns, it had a number of limitations. The perceptron was not able to solve the classic XOR (exclusive or) problem. Such limitations led to the decline of the field of neural networks. However, the perceptron had laid foundations for later work in neural computing. • early 1980's -researchers showed renewed interest in neural networks. Recent work includes Boltzmann machines, Hopfield nets, competitive learning models, multilayer networks, and adaptive resonance theory models.
  • 5.
    Neural networks versusconventional computers • Neural networks take a different approach to problem solving than that of conventional computers. Conventional computers use an algorithmic approach i.e. the computer follows a set of instructions in order to solve a problem. • Computer can solve only the problem for which the specific steps that computer needs to follow are known. • Neural networks process information in a similar way the human brain does. Neural networks learn by example. They cannot be programmed to perform a specific task. The examples must be selected carefully otherwise useful time is wasted or even worse the network might be functioning incorrectly. The disadvantage is that because the network finds out how to solve the problem by itself, its operation can be unpredictable. • Conventional computers use a cognitive approach to problem solving; the way the problem is to solved must be known, then converted to a high level language program and into machine code that the computer can understand. These machines are totally predictable; if anything goes wrong is due to a software or hardware fault. • Neural networks do not perform miracles. But if used sensibly they can produce some amazing results.
  • 6.
    Neural networks inmedicine • Artificial Neural Networks (ANN) are currently a 'hot' research area in medicine and it is believed that they will receive extensive application to biomedical systems in the next few years. At the moment, the research is mostly on modelling parts of the human body and recognising diseases from various scans (e.g. cardiograms, CAT scans, ultrasonic scans, etc.). • Neural networks are ideal in recognising diseases using scans since there is no need to provide a specific algorithm on how to identify the disease. Neural networks learn by example so the details of how to recognise the disease are not needed. What is needed is a set of examples that are representative of all the variations of the disease. The quantity of examples is not as important as the 'quantity'. The examples need to be selected very carefully if the system is to perform reliably and efficiently.
  • 7.
    Biologically Inspired • Electro-chemicalsignals • Threshold output firing Axon Terminal Branches of Axon Dendrites
  • 8.
    The Perceptron • Binaryclassifier functions • Threshold activation function Axon Terminal Branches of Axon Dendrites S x1 x2 w1 w2 wn xn x3 w3
  • 9.
    The Perceptron: ThresholdActivation Function • Binary classifier functions • Threshold activation function Step Threshold
  • 10.
    Linear Activation functions •Output is scaled sum of inputs n N n n x w u y     1 Linear
  • 11.
    Nonlinear Activation Functions •Sigmoid Neuron unit function u hid e u y    1 1 ) ( Sigmoid
  • 12.
    • The abilityto learn is a fundamental trait of intelligence. • Although a precise definition of learning is difficult to formulate, a learning process in the ANN context can be viewed as the problem of updating network architecture and connection weights so that a network can efficiently perform a specific task. • The network usually must learn the connection weights from available training patterns. • Performance is improved over time by iteratively updating the weights in the network. • ANNs' ability to automatically learn from examples makes them attractive and exciting. • Instead of following a set of rules specified by human experts, ANNs appear to learn underlying rules (like input-output relationships) from the given collection of representative examples. This is one of the major advantages of neural networks over traditional expert systems.
  • 17.
    Learning – whatit means exactly ? • Learning is essential to most of neural network architectures. • Choice of a learning algorithm is a central issue in network development. • What is really meant by saying that a processing element learns? Learning implies that a processing unit is capable of changing its input/output behavior as a result of changes in the environment. Since the activation rule is usually fixed when the network is constructed and since the input/output vector cannot be changed, to change the input/output behavior the weights corresponding to that input vector need to be adjusted. A method is thus needed by which, at least during a training stage, weights can be modified in response to the input/output process. • In a neural network, learning can be supervised, in which the network is provided with the correct answer for the output during training, or unsupervised, in which no external teacher is present.
  • 18.
    At learning process… •At each training step the network computes the direction in which each bias and link value can be changed to calculate a more correct output. • The rate of improvement at that solution state is also known. A learning rate is user-designated in order to determine how much the link weights and node biases can be modified based on the change direction and change rate. • The higher the learning rate (max. of 1.0) the faster the network is trained. • However, the network has a better chance of being trained to a local minimum solution. A local minimum is a point at which the network stabilizes on a solution which is not the most optimal global solution.
  • 19.
    learning rules There arefour basic types of learning rules: • error correction, • Boltzmann, • Hebbian, • and competitive learning.
  • 20.
    parameters for qualitythe prediction • Hidden layers: Both the number of hidden layers and the number of nodes in each hidden layer can influence the quality of the results. For example, too few layers and/or nodes may not be adequate to sufficiently learn and too many may result in overtraining the network. • Number of cycles: A cycle is where a training example is presented and the weights are adjusted. • The number of examples that get presented to the neural network during the learning process can be set. The number of cycles should be set to ensure that the neural network does not overtrain. The number of cycles is often referred to as the number of epochs. • Learning rate: Prior to building a neural network, the learning rate should be set and this influences how fast the neural network learns.
  • 21.
    Neural Network topologies •In the previous section we discussed the properties of the basic processing unit in an artificial neural network. This section focuses on the pattern of connections between the units and the propagation of data. As for this pattern of connections, the main distinction we can make is between: • Feed-forward neural networks, where the data flow from input to output units is strictly feedforward. The data processing can extend over multiple (layers of) units, but no feedback connections are present, that is, connections extending from outputs of units to inputs of units in the same layer or previous layers. • Recurrent neural networks that do contain feedback connections. Contrary to feed-forward networks, the dynamical properties of the network are important. In some cases, the activation values of the units undergo a relaxation process such that the neural network will evolve to a stable state in which these activations do not change anymore. In other applications, the change of the activation values of the output neurons are significant, such that the dynamical behaviour constitutes the output of the neural network (Pearlmutter, 1990). • Classical examples of feed-forward neural networks are the Perceptron and Adaline. Examples of recurrent networks have been presented by Anderson (Anderson, 1977), Kohonen (Kohonen, 1977), and Hopfield (Hopfield, 1982) .
• Volume: 1400 cm³ • Area: 2000 cm² • Weight: 1.5 kg
• The cerebral cortex covering the hemispheres contains about 10¹⁰ neurons
• Number of connections between cells: about 10¹⁵
• The cells send and receive signals; speed of operation: about 10¹⁸ operations/sec
• The neural network is a simplified model of the brain! It is fault-tolerant; flexible (easily adapts to a changing environment); it learns, so it does not have to be programmed; it can deal with fuzzy, random, noisy or inconsistent information; it is parallel to a high degree; it is small, with very low power consumption.
Neurons and Synapses
The basic computational unit in the nervous system is the nerve cell, or neuron. A neuron has: 1. Dendrites (inputs) 2. Cell body 3. Axon (output). A neuron receives input from other neurons (typically many thousands). Inputs sum (approximately). Once the input exceeds a critical level, the neuron discharges a spike - an electrical pulse that travels from the body, down the axon, to the next neuron(s) (or other receptors). This spiking event is also called depolarization, and is followed by a refractory period, during which the neuron is unable to fire. The axon endings (output zone) almost touch the dendrites or cell body of the next neuron. Transmission of an electrical signal from one neuron to the next is effected by neurotransmitters, chemicals which are released from the first neuron and which bind to receptors in the second. This link is called a synapse. The extent to which the signal from one neuron is passed on to the next depends on many factors, e.g. the amount of neurotransmitter available, the number and arrangement of receptors, the amount of neurotransmitter reabsorbed, etc.
A Simple Artificial Neuron
The basic computational element (model neuron) is often called a node or unit. It receives input from some other units, or perhaps from an external source. Each input has an associated weight w, which can be modified so as to model synaptic learning. The unit computes some function f of the weighted sum of its inputs. Its output, in turn, can serve as input to other units.
• The weighted sum is called the net input to unit i, often written netᵢ. Note that wᵢⱼ refers to the weight from unit j to unit i (not the other way around).
• The function f is the unit's activation function.
• In the simplest case, f is the identity function, and the unit's output is just its net input. This is called a linear unit.
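The unit described above can be sketched in a few lines; this is a minimal illustration, not tied to any particular library:

```python
def neuron_output(inputs, weights, f=lambda s: s):
    """Model neuron: compute the net input (the weighted sum of the
    inputs) and pass it through the activation function f. With the
    identity function as f, this is the linear unit described above."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return f(net)

# A linear unit simply outputs its net input: 0.5*1 + (-0.1)*(-1) + (-0.2)*2 = 0.2
y = neuron_output([1.0, -1.0, 2.0], [0.5, -0.1, -0.2])
```

Passing a different `f` (e.g. a sigmoid) turns the same unit into a nonlinear neuron.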
Features of an intelligent system
The ability to learn from examples and to generalize the acquired knowledge to solve problems posed in a new context:
• the ability to create rules (associations) binding together the separate elements of the system (object);
• the ability to recognize objects (image features) on the basis of incomplete information.
Data classification
Data classification is one of the main tasks performed using neural networks. What is it about? The purpose of classification is to assign an object to a certain category based on its characteristics.
Where do we use ANNs?
• NO: for calculations, multiplication tables, word processing, etc. - applications where a well-known algorithm can easily be used.
• YES: where an algorithmic procedure is very difficult to formulate, where data are incomplete or inaccurate, where the phenomena under study are non-linear, etc.; where there is a lot of data but the methods governing the results are not yet known.
Artificial neuron schema
The inputs are fed signals from the network's input layer or from the previous layer of neurons. Each signal is multiplied by a corresponding numerical value called a weight. The weight affects how the input signal is perceived and what part it plays in creating the neuron's output. A weight can be excitatory (positive) or inhibitory (negative); if there is no connection between two neurons, the weight is zero. The sum of the products of signals and weights is the argument of the neuron's activation function.
A simplified model of a neuron, showing its similarity to the natural model.
Formula that describes the neuron's working:
y = f(s), where s = Σᵢ₌₀ⁿ xᵢ·wᵢ
Approximation of a function: the principal aim is to approximate a given function (in other words: to learn the desired function by observing examples of its operation).
Number of layers: zero, one, or more than one.
Prediction
Input: X1, X2, X3; Output: Y; Model: Y = f(X1, X2, X3)
Activation function: f(x) = eˣ / (1 + eˣ)
Inputs: X1 = 1, X2 = -1, X3 = 2
Hidden node 1 (weights 0.5, -0.1, -0.2): net = 0.5·1 - 0.1·(-1) - 0.2·2 = 0.2; f(0.2) = e^0.2 / (1 + e^0.2) = 0.55
Hidden node 2 (weights 0.1, 0.6, 0.7): net = 0.1·1 + 0.6·(-1) + 0.7·2 = 0.9; f(0.9) = 0.71
Output node (weights 0.1, -0.2): net = 0.1·0.55 - 0.2·0.71 = -0.087; f(-0.087) = 0.478
Prediction: Y = 0.478. If the actual Y = 2, then the prediction error = (2 - 0.478) = 1.522.
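The walkthrough above can be reproduced in code. The assignment of weights to particular connections is inferred here so that every intermediate value on the slide is matched; treat it as a sketch of the forward pass rather than the slide's exact diagram:

```python
import math

def f(x):
    """Activation used on the slide: f(x) = e^x / (1 + e^x)."""
    return math.exp(x) / (1 + math.exp(x))

x = [1, -1, 2]                  # X1, X2, X3
w_hidden = [[0.5, -0.1, -0.2],  # weights into hidden node 1
            [0.1,  0.6,  0.7]]  # weights into hidden node 2 (inferred)
w_out = [0.1, -0.2]             # weights into the output node

# Forward pass: hidden activations, then the output activation.
h = [f(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
y = f(sum(w * hi for w, hi in zip(w_out, h)))
error = 2 - y                   # desired output Y = 2
```

Running this reproduces the slide's values: h ≈ [0.55, 0.71], Y ≈ 0.478, error ≈ 1.522.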
Backpropagation
One of the most popular techniques used in the learning process of an ANN.
Learning process
1. Randomly choose one of the observations.
2. Go through the appropriate procedure to determine the output value.
3. Compare the desired value with the value actually obtained from the network.
4. Adjust the weights by calculating the error.
How to calculate the prediction error?
Errorᵢ = Actualᵢ − Outputᵢ
where:
• Errorᵢ is the error from the i-th node,
• Outputᵢ is the value predicted by the network,
• Actualᵢ is the real value (which the network should learn).
Changing the weights
L is the so-called learning rate of the network (its values are usually taken from [0, 1]). The smaller this coefficient is, the slower the learning process. Often this ratio is set to its highest value initially, and then reduced as the network's weights are re-adjusted.
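A decaying learning rate of the kind described can be sketched as follows; the exponential schedule and its constants are illustrative assumptions, not from the slides:

```python
def learning_rate(epoch, initial=1.0, decay=0.9):
    """Hypothetical schedule: start at the highest value and reduce
    the rate each time the network's weights are re-adjusted."""
    return initial * decay ** epoch

# The rate shrinks monotonically: 1.0, 0.9, 0.81, 0.729, ...
rates = [learning_rate(e) for e in range(4)]
```

Large early steps let the network move quickly toward a good region of weight space; smaller later steps fine-tune without overshooting.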
Example „step by step”
• Input layer: A, B, C
• Hidden layer (1): D, E
• Output layer: F
For better understanding, the backpropagation learning algorithm can be divided into two phases: propagation and weight update.
Phase 1: Propagation, which involves the following steps:
• Forward propagation of a training pattern's input through the neural network in order to generate the propagation's output activations.
• Backward propagation of the propagation's output activations through the neural network, using the training pattern's target, in order to generate the deltas of all output and hidden neurons.
Phase 2: Weight update. For each weight-synapse:
• Multiply its output delta and input activation to get the gradient of the weight.
• Bring the weight in the opposite direction of the gradient by subtracting a ratio of it from the weight. This ratio influences the speed and quality of learning; it is called the learning rate. The sign of the gradient of a weight indicates where the error is increasing; this is why the weight must be updated in the opposite direction.
Repeat phases 1 and 2 until the performance of the network is good enough.
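Phase 2 for a single weight can be written directly from the description above; the numeric values in the usage line are illustrative only:

```python
def update_weight(weight, input_activation, output_delta, learning_rate):
    """The gradient of a weight is (output delta) x (input activation);
    the weight is moved in the opposite direction of the gradient by
    subtracting a fraction (the learning rate) of the gradient."""
    gradient = output_delta * input_activation
    return weight - learning_rate * gradient

# Illustrative values: a negative delta makes the weight increase.
w_new = update_weight(weight=0.5, input_activation=0.55,
                      output_delta=-1.522, learning_rate=0.1)
```

Repeating this update for every weight-synapse after each propagation pass is exactly the loop of phases 1 and 2.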
The size of an ANN?
• A big NN: a few thousand neurons, or even more.
• The number of neurons should depend on the task the network is meant to perform.
• The power of the network depends on the number of neurons, the density of the connections between them, and on properly chosen weight values.
How many hidden layers should there be?
• The number of hidden layers is usually not higher than 2. The fusion of the network's signals takes place in the hidden layers.
• The input layer is usually responsible only for the initial preparation of the input data.
• The output layer is responsible for aggregating the final signals of the hidden-layer neurons and for presenting the final result of the network at the outputs of its neurons, which are at the same time the outputs of the whole network.
Advantages of ANNs
1. They can work well with incomplete information.
2. They do not require knowledge of an algorithm solving the problem (automatic learning).
3. They process information in a highly parallel way.
4. They can generalize (extend to unknown cases).
5. They are resistant to partial damage.
6. They can act as associative memory (associative - like working memory in humans), as opposed to the addressable memory typical of classical computers.
Advantages:
• A neural network can perform tasks that a linear program cannot.
• When an element of the neural network fails, the network can continue without any problem thanks to its parallel nature.
• A neural network learns and does not need to be reprogrammed.
• It can be applied to a wide range of applications.
Disadvantages:
• The neural network needs training to operate.
• The architecture of a neural network is different from the architecture of microprocessors; therefore it needs to be emulated.
• Large neural networks require high processing time.
Advantages / disadvantages
Neural networks have a number of advantages:
• Linear and nonlinear models: complex linear and nonlinear relationships can be derived using neural networks.
• Flexible input/output: neural networks can operate using one or more descriptors and/or response variables. They can also be used with categorical and continuous data.
• Noise: neural networks are less sensitive to noise than statistical regression models.
The major drawbacks of neural networks are:
• Black box: it is not possible to explain how the results were calculated in any meaningful way.
• Optimizing parameters: there are many parameters to be set in a neural network, and optimizing the network can be challenging, especially to avoid overtraining.
How many hidden layers do we need?
• The number of hidden layers is usually at most 2. The user should decide how many hidden layers to use and how many neurons each of them will contain.
• The size of the input layer usually equals the number of input variables (the number of conditional attributes in the data set).
• The number of neurons in the output layer depends on the type of the problem (regression, or classification into categories).
• The more neurons in the hidden layer, the more memory the NN requires.
• Too many neurons can overtrain the network, making it fit the training set too closely while performing badly on new, unknown data.
• If you notice overtraining in your neural network, you should consider decreasing the number of neurons.
• Regression type of problem: inputs such as area, garage, age, heating, which floor, … ; as output there will be the estimated market price (e.g. 317 $).
• Classification into categories: inputs such as income, insurance, age, marital status, employment, … ; as output there will be a decision about granting the credit or not (e.g. Yes).
Categorical data is a problem… unless…
• continent: {Asia, Europe, America} - 3 neurons are necessary:
• Asia 1 0 0
• Europe 0 1 0
• America 0 0 1
• One variable, „continent”, creates 3 neurons!
• In such cases it is better to consider merging some values into a smaller number of categories.
• Usually the number of weights should be about 10 times smaller than the number of cases in the training data set.
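The one-of-N encoding shown above is easy to sketch; this minimal helper is an illustration, not part of any particular tool:

```python
def one_hot(value, categories):
    """Encode one categorical value as one input neuron per category."""
    return [1 if value == c else 0 for c in categories]

continents = ["Asia", "Europe", "America"]
encoded = one_hot("Europe", continents)  # one variable -> 3 neurons
```

Each extra category adds another input neuron (and thus more weights), which is why merging rare categories keeps the network small.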
• The STATISTICA line of software provides a comprehensive and integrated set of tools and solutions for:
• Data analysis and reporting, data mining and predictive modeling, business intelligence, simple and multivariate QC, process monitoring, analytic optimization, simulation, and for applying a large number of statistical and other analytic techniques to address routine and advanced data analysis needs.
• Data visualization, graphical data analysis, visual data mining, visual querying, and simple and advanced scientific and business graphing; in fact, STATISTICA has been acknowledged as the “king of data visualization software” (by the editors of "PC Graphics & Video").
Install Statistica 10 EN
• http://usnet.us.edu.pl/files/statsoft/STATISTICA_EN_10_0.zip
Neural networks in Statistica
• Classification analysis (creditRisk.sta)
• Regression analysis (cycling.sta)