Artificial neural network
Genesis of ANN
Neural network (artificial neural network) - the common name for
mathematical structures, and their software or hardware models, that
perform computations or signal processing through arrays of elements,
called artificial neurons, each performing a basic operation on its
input. The original structure was inspired by the natural structure of
neurons and neural systems, particularly the brain.
The neural network is a type of computer system architecture in which
data are processed by neurons arranged in layers. Correct results
are obtained through the learning process, which involves modifying the
weights of those neurons that are responsible for the error.
The Definition of ANN
Where are neural networks being used?
• Signal processing: suppressing line noise, adaptive echo cancellation, blind source
separation
• Control: backing up a truck: cab position, rear position, and match with the dock get
converted to steering instructions. Manufacturing plants for controlling automated
machines.
• Siemens successfully uses neural networks for process automation in basic industries,
e.g., in rolling mill control more than 100 neural networks do their job, 24 hours a day
• Robotics - navigation, vision recognition
• Pattern recognition, e.g. recognizing handwritten characters; the current version of
Apple's Newton uses a neural net
• Medicine, storing medical records based on case information
• Speech production: reading text aloud (NETtalk)
• Vision: face recognition, edge detection, visual search engines
• Business, rules for mortgage decisions are extracted from past decisions made by
experienced evaluators, resulting in a network that has a high level of agreement with
human experts.
• Financial Applications: time series analysis, stock market prediction
• Data Compression: speech signal, image, e.g. faces
• Game Playing: chess, go, ...
The history of ANN
• 1943 - McCulloch and Pitts introduced the first neural network
computing model.
• 1950's - Rosenblatt's work resulted in a two-layer network, the
perceptron, which was capable of learning certain classifications by
adjusting connection weights. Although the perceptron was
successful in classifying certain patterns, it had a number of
limitations. The perceptron was not able to solve the classic XOR
(exclusive or) problem. Such limitations led to the decline of the
field of neural networks. However, the perceptron had laid
foundations for later work in neural computing.
• early 1980's – researchers showed renewed interest in neural
networks. Recent work includes Boltzmann machines, Hopfield
nets, competitive learning models, multilayer networks, and
adaptive resonance theory models.
Neural networks versus conventional computers
• Neural networks take a different approach to problem solving than that of
conventional computers. Conventional computers use an algorithmic approach
i.e. the computer follows a set of instructions in order to solve a problem.
• A conventional computer can solve only those problems for which the specific
steps it needs to follow are known.
• Neural networks process information in a similar way the human brain does.
Neural networks learn by example. They cannot be programmed to perform a
specific task. The examples must be selected carefully, otherwise useful time is
wasted or, even worse, the network might function incorrectly. The
disadvantage is that because the network finds out how to solve the problem
by itself, its operation can be unpredictable.
• Conventional computers use a cognitive approach to problem solving; the way
the problem is to be solved must be known, then converted into a high-level
language program and into machine code that the computer can understand.
These machines are totally predictable; if anything goes wrong, it is due to a
software or hardware fault.
• Neural networks do not perform miracles. But if used sensibly they can produce
some amazing results.
Neural networks in medicine
• Artificial Neural Networks (ANN) are currently a 'hot' research area in
medicine and it is believed that they will receive extensive application to
biomedical systems in the next few years. At the moment, the research is
mostly on modelling parts of the human body and recognising diseases from
various scans (e.g. cardiograms, CAT scans, ultrasonic scans, etc.).
• Neural networks are ideal in recognising diseases using scans since there is no
need to provide a specific algorithm on how to identify the disease. Neural
networks learn by example so the details of how to recognise the disease are
not needed. What is needed is a set of examples that are representative of all
the variations of the disease. The quantity of examples is not as important as
their quality. The examples need to be selected very carefully if the system is
to perform reliably and efficiently.
Biologically Inspired
• Electro-chemical signals
• Threshold output firing
(Figure: a biological neuron – dendrites, cell body, axon, and terminal branches of the axon)
The Perceptron
• Binary classifier functions
• Threshold activation function
(Figure: perceptron schematic – inputs x1, x2, x3, …, xn with weights w1, w2, w3, …, wn feeding a summation unit Σ)
The Perceptron: Threshold Activation
Function
• Binary classifier functions
• Threshold activation function
Step Threshold
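A minimal sketch of such a perceptron in Python (the weights and threshold below are illustrative values, not taken from the text):

```python
# Perceptron: a binary classifier with a step (threshold) activation function.

def perceptron(inputs, weights, threshold):
    """Fire (output 1) when the weighted sum of inputs reaches the threshold."""
    s = sum(w * x for w, x in zip(weights, inputs))
    return 1 if s >= threshold else 0

# Example: a 2-input perceptron implementing logical AND.
and_weights = [1.0, 1.0]
and_threshold = 1.5
print(perceptron([1, 1], and_weights, and_threshold))  # 1
print(perceptron([1, 0], and_weights, and_threshold))  # 0
```

Note that no single unit of this kind can compute XOR, which is exactly the limitation mentioned in the history section.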
Linear Activation functions
• Output is scaled sum of inputs
y = u = Σ_{n=1}^{N} w_n x_n    (linear)
Nonlinear Activation Functions
• Sigmoid Neuron unit function
y(u) = 1 / (1 + e^(−u))    (sigmoid)
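Both activation functions can be written directly in code; a short sketch:

```python
import math

def linear(u):
    # Linear activation: the output is just the (scaled) weighted sum itself.
    return u

def sigmoid(u):
    # Sigmoid activation: squashes any real net input into the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-u))

print(sigmoid(0.0))    # 0.5
print(sigmoid(10.0))   # close to 1
print(sigmoid(-10.0))  # close to 0
```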
• The ability to learn is a fundamental trait of intelligence.
• Although a precise definition of learning is difficult to formulate, a
learning process in the ANN context can be viewed as the problem
of updating network architecture and connection weights so that a
network can efficiently perform a specific task.
• The network usually must learn the connection weights from
available training patterns.
• Performance is improved over time by iteratively updating the
weights in the network.
• ANNs' ability to automatically learn from examples makes them
attractive and exciting.
• Instead of following a set of rules specified by human experts, ANNs
appear to learn underlying rules (like input-output relationships)
from the given collection of representative examples. This is one of
the major advantages of neural networks over traditional expert
systems.
Learning – what does it mean exactly?
• Learning is essential to most of neural network architectures.
• Choice of a learning algorithm is a central issue in network
development.
• What is really meant by saying that a processing element learns?
Learning implies that a processing unit is capable of changing its
input/output behavior as a result of changes in the environment.
Since the activation rule is usually fixed when the network is
constructed and since the input/output vector cannot be changed,
to change the input/output behavior the weights corresponding to
that input vector need to be adjusted. A method is thus needed by
which, at least during a training stage, weights can be modified in
response to the input/output process.
• In a neural network, learning can be supervised, in which the
network is provided with the correct answer for the output during
training, or unsupervised, in which no external teacher is present.
During the learning process…
• At each training step the network computes the direction in
which each bias and link value can be changed to calculate a
more correct output.
• The rate of improvement at that solution state is also known.
A learning rate is user-designated in order to determine how
much the link weights and node biases can be modified based
on the change direction and change rate.
• The higher the learning rate (max. 1.0), the faster the
network is trained.
• However, with a high rate the network is more likely to be
trained to a local minimum solution. A local minimum is a point at
which the network stabilizes on a solution that is not the
optimal global solution.
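The effect of the learning rate can be sketched on a one-dimensional error surface (an illustrative example, not from the text): each training step moves a weight against the gradient, scaled by the learning rate.

```python
# Gradient descent on the error surface E(w) = (w - 3)^2,
# whose single minimum is at w = 3.

def gradient(w):
    # Derivative dE/dw of E(w) = (w - 3)^2.
    return 2 * (w - 3)

def descend(w, learning_rate, steps):
    # Repeatedly move w against the gradient, scaled by the learning rate.
    for _ in range(steps):
        w = w - learning_rate * gradient(w)
    return w

print(round(descend(0.0, 0.1, 50), 4))  # 3.0 (converges to the minimum)
```

On surfaces with several minima the same procedure can stabilize in a local minimum, as described above.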
Learning rules
There are four basic types of learning rules:
• error correction,
• Boltzmann,
• Hebbian,
• and competitive learning.
Parameters for the quality of prediction
• Hidden layers: Both the number of hidden layers and the number of nodes in
each hidden layer can influence the quality of the results. For example, too
few layers and/or nodes may not be adequate to sufficiently learn and too
many may result in overtraining the network.
• Number of cycles: A cycle is one pass in which a training example is presented
and the weights are adjusted.
• The number of examples that get presented to the neural network during the
learning process can be set. The number of cycles should be set to ensure that
the neural network does not overtrain. The number of cycles is often referred
to as the number of epochs.
• Learning rate: Prior to building a neural network, the learning rate should be
set and this influences how fast the neural network learns.
Neural Network topologies
• In the previous section we discussed the properties of the basic processing unit
in an artificial neural network. This section focuses on the pattern of
connections between the units and the propagation of data. As for this pattern
of connections, the main distinction we can make is between:
• Feed-forward neural networks, where the data flow from input to output units
is strictly feedforward. The data processing can extend over multiple (layers of)
units, but no feedback connections are present, that is, connections extending
from outputs of units to inputs of units in the same layer or previous layers.
• Recurrent neural networks that do contain feedback connections. Contrary to
feed-forward networks, the dynamical properties of the network are important.
In some cases, the activation values of the units undergo a relaxation process
such that the neural network will evolve to a stable state in which these
activations do not change anymore. In other applications, the change of the
activation values of the output neurons are significant, such that the dynamical
behaviour constitutes the output of the neural network (Pearlmutter, 1990).
• Classical examples of feed-forward neural networks are the Perceptron and
Adaline. Examples of recurrent networks have been presented by Anderson
(Anderson, 1977), Kohonen (Kohonen, 1977), and Hopfield (Hopfield, 1982) .
• Volume: 1400 cm^3
• Area: 2000 cm^2
• Weight: 1.5 kg
• The hemispheres of the cerebral cortex contain about 10^10 neurons
• The number of connections between cells: 10^15
• The cells send and receive signals; speed of operation: 10^18 operations / sec
• The neural network is a simplified model of the brain!
• Fault-tolerant;
• Flexible – easily adapts to a changing environment;
• Learns – does not have to be programmed;
• Can deal with fuzzy, random, noisy, or inconsistent information;
• Highly parallel;
• Small, with very low power consumption.
Neurons and Synapses
The basic computational unit in the nervous system is the nerve cell, or
neuron. A neuron has:
1. Dendrites (inputs)
2. Cell body
3. Axon (output)
A neuron receives input from other neurons (typically many thousands).
Inputs sum (approximately). Once input exceeds a critical level, the
neuron discharges a spike - an electrical pulse that travels from the body,
down the axon, to the next neuron(s) (or other receptors). This spiking
event is also called depolarization, and is followed by a refractory period,
during which the neuron is unable to fire.
The axon endings (Output Zone) almost touch the dendrites or cell body of the next
neuron. Transmission of an electrical signal from one neuron to the next is effected by
neurotransmitters, chemicals which are released from the first neuron and which bind to
receptors in the second. This link is called a synapse. The extent to which the signal from
one neuron is passed on to the next depends on many factors, e.g. the amount of
neurotransmitter available, the number and arrangement of receptors, the amount of
neurotransmitter reabsorbed, etc.
A Simple Artificial Neuron
Basic computational element (model neuron) is often called a node or unit.
It receives input from some other units, or perhaps from an external
source.
Each input has an associated weight w, which can be modified so as to
model synaptic learning. The unit computes some function f of the
weighted sum of its inputs.
Its output, in turn, can serve as input to other units.
• The weighted sum is called the net input to unit i, often written neti.
Note that wij refers to the weight from unit j to unit i (not the other way
around).
• The function f is the unit's activation function.
• In the simplest case, f is the identity function, and the unit's output is
just its net input. This is called a linear unit.
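The unit described above can be sketched in a few lines (the names are illustrative):

```python
# Model neuron: the net input is the weighted sum of the inputs,
# and the activation function f maps the net input to the output.

def unit_output(inputs, weights, f=lambda net: net):
    net = sum(w * x for w, x in zip(weights, inputs))
    return f(net)

# With f as the identity function this is the linear unit from the text:
print(unit_output([1.0, 2.0], [0.5, 0.25]))  # 0.5*1.0 + 0.25*2.0 = 1.0
```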
Features of an intelligent system
The ability to learn from examples and to generalize the acquired knowledge to solve
problems posed in a new context:
• the ability to create rules (associations) binding together the separate elements
of the system (object);
• the ability to recognize objects (image features) on the basis of incomplete
information.
Data classification is one of the main tasks performed using neural networks.
What is it about?
The purpose of classification is to assign an object, based on its
characteristics, to a certain category.
Data classification
Where do we use the ANN?
• NO:
for calculations, multiplication tables, word processing, etc. – applications
where a well-known algorithm can easily be used.
• YES:
where an algorithmic procedure is very difficult to devise, where data are
incomplete or inaccurate, where the studied phenomena are non-linear, and
where there is a lot of data but the methods behind the results are not yet
known.
Artificial Neuron schema:
The inputs are fed signals from the neurons of the network's input layer or of
the previous layer. Each signal is multiplied by a corresponding
numerical value called a weight. The weight affects how the input
signal is perceived and what part it plays in forming the neuron's output.
A weight can be excitatory (positive) or inhibitory (negative);
if there is no connection between two neurons, the weight is zero. The summed
products of signals and weights form the argument of the neuron's
activation function.
A simplified model of a neuron, showing its similarity to the natural model
Formula that describes how the neuron works:

y = f(s), where s = Σ_{i=0}^{n} x_i w_i
The principal aim is to approximate a given function (in other words: to learn the desired
function by observing examples of its operation).
Function approximation
Number of hidden layers: zero, one, or more than one.
Prediction
Input: X1, X2, X3; Output: Y; Model: Y = f(X1, X2, X3)
A worked forward pass for inputs X1 = 1, X2 = −1, X3 = 2, with two hidden units (A, B)
and one output unit:
• Hidden unit A: net input 0.2 = 0.5 · 1 − 0.1 · (−1) − 0.2 · 2, so output f(0.2) = 0.55
• Hidden unit B: net input 0.9, so output f(0.9) = 0.71
• Output unit: net input −0.087 (the weighted sum of 0.55 and 0.71), so output
f(−0.087) = 0.478
Prediction: Y = 0.478
If the actual value is Y = 2, then the prediction error = (2 − 0.478) = 1.522
The activation function is f(x) = e^x / (1 + e^x), e.g. f(0.2) = e^0.2 / (1 + e^0.2) = 0.55
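The forward pass of this example can be reproduced in code. Only the weights into hidden unit A (0.5, −0.1, −0.2) are given explicitly; the weights into hidden unit B and into the output unit below are assumptions chosen to reproduce the net inputs 0.9 and −0.087 quoted in the example.

```python
import math

def f(x):
    # Activation from the example: f(x) = e^x / (1 + e^x), the logistic sigmoid.
    return math.exp(x) / (1 + math.exp(x))

x = [1, -1, 2]              # inputs X1, X2, X3
w_a = [0.5, -0.1, -0.2]     # weights into hidden unit A (from the example)
w_b = [0.6, -0.1, 0.1]      # assumed weights into hidden unit B (net input 0.9)
w_out = [0.1, -0.2]         # assumed weights into the output unit

net_a = sum(w * xi for w, xi in zip(w_a, x))      # 0.2
net_b = sum(w * xi for w, xi in zip(w_b, x))      # 0.9
h = [f(net_a), f(net_b)]                          # about [0.55, 0.71]
net_out = sum(w * hi for w, hi in zip(w_out, h))  # about -0.087
y = f(net_out)

print(round(y, 3))  # 0.478, matching the prediction in the example
```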
Backpropagation
• One of the most popular techniques in learning processes for ANN.
Learning process:
1. Randomly choose one of the observations.
2. Go through the appropriate procedures to determine the output value.
3. Compare the desired value with the one actually obtained from the network.
4. Adjust the weights by calculating the error.
How to calculate the prediction error?
Errorᵢ = Actualᵢ − Outputᵢ
where:
• Errorᵢ is the error for the i-th node,
• Outputᵢ is the value predicted by the network,
• Actualᵢ is the real value (which the network should learn).
Change the weights
L is the so-called network learning rate (its values are usually in
[0, 1]). The smaller the value of this coefficient, the slower the
learning process.
Often this ratio is set to its highest value initially and then
reduced as the network weights are re-adjusted.
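A sketch of a single weight change. The exact update formula is not recoverable from the text, so the common delta-rule form, new weight = old weight + L · Error · Input, is used here as an assumption:

```python
# Delta-rule weight update, scaled by the learning rate L (an assumed form).
def update_weight(old_weight, L, error, input_value):
    return old_weight + L * error * input_value

# E.g. with the prediction error 1.522 computed earlier and an input of 1.0:
w = update_weight(0.5, L=0.1, error=1.522, input_value=1.0)
print(round(w, 4))  # 0.6522
```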
Example
Example „step by step”
• 1 hidden layer: D, E
• Input layer: A, B, C
• Output layer: F
Randomly choosing one observation
etc.…
For better understanding…
the backpropagation learning algorithm can be divided into two phases:
propagation and weight update.
Phase 1: Propagation which involves the following steps:
• Forward propagation of a training pattern's input through the neural network
in order to generate the propagation's output activations.
• Backward propagation of the propagation's output activations through the
neural network using the training pattern's target in order to generate the
deltas of all output and hidden neurons.
Phase 2: Weight update
For each weight-synapse:
• Multiply its output delta and input activation to get the gradient of the weight.
• Move the weight in the opposite direction of the gradient by subtracting a fraction of
it from the weight.
• This fraction influences the speed and quality of learning; it is called the learning
rate. The sign of the gradient of a weight indicates the direction in which the error is
increasing, which is why the weight must be updated in the opposite direction.
• Repeat phases 1 and 2 until the performance of the network is good enough.
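The two phases above can be combined into a small, self-contained training sketch. The network shape (3 inputs, 2 sigmoid hidden units, 1 sigmoid output) and the training pattern are illustrative, not from the text.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

random.seed(0)
# Random initial weights for a 3-2-1 network.
w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
w_output = [random.uniform(-1, 1) for _ in range(2)]
rate = 0.5                         # learning rate
x, target = [1.0, -1.0, 2.0], 1.0  # one illustrative training pattern

def forward(x):
    h = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hidden]
    y = sigmoid(sum(w * hi for w, hi in zip(w_output, h)))
    return h, y

for _ in range(200):
    # Phase 1: forward pass, then deltas for the output and hidden units.
    h, y = forward(x)
    delta_out = (target - y) * y * (1 - y)
    delta_hid = [h[j] * (1 - h[j]) * w_output[j] * delta_out for j in range(2)]
    # Phase 2: move each weight along its gradient, scaled by the learning rate.
    for j in range(2):
        w_output[j] += rate * delta_out * h[j]
        for i in range(3):
            w_hidden[j][i] += rate * delta_hid[j] * x[i]

h, y = forward(x)
print(abs(target - y) < 0.2)  # the error on the training pattern has shrunk
```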
The size of an ANN
• Big NNs: a few thousand neurons, or even more.
• The number of neurons should depend on the type of task the
network performs.
• The power of the network depends on the number of neurons,
the density of the connections between them, and on properly
chosen weight values.
How many hidden layers should there be?
• The number of hidden layers is usually not higher than 2. The
hidden layers are where the network's signals are combined.
• The input layer is usually responsible only for the initial
preparation of input data.
• The output layer is responsible for aggregating the final outputs
of the hidden-layer neurons and presenting the final result of the
network at the outputs of its neurons, which are at the same time
the outputs of the whole network.
Advantages of ANN
1. They can work fine in case of incomplete information
2. They do not require knowledge of the algorithm solving the
problem (automatic learning)
3. Process information in a highly parallel way
4. They can generalize (extend to unknown cases)
5. They are resistant to partial damage
6. They can perform associative memory (associative - like working
memory in humans) as opposed to addressable memory (typical
for classical computers)
• Advantages:
• A neural network can perform tasks that a linear program cannot.
• When an element of the neural network fails, it can continue
without any problem thanks to its parallel nature.
• A neural network learns and does not need to be reprogrammed.
• It can be implemented in any application.
• It can be implemented without any problem.
• Disadvantages:
• The neural network needs training to operate.
• The architecture of a neural network is different from the
architecture of microprocessors and therefore needs to be emulated.
• High processing time is required for large neural networks.
Advantages / disadvantages
Neural networks have a number of advantages:
• Linear and nonlinear models: Complex linear and nonlinear
relationships can be derived using neural networks.
• Flexible input/output: Neural networks can operate using one or
more descriptors and/or response variables. They can also be used
with categorical and continuous data.
• Noise: Neural networks are less sensitive to noise than statistical
regression models.
The major drawbacks with neural networks are:
• Black box: It is not possible to explain how the results were
calculated in any meaningful way.
• Optimizing parameters: There are many parameters to be set in a
neural network and optimizing the network can be challenging,
especially to avoid overtraining.
How many hidden layers do we need?
• The number of hidden layers is usually at most 2.
The user should decide how many hidden layers there
will be, and how many neurons in each of them.
• The size of the input layer usually equals the number of
input data (the number of conditional attributes in the
data set).
• The number of neurons in the output layer
depends on the type of classification problem
(regression, or classification into categories).
• The more neurons in the hidden layer, the more
memory the NN needs.
• More neurons can make the classification process
overtrained: too good for the training set but too bad
for new, unknown data.
• If you notice overtraining in your neural network,
you should consider decreasing the number of neurons.
• Regression type of classification: inputs such as area, garage, age, heating,
location, floor, …; the output is the estimated market price (e.g. 317 $).
• Classification into categories: inputs such as income, insurance, age, marital
status, employment, …; the output is a decision about granting the credit or
not (e.g. Yes).
Categorical data is a problem…unless…
• continent: {Asia, Europe, America}
3 neurons are necessary:
• Asia 1 0 0
• Europe 0 1 0
• America 0 0 1
• One variable, „continent”, creates 3 neurons!
• For such cases it is better to consider merging some values into a smaller
number of categories.
• Usually the number of weights should be 10 times smaller than the
number of cases in the training data set.
• The STATISTICA line of software provides a comprehensive
and integrated set of tools and solutions for:
• Data analysis and reporting, data mining and predictive
modeling, business intelligence, simple and multivariate
QC, process monitoring, analytic optimization, simulation,
and for applying a large number of statistical and other
analytic techniques to address routine and advanced data
analysis needs
• Data visualization, graphical data analysis, visual data
mining, visual querying, and simple and advanced scientific
and business graphing; in fact, STATISTICA has been
acknowledged as the “king of data visualization software”
(by the editors of " PC Graphics & Video")
Install Statistica 10 EN
• http://usnet.us.edu.pl/files/statsoft/STATISTIC
A_EN_10_0.zip
Neural networks in Statistica
• Classification analysis (creditRisk.sta)
• Regression analysis (cycling.sta)
Classification for creditRisk.sta
Custom neural network CNN
Increasing neurons from 11 to 20
Automated network search ANS
Regression for cycling.sta
Choosing the variables
5 different neural networks

NeuralNetwork Artificial Intellegence Material

  • 1.
  • 2.
    Genesis of ANN Neuralnetwork (artificial neural network) - the common name for mathematical structures and their software or hardware models, performing calculations or processing of signals through the rows of elements, called artificial neurons, performing a basic operation of your entrance. The original structure was inspired by the natural structure of neurons and neural systems, particularly the brain. The neural network is a type of computer system architecture. It consists of data processing by neurons arranged in layers. The corresponding results are obtained through the learning process, which involves modifying the weights of those neurons that are responsible for the error. The Definition of ANN
  • 3.
    Where are neuralnetworks being used? • Signal processing: suppress line noise, with adaptive echo canceling, blind source separation • Control: backing up a truck: cab position, rear position, and match with the dock get converted to steering instructions. Manufacturing plants for controlling automated machines. • Siemens successfully uses neural networks for process automation in basic industries, e.g., in rolling mill control more than 100 neural networks do their job, 24 hours a day • Robotics - navigation, vision recognition • Pattern recognition, i.e. recognizing handwritten characters, e.g. the current version of Apple's Newton uses a neural net • Medicine, storing medical records based on case information • Speech production: reading text aloud (NETtalk) • Vision: face recognition , edge detection, visual search engines • Business, rules for mortgage decisions are extracted from past decisions made by experienced evaluators, resulting in a network that has a high level of agreement with human experts. • Financial Applications: time series analysis, stock market prediction • Data Compression: speech signal, image, e.g. faces • Game Playing: chess, go, ...
  • 4.
    The history ofANN • 1943 - McCulloch and Pitts introduced the first neural network computing model. • 1950's - Rosenblatt's work resulted in a two-layer network, the perceptron, which was capable of learning certain classifications by adjusting connection weights. Although the perceptron was successful in classifying certain patterns, it had a number of limitations. The perceptron was not able to solve the classic XOR (exclusive or) problem. Such limitations led to the decline of the field of neural networks. However, the perceptron had laid foundations for later work in neural computing. • early 1980's -researchers showed renewed interest in neural networks. Recent work includes Boltzmann machines, Hopfield nets, competitive learning models, multilayer networks, and adaptive resonance theory models.
  • 5.
    Neural networks versusconventional computers • Neural networks take a different approach to problem solving than that of conventional computers. Conventional computers use an algorithmic approach i.e. the computer follows a set of instructions in order to solve a problem. • Computer can solve only the problem for which the specific steps that computer needs to follow are known. • Neural networks process information in a similar way the human brain does. Neural networks learn by example. They cannot be programmed to perform a specific task. The examples must be selected carefully otherwise useful time is wasted or even worse the network might be functioning incorrectly. The disadvantage is that because the network finds out how to solve the problem by itself, its operation can be unpredictable. • Conventional computers use a cognitive approach to problem solving; the way the problem is to solved must be known, then converted to a high level language program and into machine code that the computer can understand. These machines are totally predictable; if anything goes wrong is due to a software or hardware fault. • Neural networks do not perform miracles. But if used sensibly they can produce some amazing results.
  • 6.
    Neural networks inmedicine • Artificial Neural Networks (ANN) are currently a 'hot' research area in medicine and it is believed that they will receive extensive application to biomedical systems in the next few years. At the moment, the research is mostly on modelling parts of the human body and recognising diseases from various scans (e.g. cardiograms, CAT scans, ultrasonic scans, etc.). • Neural networks are ideal in recognising diseases using scans since there is no need to provide a specific algorithm on how to identify the disease. Neural networks learn by example so the details of how to recognise the disease are not needed. What is needed is a set of examples that are representative of all the variations of the disease. The quantity of examples is not as important as the 'quantity'. The examples need to be selected very carefully if the system is to perform reliably and efficiently.
  • 7.
    Biologically Inspired • Electro-chemicalsignals • Threshold output firing Axon Terminal Branches of Axon Dendrites
  • 8.
    The Perceptron • Binaryclassifier functions • Threshold activation function Axon Terminal Branches of Axon Dendrites S x1 x2 w1 w2 wn xn x3 w3
  • 9.
    The Perceptron: ThresholdActivation Function • Binary classifier functions • Threshold activation function Step Threshold
  • 10.
    Linear Activation functions •Output is scaled sum of inputs n N n n x w u y     1 Linear
  • 11.
    Nonlinear Activation Functions •Sigmoid Neuron unit function u hid e u y    1 1 ) ( Sigmoid
  • 12.
    • The abilityto learn is a fundamental trait of intelligence. • Although a precise definition of learning is difficult to formulate, a learning process in the ANN context can be viewed as the problem of updating network architecture and connection weights so that a network can efficiently perform a specific task. • The network usually must learn the connection weights from available training patterns. • Performance is improved over time by iteratively updating the weights in the network. • ANNs' ability to automatically learn from examples makes them attractive and exciting. • Instead of following a set of rules specified by human experts, ANNs appear to learn underlying rules (like input-output relationships) from the given collection of representative examples. This is one of the major advantages of neural networks over traditional expert systems.
  • 17.
    Learning – whatit means exactly ? • Learning is essential to most of neural network architectures. • Choice of a learning algorithm is a central issue in network development. • What is really meant by saying that a processing element learns? Learning implies that a processing unit is capable of changing its input/output behavior as a result of changes in the environment. Since the activation rule is usually fixed when the network is constructed and since the input/output vector cannot be changed, to change the input/output behavior the weights corresponding to that input vector need to be adjusted. A method is thus needed by which, at least during a training stage, weights can be modified in response to the input/output process. • In a neural network, learning can be supervised, in which the network is provided with the correct answer for the output during training, or unsupervised, in which no external teacher is present.
  • 18.
    At learning process… •At each training step the network computes the direction in which each bias and link value can be changed to calculate a more correct output. • The rate of improvement at that solution state is also known. A learning rate is user-designated in order to determine how much the link weights and node biases can be modified based on the change direction and change rate. • The higher the learning rate (max. of 1.0) the faster the network is trained. • However, the network has a better chance of being trained to a local minimum solution. A local minimum is a point at which the network stabilizes on a solution which is not the most optimal global solution.
  • 19.
    learning rules There arefour basic types of learning rules: • error correction, • Boltzmann, • Hebbian, • and competitive learning.
  • 20.
    parameters for qualitythe prediction • Hidden layers: Both the number of hidden layers and the number of nodes in each hidden layer can influence the quality of the results. For example, too few layers and/or nodes may not be adequate to sufficiently learn and too many may result in overtraining the network. • Number of cycles: A cycle is where a training example is presented and the weights are adjusted. • The number of examples that get presented to the neural network during the learning process can be set. The number of cycles should be set to ensure that the neural network does not overtrain. The number of cycles is often referred to as the number of epochs. • Learning rate: Prior to building a neural network, the learning rate should be set and this influences how fast the neural network learns.
  • 21.
    Neural Network topologies •In the previous section we discussed the properties of the basic processing unit in an artificial neural network. This section focuses on the pattern of connections between the units and the propagation of data. As for this pattern of connections, the main distinction we can make is between: • Feed-forward neural networks, where the data flow from input to output units is strictly feedforward. The data processing can extend over multiple (layers of) units, but no feedback connections are present, that is, connections extending from outputs of units to inputs of units in the same layer or previous layers. • Recurrent neural networks that do contain feedback connections. Contrary to feed-forward networks, the dynamical properties of the network are important. In some cases, the activation values of the units undergo a relaxation process such that the neural network will evolve to a stable state in which these activations do not change anymore. In other applications, the change of the activation values of the output neurons are significant, such that the dynamical behaviour constitutes the output of the neural network (Pearlmutter, 1990). • Classical examples of feed-forward neural networks are the Perceptron and Adaline. Examples of recurrent networks have been presented by Anderson (Anderson, 1977), Kohonen (Kohonen, 1977), and Hopfield (Hopfield, 1982) .
• Volume: 1400 cm³ • Area: 2000 cm² • Weight: 1.5 kg
• The cerebral cortex covering the hemispheres contains about 10¹⁰ neurons
• Number of connections between cells: about 10¹⁵
• The cells send and receive signals; speed of operation: about 10¹⁸ operations/sec
• The neural network is a simplified model of the brain! It is fault-tolerant; flexible (easily adapts to a changing environment); it learns, so it does not have to be programmed; it can deal with fuzzy, random, noisy or inconsistent information; it is parallel to a high degree; it is small, with very low power consumption.
Neurons and Synapses
The basic computational unit in the nervous system is the nerve cell, or neuron. A neuron has: 1. Dendrites (inputs) 2. Cell body 3. Axon (output). A neuron receives input from other neurons (typically many thousands). Inputs sum (approximately). Once the input exceeds a critical level, the neuron discharges a spike - an electrical pulse that travels from the body, down the axon, to the next neuron(s) (or other receptors). This spiking event is also called depolarization, and is followed by a refractory period, during which the neuron is unable to fire. The axon endings (output zone) almost touch the dendrites or cell body of the next neuron. Transmission of an electrical signal from one neuron to the next is effected by neurotransmitters, chemicals which are released from the first neuron and which bind to receptors in the second. This link is called a synapse. The extent to which the signal from one neuron is passed on to the next depends on many factors, e.g. the amount of neurotransmitter available, the number and arrangement of receptors, the amount of neurotransmitter reabsorbed, etc.
A Simple Artificial Neuron
The basic computational element (model neuron) is often called a node or unit. It receives input from some other units, or perhaps from an external source. Each input has an associated weight w, which can be modified so as to model synaptic learning. The unit computes some function f of the weighted sum of its inputs. Its output, in turn, can serve as input to other units.
• The weighted sum is called the net input to unit i, often written netᵢ. Note that wᵢⱼ refers to the weight from unit j to unit i (not the other way around).
• The function f is the unit's activation function.
• In the simplest case, f is the identity function, and the unit's output is just its net input. This is called a linear unit.
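The unit described above can be sketched in a few lines; this is a minimal illustration, not tied to any particular library:

```python
def neuron_output(inputs, weights, f=lambda s: s):
    """Model neuron: compute the net input (the weighted sum of the
    inputs) and pass it through the activation function f. With the
    identity function as f, this is the linear unit described above."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return f(net)

# A linear unit simply outputs its net input: 0.5*1 + (-0.1)*(-1) + (-0.2)*2 = 0.2
y = neuron_output([1.0, -1.0, 2.0], [0.5, -0.1, -0.2])
```

Passing a different `f` (e.g. a sigmoid) turns the same unit into a nonlinear neuron.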
Features of an intelligent system
The ability to learn from examples and to generalize the acquired knowledge to solve problems posed in a new context:
• the ability to create rules (associations) binding together the separate elements of the system (object);
• the ability to recognize objects (image features) on the basis of incomplete information.
Data classification
Data classification is one of the main tasks performed using neural networks. What is it about? The purpose of classification is to assign an object to a certain category based on its characteristics.
Where do we use ANNs?
• NO: for calculations, multiplication tables, word processing, etc. - applications where a well-known algorithm can easily be used.
• YES: where an algorithmic procedure is very difficult to formulate, where data are incomplete or inaccurate, where the phenomena under study are non-linear, etc.; where there is a lot of data but the methods governing the results are not yet known.
Artificial neuron schema
The inputs are fed signals from the network's input layer or from the previous layer of neurons. Each signal is multiplied by a corresponding numerical value called a weight. The weight affects how the input signal is perceived and what part it plays in creating the neuron's output. A weight can be excitatory (positive) or inhibitory (negative); if there is no connection between two neurons, the weight is zero. The sum of the products of signals and weights is the argument of the neuron's activation function.
A simplified model of a neuron, showing its similarity to the natural model.
Formula that describes the neuron's working:
y = f(s), where s = Σᵢ₌₀ⁿ xᵢ·wᵢ
Approximation of a function: the principal aim is to approximate a given function (in other words: to learn the desired function by observing examples of its operation).
Number of layers: zero, one, or more than one.
Prediction
Input: X1, X2, X3; Output: Y; Model: Y = f(X1, X2, X3)
Activation function: f(x) = eˣ / (1 + eˣ)
Inputs: X1 = 1, X2 = -1, X3 = 2
Hidden node 1 (weights 0.5, -0.1, -0.2): net = 0.5·1 - 0.1·(-1) - 0.2·2 = 0.2; f(0.2) = e^0.2 / (1 + e^0.2) = 0.55
Hidden node 2 (weights 0.1, 0.6, 0.7): net = 0.1·1 + 0.6·(-1) + 0.7·2 = 0.9; f(0.9) = 0.71
Output node (weights 0.1, -0.2): net = 0.1·0.55 - 0.2·0.71 = -0.087; f(-0.087) = 0.478
Prediction: Y = 0.478. If the actual Y = 2, then the prediction error = (2 - 0.478) = 1.522.
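The walkthrough above can be reproduced in code. The assignment of weights to particular connections is inferred here so that every intermediate value on the slide is matched; treat it as a sketch of the forward pass rather than the slide's exact diagram:

```python
import math

def f(x):
    """Activation used on the slide: f(x) = e^x / (1 + e^x)."""
    return math.exp(x) / (1 + math.exp(x))

x = [1, -1, 2]                  # X1, X2, X3
w_hidden = [[0.5, -0.1, -0.2],  # weights into hidden node 1
            [0.1,  0.6,  0.7]]  # weights into hidden node 2 (inferred)
w_out = [0.1, -0.2]             # weights into the output node

# Forward pass: hidden activations, then the output activation.
h = [f(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
y = f(sum(w * hi for w, hi in zip(w_out, h)))
error = 2 - y                   # desired output Y = 2
```

Running this reproduces the slide's values: h ≈ [0.55, 0.71], Y ≈ 0.478, error ≈ 1.522.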
Backpropagation
One of the most popular techniques used in the learning process of an ANN.
Learning process
1. Randomly choose one of the observations.
2. Go through the appropriate procedure to determine the output value.
3. Compare the desired value with the value actually obtained from the network.
4. Adjust the weights by calculating the error.
How to calculate the prediction error?
Errorᵢ = Actualᵢ − Outputᵢ
where:
• Errorᵢ is the error from the i-th node,
• Outputᵢ is the value predicted by the network,
• Actualᵢ is the real value (which the network should learn).
Changing the weights
L is the so-called learning rate of the network (its values are usually taken from [0, 1]). The smaller this coefficient is, the slower the learning process. Often this ratio is set to its highest value initially, and then reduced as the network's weights are re-adjusted.
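A decaying learning rate of the kind described can be sketched as follows; the exponential schedule and its constants are illustrative assumptions, not from the slides:

```python
def learning_rate(epoch, initial=1.0, decay=0.9):
    """Hypothetical schedule: start at the highest value and reduce
    the rate each time the network's weights are re-adjusted."""
    return initial * decay ** epoch

# The rate shrinks monotonically: 1.0, 0.9, 0.81, 0.729, ...
rates = [learning_rate(e) for e in range(4)]
```

Large early steps let the network move quickly toward a good region of weight space; smaller later steps fine-tune without overshooting.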
Example „step by step”
• Input layer: A, B, C
• Hidden layer (1): D, E
• Output layer: F
For better understanding, the backpropagation learning algorithm can be divided into two phases: propagation and weight update.
Phase 1: Propagation, which involves the following steps:
• Forward propagation of a training pattern's input through the neural network in order to generate the propagation's output activations.
• Backward propagation of the propagation's output activations through the neural network, using the training pattern's target, in order to generate the deltas of all output and hidden neurons.
Phase 2: Weight update. For each weight-synapse:
• Multiply its output delta and input activation to get the gradient of the weight.
• Bring the weight in the opposite direction of the gradient by subtracting a ratio of it from the weight. This ratio influences the speed and quality of learning; it is called the learning rate. The sign of the gradient of a weight indicates where the error is increasing; this is why the weight must be updated in the opposite direction.
Repeat phases 1 and 2 until the performance of the network is good enough.
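Phase 2 for a single weight can be written directly from the description above; the numeric values in the usage line are illustrative only:

```python
def update_weight(weight, input_activation, output_delta, learning_rate):
    """The gradient of a weight is (output delta) x (input activation);
    the weight is moved in the opposite direction of the gradient by
    subtracting a fraction (the learning rate) of the gradient."""
    gradient = output_delta * input_activation
    return weight - learning_rate * gradient

# Illustrative values: a negative delta makes the weight increase.
w_new = update_weight(weight=0.5, input_activation=0.55,
                      output_delta=-1.522, learning_rate=0.1)
```

Repeating this update for every weight-synapse after each propagation pass is exactly the loop of phases 1 and 2.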
The size of an ANN?
• A big NN: a few thousand neurons, or even more.
• The number of neurons should depend on the task the network is meant to perform.
• The power of the network depends on the number of neurons, the density of the connections between them, and on properly chosen weight values.
How many hidden layers should there be?
• The number of hidden layers is usually not higher than 2. The fusion of the network's signals takes place in the hidden layers.
• The input layer is usually responsible only for the initial preparation of the input data.
• The output layer is responsible for aggregating the final signals of the hidden-layer neurons and for presenting the final result of the network at the outputs of its neurons, which are at the same time the outputs of the whole network.
Advantages of ANNs
1. They can work well with incomplete information.
2. They do not require knowledge of an algorithm solving the problem (automatic learning).
3. They process information in a highly parallel way.
4. They can generalize (extend to unknown cases).
5. They are resistant to partial damage.
6. They can act as associative memory (associative - like working memory in humans), as opposed to the addressable memory typical of classical computers.
Advantages:
• A neural network can perform tasks that a linear program cannot.
• When an element of the neural network fails, the network can continue without any problem thanks to its parallel nature.
• A neural network learns and does not need to be reprogrammed.
• It can be applied to a wide range of applications.
Disadvantages:
• The neural network needs training to operate.
• The architecture of a neural network is different from the architecture of microprocessors; therefore it needs to be emulated.
• Large neural networks require high processing time.
Advantages / disadvantages
Neural networks have a number of advantages:
• Linear and nonlinear models: complex linear and nonlinear relationships can be derived using neural networks.
• Flexible input/output: neural networks can operate using one or more descriptors and/or response variables. They can also be used with categorical and continuous data.
• Noise: neural networks are less sensitive to noise than statistical regression models.
The major drawbacks of neural networks are:
• Black box: it is not possible to explain how the results were calculated in any meaningful way.
• Optimizing parameters: there are many parameters to be set in a neural network, and optimizing the network can be challenging, especially to avoid overtraining.
How many hidden layers do we need?
• The number of hidden layers is usually at most 2. The user should decide how many hidden layers to use and how many neurons each of them will contain.
• The size of the input layer usually equals the number of input variables (the number of conditional attributes in the data set).
• The number of neurons in the output layer depends on the type of the problem (regression, or classification into categories).
• The more neurons in the hidden layer, the more memory the NN requires.
• Too many neurons can overtrain the network, making it fit the training set too closely while performing badly on new, unknown data.
• If you notice overtraining in your neural network, you should consider decreasing the number of neurons.
• Regression type of problem: inputs such as area, garage, age, heating, which floor, … ; as output there will be the estimated market price (e.g. 317 $).
• Classification into categories: inputs such as income, insurance, age, marital status, employment, … ; as output there will be a decision about granting the credit or not (e.g. Yes).
Categorical data is a problem… unless…
• continent: {Asia, Europe, America} - 3 neurons are necessary:
• Asia 1 0 0
• Europe 0 1 0
• America 0 0 1
• One variable, „continent”, creates 3 neurons!
• In such cases it is better to consider merging some values into a smaller number of categories.
• Usually the number of weights should be about 10 times smaller than the number of cases in the training data set.
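The one-of-N encoding shown above is easy to sketch; this minimal helper is an illustration, not part of any particular tool:

```python
def one_hot(value, categories):
    """Encode one categorical value as one input neuron per category."""
    return [1 if value == c else 0 for c in categories]

continents = ["Asia", "Europe", "America"]
encoded = one_hot("Europe", continents)  # one variable -> 3 neurons
```

Each extra category adds another input neuron (and thus more weights), which is why merging rare categories keeps the network small.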
• The STATISTICA line of software provides a comprehensive and integrated set of tools and solutions for:
• Data analysis and reporting, data mining and predictive modeling, business intelligence, simple and multivariate QC, process monitoring, analytic optimization, simulation, and for applying a large number of statistical and other analytic techniques to address routine and advanced data analysis needs.
• Data visualization, graphical data analysis, visual data mining, visual querying, and simple and advanced scientific and business graphing; in fact, STATISTICA has been acknowledged as the “king of data visualization software” (by the editors of "PC Graphics & Video").
Install Statistica 10 EN
• http://usnet.us.edu.pl/files/statsoft/STATISTICA_EN_10_0.zip
Neural networks in Statistica
• Classification analysis (creditRisk.sta)
• Regression analysis (cycling.sta)