1
Artificial Neurons, Neural
Networks and Architectures
Fall 2007
Instructor: Tai-Ye (Jason) Wang
Department of Industrial and Information Management
Institute of Information Management
2/48
Neuron Abstraction
 Neurons transduce signals: electrical to chemical, and from chemical back again to electrical.
 Each synapse is associated with what we call the synaptic efficacy: the efficiency with which a signal is transmitted from the presynaptic to the postsynaptic neuron.
3/48
Neuron Abstraction
4/48
Neuron Abstraction:
Activations and Weights
 the jth artificial neuron that receives input signals si from possibly n different sources
 an internal activation xj, which is a linear weighted aggregation of the impinging signals, modified by an internal threshold θj
5/48
Neuron Abstraction:
Activations and Weights
 for the jth artificial neuron, the connection weights wij model the synaptic efficacies of the various interneuron synapses.
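Written out, the aggregation described on these two slides is (using the additive-threshold convention that matches the later interpretation xj = qj + θj):

xj = w1j s1 + w2j s2 + · · · + wnj sn + θj = qj + θj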
6/48
Notation:
 wij denotes the weight from neuron i to neuron j.
7/48
Neuron Abstraction: Signal Function
 The activation of the neuron is subsequently transformed through a signal function S(·)
 This generates the output signal sj = S(xj) of the neuron.
8/48
Neuron Abstraction: Signal Function
 a signal function is typically one of the following:
 binary threshold
 linear threshold
 sigmoidal
 Gaussian
 probabilistic.
9/48
Activations Measure Similarities
 The activation xj is simply the inner product of the impinging signal vector S = (s0, . . . , sn)T with the neuronal weight vector Wj = (w0j, . . . , wnj)T.
(Figure: the neuron viewed as an adaptive filter)
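A minimal Python sketch of this inner-product activation. The zero-indexed vectors above suggest a bias convention in which s0 = 1 and w0j plays the role of the threshold θj; that convention is an assumption here:

```python
import numpy as np

def activation(s, w_j):
    """Activation as the inner product of the signal vector s with the neuron's fan-in weight vector w_j."""
    return float(np.dot(s, w_j))

# Example with three external inputs plus the bias component s0 = 1.
s = np.array([1.0, 0.5, -0.3, 0.9])     # (s0, s1, s2, s3)
w_j = np.array([0.2, 0.4, 0.1, -0.5])   # (w0j, w1j, w2j, w3j)
x_j = activation(s, w_j)                # scalar activation xj
```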
10/48
Neuron Signal Functions:
Binary Threshold Signal Function
 Net positive
activations translate
to a +1 signal value
 Net negative
activations translate
to a 0 signal value.
11/48
Neuron Signal Functions:
Binary Threshold Signal Function
 The threshold logic neuron is a two-state machine
 sj = S(xj) ∈ {0, 1}
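A minimal sketch of this two-state signal function; mapping an activation of exactly zero to +1 follows the threshold interpretation given on the later slides:

```python
def binary_threshold(x_j):
    """Binary threshold signal function: +1 for non-negative activation, 0 otherwise."""
    return 1 if x_j >= 0 else 0
```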
12/48
Neuron Signal Functions:
Binary Threshold Signal Function
13/48
Threshold Logic Neuron (TLN)
in Discrete Time
 The updated signal value S(xj(k + 1)) at time instant k + 1 is generated from the neuron activation xj(k + 1), sampled at time instant k + 1.
 The response of the threshold logic neuron as a
two-state machine can be extended to the
bipolar case where the signals are
 sj ∈ {−1, 1}
14/48
Threshold Logic Neuron (TLN)
in Discrete Time
 The resulting signal function is then none other than the signum function, sign(x), commonly encountered in communication theory.
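A sketch of one discrete-time update of the threshold logic neuron, covering both the binary and the bipolar (signum) case; the tie-breaking choice at zero activation is an assumption:

```python
def tln_step(s, w_j, theta_j=0.0, bipolar=True):
    """Sample the activation at step k + 1, then emit the updated signal value."""
    x_next = sum(si * wij for si, wij in zip(s, w_j)) + theta_j
    if bipolar:
        return 1 if x_next >= 0 else -1   # signum signal function
    return 1 if x_next >= 0 else 0        # binary threshold signal function
```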
15/48
Interpretation of Threshold
 From the point of view of the net activation xj
 the signal is +1 if xj = qj + θj ≥ 0, or qj ≥ −θj;
 and is 0 if qj < −θj.
16/48
Interpretation of Threshold
 The neuron thus “compares” the net external
input qj
if qj is greater than or equal to −θj, it fires +1; otherwise it fires 0.
17/48
Linear Threshold Signal Function
 αj = 1/xm is the slope parameter of the function
 Figure plotted for xm = 2 and αj = 0.5.
18/48
Linear Threshold Signal Function
 Sj(xj) = max(0, min(αj xj, 1))
 Note that in this
course we assume that
neurons within a
network are
homogeneous.
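A sketch of the linear threshold (ramp) signal function above, using the figure's values xm = 2 and αj = 0.5 as illustrative defaults:

```python
def linear_threshold(x_j, x_m=2.0):
    """Ramp signal function Sj(xj) = max(0, min(alpha_j * xj, 1)) with slope alpha_j = 1 / x_m."""
    alpha_j = 1.0 / x_m
    return max(0.0, min(alpha_j * x_j, 1.0))
```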
19/48
Sigmoidal Signal Function
 λj is a gain scale factor
 In the limit, as λj → ∞, the smooth logistic function approaches the non-smooth binary threshold function.
20/48
Sigmoidal Signal Function
 The sigmoidal
signal function has
some very useful
mathematical
properties. It is
 monotonic
 continuous
 bounded
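A sketch of the sigmoidal (logistic) signal function with gain λj; the exact form 1/(1 + exp(−λj xj)) is assumed, since the slides describe only its qualitative behaviour:

```python
import math

def sigmoid(x_j, lambda_j=1.0):
    """Logistic signal function; larger lambda_j makes it approach the binary threshold function."""
    return 1.0 / (1.0 + math.exp(-lambda_j * x_j))
```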
21/48
Gaussian Signal Function
 σj is the Gaussian spread
factor and cj is the center.
 Varying the spread
makes the function
sharper or more diffuse.
22/48
Gaussian Signal Function
 Changing the center shifts the function to the right or left along the activation axis
 This function is an example of a non-monotonic signal function.
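A sketch of the Gaussian signal function; the form exp(−(xj − cj)² / (2σj²)), with spread σj and center cj, is an assumed standard parameterization:

```python
import math

def gaussian_signal(x_j, c_j=0.0, sigma_j=1.0):
    """Gaussian signal function: sigma_j sets the spread, c_j shifts it along the activation axis."""
    return math.exp(-((x_j - c_j) ** 2) / (2.0 * sigma_j ** 2))
```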
23/48
Stochastic Neurons
 The signal is assumed to be two-state
 sj ∈ {0, 1} or {−1, 1}
 The neuron switches between these states depending upon a probabilistic function of its activation, P(xj).
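A sketch of a two-state stochastic neuron; a sigmoidal choice of P(xj) is an assumption, since the slide only requires the switching probability to depend on the activation:

```python
import math
import random

def stochastic_neuron(x_j, lambda_j=1.0):
    """Emit 1 with probability P(xj) = 1 / (1 + exp(-lambda_j * xj)), otherwise 0."""
    p = 1.0 / (1.0 + math.exp(-lambda_j * x_j))
    return 1 if random.random() < p else 0
```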
24/48
Summary of Signal Functions
25/48
Neural Networks Defined
 Artificial neural networks are massively parallel adaptive networks of simple nonlinear computing elements, called neurons, which are intended to abstract and model some of the functionality of the human nervous system in an attempt to partially capture some of its computational strengths.
26/48
Eight Components of Neural
Networks
 Neurons. These can be of three types:
 Input: receive external stimuli
 Hidden: compute intermediate functions
 Output: generate outputs from the network
 Activation state vector. This is a vector of
the activation level xi of individual neurons
in the neural network,
 X = (x1, . . . , xn)T ∈ Rn.
27/48
Eight Components of Neural
Networks
 Signal function. A function that generates the
output signal of the neuron based on its
activation.
 Pattern of connectivity. This essentially determines the inter-neuron connection architecture, or the graph of the network. Connections, which model the inter-neuron synaptic efficacies, can be
 excitatory (+)
 inhibitory (−)
 absent (0).
28/48
Eight Components of Neural
Networks
 Activity aggregation rule. A way of aggregating activity at a neuron; it is usually computed as an inner product of the input vector and the neuron's fan-in weight vector.
29/48
Eight Components of Neural
Networks
 Activation rule. A function that determines
the new activation level of a neuron on the
basis of its current activation and its
external inputs.
 Learning rule. Provides a means of
modifying connection strengths based both
on external stimuli and network
performance with an aim to improve the
latter.
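An illustrative learning-rule step (the delta rule is used here only as an example; the slides do not commit to a particular rule): connection strengths are nudged so that performance on the current stimulus improves.

```python
import numpy as np

def delta_rule_update(w_j, s, target, output, eta=0.1):
    """Nudge the fan-in weights of neuron j so that its output moves toward the target."""
    return w_j + eta * (target - output) * s

# Example: one update for a neuron with three inputs.
w_j = np.array([0.2, -0.1, 0.4])
s = np.array([1.0, 0.5, -0.3])
w_j = delta_rule_update(w_j, s, target=1.0, output=0.3)
```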
30/48
Eight Components of Neural
Networks
 Environment. The environments within
which neural networks can operate could be
 deterministic (noiseless) or
 stochastic (noisy).
31/48
Architectures:
Feedforward and Feedback
 Local groups of neurons can be connected in
either,
 a feedforward architecture, in which the network
has no loops, or
 a feedback (recurrent) architecture, in which loops
occur in the network because of feedback
connections.
32/48
Architectures:
Feedforward and Feedback
33/48
Neural Networks Generate Mappings
 Multilayered networks that associate vectors from
one space to vectors of another space are called
heteroassociators.
 Map or associate two different patterns with one another: one as input and the other as output. Mathematically, we write f : Rn → Rp.
34/48
Neural Networks Generate Mappings
 When neurons in a
single field connect
back onto themselves
the resulting network is
called an autoassociator
since it associates a
single pattern in Rn with
itself.
f : Rn → Rn
35/48
Activation and Signal State Spaces
 For a p-dimensional field of neurons, the
activation state space is Rp.
 The signal state space is the Cartesian cross space,
 Ip = [0, 1] × · · · × [0, 1] (p times) = [0, 1]p ⊂ Rp, if the neurons have continuous signal functions in the interval [0, 1]
 [−1, 1]p if the neurons have continuous signal functions in the interval [−1, 1].
36/48
Activation and Signal State Spaces
 For the case when the neuron signal
functions are binary threshold, the signal
state space is
 Bp = {0, 1} × · · · × {0, 1} (p times) = {0, 1}p ⊂ Ip ⊂ Rp
 {−1, 1}p when the neuron signal functions are bipolar threshold.
37/48
Feedforward vs Feedback:
Multilayer Perceptrons
 Organized into different layers
 Unidirectional connections
 memory-less: output depends only on the present
input
X ∈ Rn, S = f(X)
38/48
Feedforward vs Feedback:
Multilayer Perceptrons
 Possess no dynamics
 Demonstrate powerful properties
 Universal function approximation
 Find widespread applications in pattern
classification.
X ∈ Rn, S = f(X)
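A minimal sketch of the memory-less mapping S = f(X) computed by a two-layer feedforward network; the layer sizes, random weights, and the logistic signal function are illustrative assumptions:

```python
import numpy as np

def mlp_forward(X, weights, biases):
    """Feedforward pass: each layer aggregates its inputs and applies a logistic signal function."""
    s = X
    for W, b in zip(weights, biases):
        s = 1.0 / (1.0 + np.exp(-(W @ s + b)))
    return s

rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)), rng.normal(size=(2, 4))]   # 3 inputs -> 4 hidden -> 2 outputs
biases = [np.zeros(4), np.zeros(2)]
S = mlp_forward(np.array([0.5, -1.0, 0.2]), weights, biases)
```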
39/48
Feedforward vs Feedback:
Recurrent Neural Networks
 Non-linear dynamical
systems
 The new state of the network is a function of the current input and the present state of the network
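A minimal sketch of one recurrent update, in which the new state depends on both the current input and the present state; the tanh signal function and the weight shapes are assumptions:

```python
import numpy as np

def recurrent_step(x, u, W_rec, W_in):
    """New network state computed from the present state x and the current input u."""
    return np.tanh(W_rec @ x + W_in @ u)
```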
40/48
Feedforward vs Feedback:
Recurrent Neural Networks
 Possess a rich repertoire
of dynamics
 Capable of performing
powerful tasks such as
 pattern completion
 topological feature
mapping
 pattern recognition
41/48
More on Feedback Networks
 Network activations and signals are in a state of flux until they settle down to a steady value
 Issue of Stability: Given a feedback network architecture, we must ensure that the network dynamics lead to behavior that can be interpreted in a sensible way.
42/48
More on Feedback Networks
 Dynamical systems exhibit different kinds of behavior, such as
 fixed-point equilibria, where the system eventually converges to a fixed point
 chaotic dynamics, where the system wanders aimlessly in state space
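A sketch of running a feedback network until its state stops changing; whether the loop actually settles to a fixed point (rather than oscillating or wandering chaotically) depends on the network's dynamics, which is the stability issue above:

```python
import numpy as np

def run_to_equilibrium(x0, step, max_iters=1000, tol=1e-6):
    """Iterate the update rule `step` until a fixed point is (approximately) reached."""
    x = x0
    for _ in range(max_iters):
        x_new = step(x)
        if np.linalg.norm(x_new - x) < tol:
            return x_new, True    # settled to a fixed point
        x = x_new
    return x, False               # did not settle within the iteration budget
```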
43/48
Summary of Major Neural Network Models
44/48
Summary of Major Neural Network Models
45/48
Salient Properties of Neural
Networks
 Robustness Ability to operate, albeit with
some performance loss, in the event of
damage to internal structure.
 Associative Recall Ability to invoke related
memories from one concept.
 For example, a friend's name elicits vivid mental pictures and related emotions
46/48
Salient Properties of Neural
Networks
 Function Approximation and Generalization Ability to approximate functions using learning algorithms, by creating internal representations, and hence without requiring a mathematical model of how outputs depend on inputs. For this reason, neural networks are often referred to as adaptive function estimators.
47/48
Application Domains of Neural
Networks
 Associative Recall
 Fault Tolerance
48/48
Application Domains of Neural
Networks
 Function Approximation
 Control
 Prediction
