Introduction to Autoencoders
Prepared by: Dr. Amr Rashed
Agenda
• Introduction to Autoencoders
• What Are Autoencoders?
• How Do Autoencoders Achieve High-Quality
Reconstructions?
• Revisiting the Story
• Types of Autoencoder
• Vanilla Autoencoder
• Convolutional Autoencoder (CAE)
• Denoising Autoencoder
• Sparse Autoencoder
• Variational Autoencoder (VAE)
• Sequence-to-Sequence Autoencoder
Cont.
• What Are the Applications of
Autoencoders?
• Dimensionality Reduction
• Feature Learning
• Anomaly Detection
• Denoising Images
• Image Inpainting
• Generative Modeling
• Recommender Systems
• Sequence-to-Sequence Learning
• Image Segmentation
• How Are Autoencoders Different from
GANs?
• Architecture
• Training Process
• Objectives
Introduction to Autoencoders
• To develop an understanding of Autoencoders and the problem
they aim to solve, let’s consider a short story:
• A story set in the city of Fashionopolis.
• Clothing items organized in a wide, tall closet.
• Fashion consultant Alex proposes the strategy.
• Requesting specific items by informing Alex of their position.
• Fine-tuning the closet's arrangement for accuracy.
• Alex can reproduce clothing items impeccably.
• Possibility of creating brand-new clothing items in empty spots.
• Infinite possibilities for generating novel clothing.
What Are Autoencoders?
• An autoencoder is an artificial neural
network used for unsupervised learning
tasks (i.e., no class labels or labeled data)
such as dimensionality reduction, feature
extraction, and data compression. They
seek to:
• Accept an input set of data (i.e., the input)
• Internally compress the input data into a
latent space representation (i.e., a single
vector that compresses and quantifies the
input)
• Reconstruct the input data from this
latent representation (i.e., the output)
Cont.
The purpose of an autoencoder is to discover underlying correlations among data and to represent the data in a smaller dimension.
Autoencoders frame an unsupervised learning problem as a supervised learning problem in order to train a neural network model.
The input itself serves as the target output: the input is squeezed down to a lower-dimensional encoded representation by an encoder network, and a decoder network then decodes that encoding to recreate the input.
The encoding produced by the encoder is a lower-dimensional representation of the data and reveals several interesting, complex relationships among the data.
Cont.
An autoencoder has the following parts:
1. Encoder: the part of the network that takes in the input and produces
a lower-dimensional encoding.
2. Bottleneck: the lower-dimensional hidden layer where the encoding is
produced. The bottleneck layer has fewer nodes, and the number of
nodes in the bottleneck layer gives the dimension of the encoding of
the input.
3. Decoder: takes in the encoding and recreates the input.
Cont.
The bottleneck layer is the lower-dimensional layer. In the diagram, we have the encoder and decoder neural networks; phi and theta denote the parameters of the encoder and decoder, respectively.
The goal of the model is for the reconstructed output to be equivalent to the input.
To achieve this, we minimize a loss function called the reconstruction loss.
The reconstruction loss is simply the error between the input and the reconstructed output.
It is usually measured by the mean squared error or the binary cross-entropy between the input and the reconstructed output; binary cross-entropy is used when the data is binary.
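For reference, the two losses named above are commonly written as follows (a standard formulation; the exact normalization may differ from the slides' source), where x is the input, x̂ = D(E(x)) is the reconstruction, and n is the input dimension:

$$\mathcal{L}_{\mathrm{MSE}}(x, \hat{x}) = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \hat{x}_i\right)^2$$

$$\mathcal{L}_{\mathrm{BCE}}(x, \hat{x}) = -\frac{1}{n}\sum_{i=1}^{n}\left[x_i \log \hat{x}_i + (1 - x_i)\log\left(1 - \hat{x}_i\right)\right]$$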
Cont.
• An autoencoder consists of the following two primary
components:
• Encoder: The encoder compresses input data into a
lower-dimensional representation known as the latent
space or code. This latent space, often called embedding,
aims to retain as much information as possible, allowing
the decoder to reconstruct the data with high precision. If
we denote our input data as x and the encoder as E, then
the output latent space representation, s, would be s=E(x).
• Decoder: The decoder reconstructs the original input
data by accepting the latent space representation s. If we
denote the decoder function as D and the output of the
decoder as o, then we can represent the decoder as o =
D(s).
• Both encoder and decoder are typically composed of one
or more layers, which can be fully connected,
convolutional, or recurrent, depending on the input data’s
nature and the autoencoder’s architecture.
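To make the encoder/decoder split concrete, here is a minimal sketch of a fully connected autoencoder in TensorFlow/Keras. This is an illustrative assumption on my part, not code from the slides; the 784-dimensional input (a flattened 28×28 image) and the 32-dimensional latent space are arbitrary example sizes.

```python
# Minimal fully connected autoencoder sketch (TensorFlow/Keras).
# The 784-dim input and 32-dim latent space are illustrative assumptions.
from tensorflow.keras import layers, Model

input_dim, latent_dim = 784, 32

# Encoder E: compresses x into the latent representation s = E(x)
encoder_input = layers.Input(shape=(input_dim,))
h = layers.Dense(128, activation="relu")(encoder_input)
s = layers.Dense(latent_dim, activation="relu")(h)          # bottleneck
encoder = Model(encoder_input, s, name="encoder")

# Decoder D: reconstructs the input from s, i.e. o = D(s)
decoder_input = layers.Input(shape=(latent_dim,))
h = layers.Dense(128, activation="relu")(decoder_input)
o = layers.Dense(input_dim, activation="sigmoid")(h)
decoder = Model(decoder_input, o, name="decoder")

# Full autoencoder: o = D(E(x)), trained so the output matches the input
x = layers.Input(shape=(input_dim,))
autoencoder = Model(x, decoder(encoder(x)), name="autoencoder")
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")
```

Training then uses the input as its own target, for example autoencoder.fit(x_train, x_train, epochs=20, batch_size=128), where x_train is assumed to hold values scaled to [0, 1].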
Revisiting
the Story
• With the story analogy fresh in our
minds, let’s connect it to the technical
concept of Autoencoders. In the story,
you act as the encoder, organizing
each clothing item into a specific
location within the wardrobe and
assigning an encoding.
• Meanwhile, your friend Alex takes on
the role of the decoder, selecting a
location in the wardrobe and
attempting to recreate the clothing
item (a process referred to, in
technical terms, as decoding).
Autoencoder vs PCA
We have to keep in mind that the reason for using an autoencoder is that we want to understand and represent only the deep correlations and relationships among the data.
We need a generalized lower-dimensional representation.
That is why, if the features of the data are not correlated at all, it is hard for an autoencoder to represent the data in a lower dimension.
If we design the network with a very large number of nodes in the bottleneck layer, it will produce a high-dimensional encoding. The danger here is that the network may cheat and overfit by simply memorizing the input data, in which case the encoding will not capture the correct relationships.
Conversely, if we use a bottleneck with too few nodes, it becomes very hard to capture all the relationships. So we must be careful when designing the network.
Cont.
• PCA or principal component analysis tries to find
lower-dimensional orthogonal hyperplanes that
describe the original data by capturing the
maximum possible variance in the data and the
important correlations consequently.
• We need to keep in mind that we are talking
about finding a hyperplane, so the mapping is linear. But
correlations are often non-linear, and those are not
captured by PCA.
• As the diagram shows, autoencoders can capture
non-linear dependencies in the data and are thus
a better choice than PCA for dimensionality reduction.
• Note: In fact, if we were to construct a linear network (i.e.,
without nonlinear activation functions at each layer),
we would observe a dimensionality reduction similar to
that observed with PCA; see Geoffrey Hinton's discussion of this.
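The note above can be checked empirically. Below is a small, hedged sketch (not from the slides) that compares scikit-learn's PCA with a purely linear autoencoder on synthetic correlated data; the data shape, latent size, and training settings are illustrative assumptions, but with no nonlinear activations the two reach similar reconstruction errors.

```python
# Sketch: a linear autoencoder recovers a PCA-like projection.
# Data dimensions and training settings are illustrative assumptions.
import numpy as np
from sklearn.decomposition import PCA
from tensorflow.keras import layers, Model

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20)).astype("float32")
X = X @ rng.normal(size=(20, 20)).astype("float32")      # introduce correlations

k = 5  # target dimensionality

# PCA baseline: reconstruction from the top-k principal components
pca = PCA(n_components=k).fit(X)
X_pca = pca.inverse_transform(pca.transform(X))
pca_err = np.mean((X - X_pca) ** 2)

# Linear autoencoder: no nonlinear activations anywhere
inp = layers.Input(shape=(20,))
code = layers.Dense(k, activation=None)(inp)              # bottleneck
out = layers.Dense(20, activation=None)(code)
linear_ae = Model(inp, out)
linear_ae.compile(optimizer="adam", loss="mse")
linear_ae.fit(X, X, epochs=200, batch_size=64, verbose=0)
ae_err = np.mean((X - linear_ae.predict(X, verbose=0)) ** 2)

print(f"PCA reconstruction MSE:       {pca_err:.4f}")
print(f"Linear AE reconstruction MSE: {ae_err:.4f}")      # comparable to PCA
```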
Types of
Autoencoder
• There are several types of
autoencoders, each with its unique
properties and use cases.
• 1-Vanilla Autoencoder
• Figure 4 shows the simplest form of
an autoencoder, consisting of one or
more fully connected layers for both
the encoder and decoder. It works
well for simple data but may struggle
with complex patterns.
2-Convolutional Autoencoder (CAE)
• Utilizes convolutional layers in both the
encoder and decoder, making it
suitable for handling image data. By
exploiting the spatial information in
images, CAEs can capture complex
patterns and structures more
effectively than vanilla autoencoders
and accomplish tasks such as image
segmentation, as shown in Figure 5.
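As an illustration (my own sketch, not code from the slides), a convolutional autoencoder in Keras might look like the following; the 28×28×1 input shape and filter counts are assumed example values.

```python
# Convolutional autoencoder sketch (TensorFlow/Keras).
# The 28x28x1 input shape and filter sizes are illustrative assumptions.
from tensorflow.keras import layers, Model

inp = layers.Input(shape=(28, 28, 1))

# Encoder: convolutions + downsampling produce a compact spatial code
x = layers.Conv2D(16, 3, activation="relu", padding="same")(inp)
x = layers.MaxPooling2D(2)(x)                              # 14x14
x = layers.Conv2D(8, 3, activation="relu", padding="same")(x)
encoded = layers.MaxPooling2D(2)(x)                        # 7x7 bottleneck

# Decoder: transposed convolutions upsample back to the input size
x = layers.Conv2DTranspose(8, 3, strides=2, activation="relu", padding="same")(encoded)
x = layers.Conv2DTranspose(16, 3, strides=2, activation="relu", padding="same")(x)
decoded = layers.Conv2D(1, 3, activation="sigmoid", padding="same")(x)

conv_ae = Model(inp, decoded, name="conv_autoencoder")
conv_ae.compile(optimizer="adam", loss="binary_crossentropy")
```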
3-Denoising Autoencoder
• This autoencoder is designed to
remove noise from corrupted input
data, as shown in Figure 6. During
training, the input data is intentionally
corrupted by adding noise, while the
target remains the original,
uncorrupted data.
• The autoencoder learns to reconstruct
the clean data from the noisy input,
making it useful for image denoising
and data preprocessing tasks.
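A hedged sketch of this training setup, assuming the `autoencoder` model and an `x_train` array (values in [0, 1]) from the earlier sketches; the Gaussian noise level is an arbitrary example choice.

```python
# Denoising autoencoder training sketch: corrupt the inputs with Gaussian
# noise but keep the clean images as the training target.
# Assumes `autoencoder` and `x_train` are defined as in the earlier sketches.
import numpy as np

noise_factor = 0.3                                         # illustrative choice
x_train_noisy = x_train + noise_factor * np.random.normal(size=x_train.shape)
x_train_noisy = np.clip(x_train_noisy, 0.0, 1.0)

# Noisy input -> clean target: the network learns to remove the noise
autoencoder.fit(x_train_noisy, x_train, epochs=20, batch_size=128)
```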
4-Sparse Autoencoder
• This type of autoencoder enforces
sparsity in the latent space
representation by adding a sparsity
constraint to the loss function (as
shown in Figure 7). This constraint
encourages the autoencoder to
represent the input data using a small
number of active neurons in the latent
space, resulting in more efficient and
robust feature extraction.
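One common way to impose such a constraint (an illustrative sketch, not necessarily the slides' exact method) is an L1 activity penalty on the bottleneck layer in Keras; the layer sizes and penalty weight below are assumed values.

```python
# Sparse autoencoder sketch: an L1 activity penalty on the bottleneck
# encourages only a few latent units to be active for each input.
from tensorflow.keras import layers, regularizers, Model

input_dim, latent_dim = 784, 64                            # illustrative sizes

inp = layers.Input(shape=(input_dim,))
code = layers.Dense(latent_dim, activation="relu",
                    activity_regularizer=regularizers.l1(1e-5))(inp)
out = layers.Dense(input_dim, activation="sigmoid")(code)

sparse_ae = Model(inp, out, name="sparse_autoencoder")
# Total loss = reconstruction error + sparsity penalty (added automatically)
sparse_ae.compile(optimizer="adam", loss="binary_crossentropy")
```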
5-Variational Autoencoder (VAE)
• Figure 8 shows a generative model that
introduces a probabilistic layer in the
latent space, allowing for sampling and
generation of new data.
• VAEs can generate new samples from
the learned latent distribution, making
them ideal for image generation and
style transfer tasks.
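The probabilistic layer mentioned above is usually implemented with the reparameterization trick. The following is a minimal, hedged Keras sketch (layer sizes are assumed example values); a complete VAE would also add a KL-divergence term to the reconstruction loss.

```python
# VAE sketch: the encoder outputs a mean and log-variance, and the
# reparameterization trick samples a latent vector z from that distribution.
import tensorflow as tf
from tensorflow.keras import layers

class Sampling(layers.Layer):
    """Draw z ~ N(z_mean, exp(z_log_var)) in a differentiable way."""
    def call(self, inputs):
        z_mean, z_log_var = inputs
        epsilon = tf.random.normal(shape=tf.shape(z_mean))
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon

# Encoder side (illustrative sizes):
latent_dim = 2
x = layers.Input(shape=(784,))
h = layers.Dense(256, activation="relu")(x)
z_mean = layers.Dense(latent_dim)(h)
z_log_var = layers.Dense(latent_dim)(h)
z = Sampling()([z_mean, z_log_var])
# A full VAE adds a KL-divergence term between N(z_mean, exp(z_log_var))
# and N(0, I) to the usual reconstruction loss.
```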
6-Sequence-to-Sequence
Autoencoder
• Also known as a Recurrent
Autoencoder, this type of autoencoder
utilizes recurrent neural network (RNN)
layers (e.g., long short-term memory
(LSTM) or gated recurrent unit (GRU)) in
both the encoder and decoder shown
in Figure 9.
• This architecture is well-suited for
handling sequential data (e.g., time
series or natural language processing
tasks).
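A minimal sketch of such a recurrent autoencoder in Keras (an illustrative assumption, not code from the slides); the sequence length and feature count are example values.

```python
# Sequence-to-sequence (recurrent) autoencoder sketch.
# timesteps, n_features, and latent_dim are illustrative assumptions.
from tensorflow.keras import layers, Model

timesteps, n_features, latent_dim = 30, 1, 16

inp = layers.Input(shape=(timesteps, n_features))
# Encoder LSTM summarizes the whole sequence into one latent vector
encoded = layers.LSTM(latent_dim)(inp)
# Repeat the latent vector for each timestep, then decode a sequence back
x = layers.RepeatVector(timesteps)(encoded)
x = layers.LSTM(latent_dim, return_sequences=True)(x)
decoded = layers.TimeDistributed(layers.Dense(n_features))(x)

seq_ae = Model(inp, decoded, name="sequence_autoencoder")
seq_ae.compile(optimizer="adam", loss="mse")
```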
Implementation of an Autoencoder
• Autoencoders are powerful neural networks with diverse
applications in unsupervised learning, including dimensionality
reduction, feature extraction, and data compression.
• To implement an autoencoder:
• 1- You would typically define two separate modules for the encoder
and decoder, and then combine them in a higher-level module.
• 2- You then train the autoencoder using backpropagation and
gradient descent, minimizing the reconstruction error.
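A hedged sketch of such a training loop in TensorFlow, assuming the `encoder` and `decoder` models from the earlier sketch and a batched `dataset` (e.g., a tf.data.Dataset of input arrays) defined elsewhere.

```python
# Training-step sketch with explicit backpropagation / gradient descent.
# Assumes `encoder`, `decoder`, and a batched `dataset` are defined elsewhere.
import tensorflow as tf

optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)
mse = tf.keras.losses.MeanSquaredError()

@tf.function
def train_step(x):
    with tf.GradientTape() as tape:
        reconstruction = decoder(encoder(x))               # o = D(E(x))
        loss = mse(x, reconstruction)                      # reconstruction error
    variables = encoder.trainable_variables + decoder.trainable_variables
    gradients = tape.gradient(loss, variables)             # backpropagation
    optimizer.apply_gradients(zip(gradients, variables))   # gradient descent step
    return loss

for epoch in range(10):
    for batch in dataset:
        loss = train_step(batch)
```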
What Are the Applications of
Autoencoders?
• 1-Dimensionality Reduction
• Autoencoders can reduce the dimensionality of input data by learning a compact
and efficient representation in the latent space. This can be helpful for visualization,
data compression, and speeding up other machine learning algorithms.
• 2-Feature Learning
• Autoencoders can learn meaningful features from input data, which can be used for
downstream machine-learning tasks like classification, clustering, or regression.
• 3-Anomaly Detection
• By training an autoencoder on normal data instances, it can learn to reconstruct
those instances with low error. When presented with an anomalous data point, the
autoencoder will likely have a higher reconstruction error, which can be used to
identify outliers or anomalies.
Anomaly Detection
• Autoencoders are widely used for anomaly detection: as we
know, autoencoders create encodings that capture the
relationships among the data.
• If we train an autoencoder on a particular dataset, the
encoder and decoder parameters are tuned to represent the
relationships in that dataset as well as possible.
• Thus, the autoencoder can reconstruct data drawn from that
kind of dataset very accurately.
• So, if data from that particular dataset is passed through the
autoencoder, the reconstruction error is low.
• But if some other kind of data is passed through, the autoencoder
will produce a large reconstruction error. By applying a suitable
cutoff on this error, we can build an anomaly detector.
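A minimal sketch of that cutoff idea (my own illustration, not the slides' code), assuming the trained `autoencoder` and a 2-D `x_train` array of normal samples from the earlier sketches, plus a hypothetical `x_new` array of data to score.

```python
# Anomaly-detection sketch: score samples by reconstruction error and
# flag those above a cutoff. `autoencoder`, `x_train`, and `x_new` are
# assumed to come from the earlier sketches (flattened 2-D arrays).
import numpy as np

def reconstruction_error(model, x):
    """Per-sample mean squared error between inputs and reconstructions."""
    recon = model.predict(x, verbose=0)
    return np.mean((x - recon) ** 2, axis=1)

train_errors = reconstruction_error(autoencoder, x_train)
# Illustrative cutoff: the 99th percentile of errors on normal training data
threshold = np.percentile(train_errors, 99)

new_errors = reconstruction_error(autoencoder, x_new)
is_anomaly = new_errors > threshold        # True where reconstruction is poor
```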
Cont.
• 4-Denoising Images
• Autoencoders can be trained to reconstruct
clean input data from noisy versions. The
denoising autoencoder learns to remove the
noise and produce a clean version of the input
data.
• 5-Image Inpainting
• As shown in Figure 10, autoencoders can fill in
missing or corrupted parts of an image by
learning the underlying structure and patterns
in the data.
Denoising Images
• Autoencoders are used for noise removal: if we pass the
noisy data as input and the clean data as the target and train an
autoencoder on such data pairs, the trained autoencoder
can be highly useful for noise removal.
• This is because noise points usually do not have any
correlations. Since the autoencoder has to represent the
data in a low-dimensional space, the encoding retains only
the important relationships that exist and rejects the random ones.
• So, the decoded output of the autoencoder
is free of those extra, spurious relations and hence the noise.
Cont.
• 6-Generative Modeling
• Variational autoencoders (VAEs) and other generative variants can generate new, realistic data
samples by learning the data distribution during training.
• This can be useful for data augmentation or creative applications.
• Before GANs came into existence, autoencoders were used as generative models. One modified
form of the autoencoder, the variational autoencoder, is used for generative purposes (a small
sampling sketch follows this list).
• 7-Recommender Systems
• Autoencoders can be used to learn latent representations of users and items in a recommender
system, which can then predict user preferences and make personalized recommendations.
• 8-Sequence-to-Sequence Learning
• Autoencoders can be used for sequence-to-sequence tasks, such as machine translation or text
summarization, by adapting their architecture to handle sequential input and output data.
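As a small illustration of the generative use mentioned above (an assumption on my part, building on the VAE sketch earlier), new samples can be produced by decoding latent vectors drawn from the prior; `decoder` here is a hypothetical trained VAE decoder mapping latent vectors of size `latent_dim` back to data space.

```python
# Generation sketch: decode latent vectors sampled from the prior N(0, I).
# Assumes a trained VAE `decoder` and its `latent_dim` (hypothetical names).
import numpy as np

z = np.random.normal(size=(16, latent_dim)).astype("float32")
generated = decoder.predict(z, verbose=0)    # 16 brand-new samples
```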
Cont.
• 9-Image Segmentation
• Autoencoders are commonly utilized
in semantic segmentation as well.
• A notable example is SegNet (Figure
11), a model designed for pixel-wise,
multi-class segmentation on urban
road scene datasets.
• This model was created by researchers
from the University of Cambridge’s
Computer Vision Group.
How Are Autoencoders Different from GANs?
For those familiar with Generative Adversarial
Networks (GANs), you might wonder how
Autoencoders, which also learn to generate images,
differ from GANs.
While both Autoencoders and GANs are generative
models, they differ in:
1- Architecture
2- Training Process
3- Objectives
Architecture
Autoencoders consist of an encoder
and a decoder. The encoder
compresses the input data into a
lower-dimensional latent
representation, while the decoder
reconstructs the input data from the
latent representation.
GANs consist of a generator and a
discriminator. The generator creates
fake samples from random noise, and
the discriminator tries to differentiate
between real samples from the
dataset and fake samples produced
by the generator.
Training
Process
• Autoencoders are trained using a
reconstruction error, which measures the
difference between the input and
reconstructed data. The training process
aims to minimize this error.
• GANs use a two-player minimax game
during training. The generator tries to
produce samples that the discriminator
cannot distinguish from real samples,
while the discriminator tries to improve
its ability to differentiate between real
and fake samples. The training process
involves finding an equilibrium between
the two competing networks.
Objectives
• Autoencoders are primarily used for dimensionality
reduction, data compression, denoising, and representation
learning.
• GANs are primarily used for generating new, realistic
samples, often in the domain of images, but also in other
data types such as text, audio, and more.
Strengths and Weaknesses
While both autoencoders and GANs can generate new samples, they
have different strengths and weaknesses.
Autoencoders generally produce smoother and more continuous
outputs, while GANs can generate more realistic and diverse examples
but may suffer from issues like mode collapse.
The choice between the two models depends on the specific problem
and the desired outcomes.
Examples
References
• https://machinelearningmastery.com/autoencoder-for-classification/
• https://www.datacamp.com/tutorial/introduction-to-autoencoders
• https://towardsdatascience.com/understanding-autoencoders-with-an-example-a-step-by-step-tutorial-693c3a4e9836
• https://www.edureka.co/blog/autoencoders-tutorial/
• https://pyimagesearch.com/2020/02/17/autoencoders-with-keras-tensorflow-and-deep-learning/
• https://www.geeksforgeeks.org/auto-encoders/
• https://www.analyticsvidhya.com/blog/2021/07/image-denoising-using-autoencoders-a-beginners-guide-to-deep-learning-project/
• https://compneuro.neuromatch.io/tutorials/Bonus_Autoencoders/student/Bonus_Tutorial3.html
• https://jaan.io/what-is-variational-autoencoder-vae-tutorial/
• https://lazyprogrammer.me/a-tutorial-on-autoencoders/
• https://medium.com/data-science-365/an-introduction-to-autoencoders-in-deep-learning-ab5a5861f81e
• https://www.simplilearn.com/tutorials/deep-learning-tutorial/what-are-autoencoders-in-deep-learning
• https://www.slideshare.net/harishr2301/autoencoder-141008858
• https://arxiv.org/abs/2201.03898
• https://www.analyticsvidhya.com/blog/2021/10/an-introduction-to-autoencoders-for-beginners/
• https://www.jeremyjordan.me/autoencoders/
• https://pyimagesearch.com/2023/07/10/introduction-to-autoencoders/
• https://towardsdatascience.com/introduction-to-autoencoders-7a47cf4ef14b


Editor's Notes

  • #5 https://pyimagesearch.com/2023/07/10/introduction-to-autoencoders/
  • #6 https://pyimagesearch.com/2023/07/10/introduction-to-autoencoders/
  • #7 https://towardsdatascience.com/introduction-to-autoencoders-7a47cf4ef14b
  • #8 https://towardsdatascience.com/introduction-to-autoencoders-7a47cf4ef14b
  • #9 https://towardsdatascience.com/introduction-to-autoencoders-7a47cf4ef14b
  • #10 https://pyimagesearch.com/2023/07/10/introduction-to-autoencoders/
  • #11 https://pyimagesearch.com/2023/07/10/introduction-to-autoencoders/
  • #12 https://pyimagesearch.com/2023/07/10/introduction-to-autoencoders/
  • #13 https://pyimagesearch.com/2023/07/10/introduction-to-autoencoders/
  • #14 https://towardsdatascience.com/introduction-to-autoencoders-7a47cf4ef14b
  • #15 https://towardsdatascience.com/introduction-to-autoencoders-7a47cf4ef14b
  • #16 https://www.jeremyjordan.me/autoencoders/
  • #17 https://pyimagesearch.com/2023/07/10/introduction-to-autoencoders/
  • #18 https://pyimagesearch.com/2023/07/10/introduction-to-autoencoders/
  • #19 https://pyimagesearch.com/2023/07/10/introduction-to-autoencoders/
  • #20 https://pyimagesearch.com/2023/07/10/introduction-to-autoencoders/
  • #21 https://pyimagesearch.com/2023/07/10/introduction-to-autoencoders/
  • #22 https://pyimagesearch.com/2023/07/10/introduction-to-autoencoders/
  • #23 https://pyimagesearch.com/2023/07/10/introduction-to-autoencoders/
  • #24 https://pyimagesearch.com/2023/07/10/introduction-to-autoencoders/
  • #26 https://pyimagesearch.com/2023/07/10/introduction-to-autoencoders/
  • #28 https://pyimagesearch.com/2023/07/10/introduction-to-autoencoders/
  • #29 https://pyimagesearch.com/2023/07/10/introduction-to-autoencoders/
  • #32 https://pyimagesearch.com/2023/07/10/introduction-to-autoencoders/