Deep Generative Modelling (updated)

Agenda
❏ Generative and Discriminative models
❏ Autoencoders
❏ Variational Autoencoders
❏ Generative Adversarial Networks
❏ Conditional Generative Models

Generative Models
❏ Generate new random observable data by modelling the joint
distribution of all variables
❏ Given some dataset D, generate new samples that look like D
but are not copies of it
❏ Training means adjusting the model’s hidden parameters to fit D
❏ Considered a branch of unsupervised learning, but they can
also be used for tasks like classification

“What I cannot create, I do
not understand.”
—Richard Feynman

Motivation
❏ Tremendous amount of information out there in
the world
❏ Machines are good at solving specific tasks
❏ They match or beat humans at object recognition, speech
recognition, tumour segmentation, Go
❏ But they cannot build compact representations of the world :(

Intelligence Gap
Help ML models to learn very compact
and disentangled representations.

Disentangled factors
❏ P(X | Z), where X is an image and Z is a vector of factors that
cause (explain) X
❏ We would like the dimensions of Z to describe
real-world factors
❏ A Z with separate dimensions for lighting, guitar,
bookshelf and rotation is considered more
disentangled than the raw pixels of X
❏ P(guitar | Z) is easy to compute from a
disentangled representation

Applications
❏ Short term applications
❏ Image translation, denoising, super-resolution
❏ Domain Adaptation, Synthetic data generation
❏ Music, Audio and Text Generation
❏ Long term applications
❏ Understanding of the real world
❏ Artificial General Intelligence

Discriminative models
❏ Example: ImageNet. Here y is a vector over the 1000
labels and x is an image from the dataset.
❏ They are trained to maximize log P(y | x)
❏ The prediction is the label yi with the highest P(yi | x) (argmax over i)
❏ Classification models are mostly discriminative
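
A minimal sketch of the argmax prediction and the log P(y | x) objective described above, assuming the network outputs one raw score (logit) per label and that a softmax turns those scores into P(yi | x). The three scores are made up for illustration; ImageNet would have 1000.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities P(y_i | x) that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(logits):
    """Return the index of the most probable label and all probabilities."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best, probs

def nll(logits, label):
    """Training loss: minimizing -log P(y | x) maximizes log P(y | x)."""
    return -math.log(softmax(logits)[label])

# Hypothetical scores for a 3-class problem.
logits = [1.0, 3.0, 0.5]
label, probs = predict(logits)
```

The loss is lowest when the true label is the one the model already scores highest, which is why maximizing log P(y | x) and predicting by argmax fit together.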

Generative Models
❏ During training, maximize the log-probability log P(X)
❏ Generate new sample images close to the real
distribution P(X*)
❏ During inference, depending on the model, you might be
able to estimate the probability of an image X under
the model
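
The three bullets above (fit by maximizing log P(X), sample, score) can be sketched with the simplest possible generative model, a single 1-D Gaussian. This is an illustrative stand-in, not a deep model; the data and all names are assumptions.

```python
import math
import random

def fit_gaussian(data):
    """Maximum-likelihood estimate of the mean and standard deviation."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n
    return mu, math.sqrt(var)

def log_prob(x, mu, sigma):
    """log P(x) under the fitted Gaussian."""
    return (-0.5 * math.log(2 * math.pi * sigma ** 2)
            - (x - mu) ** 2 / (2 * sigma ** 2))

def sample(mu, sigma, n, rng):
    """Draw n new observations from the fitted model."""
    return [rng.gauss(mu, sigma) for _ in range(n)]

rng = random.Random(0)
data = [rng.gauss(5.0, 2.0) for _ in range(2000)]  # stand-in for the dataset
mu, sigma = fit_gaussian(data)          # training: maximize log P(X)
new_samples = sample(mu, sigma, 5, rng)  # generation
typical = log_prob(mu, mu, sigma)        # inference: score an observation
unlikely = log_prob(mu + 3 * sigma, mu, sigma)
```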

Properties and Drawbacks of Discriminative Models
❏ Good at capturing statistical regularities of the data
❏ Find features invariant to characteristics you don’t care about for the
task
❏ Object classification: rotation, translation, lighting, color
❏ Segmentation: here you do care about rotation and translation
❏ Have difficulty building disentangled representations
❏ Adversarial examples are a good example of that

Generation from Discriminative Model (Example)
[Figure: handwriting model and the text “This is regarding my friend, Kate Zack”]
❏ Gradient ascent on the input image X

Generation from Discriminative Model (Example)
[Figure: handwriting model, target characters P E T K O, input image X, objective: maximize]
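
A rough sketch of gradient ascent on the input, assuming a fixed logistic model in place of the handwriting network. The weights `w`, bias `b`, learning rate and step count are all hypothetical; the point is only that the model stays frozen while X is updated to raise the target probability.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def class_prob(x, w, b):
    """P(target | x) for a fixed logistic model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def ascend_input(x, w, b, lr=0.5, steps=50):
    """Update the input x (not the weights) to increase log P(target | x)."""
    x = list(x)
    for _ in range(steps):
        p = class_prob(x, w, b)
        # For a logistic model, d/dx log p = (1 - p) * w
        x = [xi + lr * (1.0 - p) * wi for xi, wi in zip(x, w)]
    return x

w, b = [2.0, -1.0, 0.5], -3.0   # frozen, hypothetical model parameters
x0 = [0.0, 0.0, 0.0]            # start from a blank "image"
x_opt = ascend_input(x0, w, b)
```

With a deep network the gradient would come from backpropagation rather than a closed form, but the loop has the same shape.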

Generative Models
❏ Gaussian mixture model
❏ Hidden Markov model
❏ Naive Bayes
❏ Latent Dirichlet allocation
❏ … many others

Deep Generative Models
❏ Restricted Boltzmann Machines
❏ Variational Autoencoders
❏ PixelRNN, PixelCNN
❏ Generative Adversarial Networks
❏ Neural Language Models
❏ WaveNet

Deep Generative Model
Generator
Latent variables (code)

Autoencoder
[Figure: autoencoder network]
❏ Loss = pixelwise L2 or softmax cross-entropy

Autoencoders
❏ Latent variables
❏ Lower dimensional than the input
[Figure: Encoder and Decoder]
Loss =
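
Assuming the pixelwise L2 loss named on the previous slide, an encoder-decoder pass and its reconstruction loss might look like the sketch below. The linear weights `W_enc` and `W_dec` are made up; a real autoencoder would learn them and use nonlinear layers.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def l2_loss(x, x_hat):
    """Mean squared error over pixels."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

# Hypothetical weights: a 4-pixel input is squeezed into a 2-dimensional code.
W_enc = [[0.5, 0.5, 0.0, 0.0],
         [0.0, 0.0, 0.5, 0.5]]
W_dec = [[1.0, 0.0],
         [1.0, 0.0],
         [0.0, 1.0],
         [0.0, 1.0]]

def autoencode(x):
    code = matvec(W_enc, x)      # encoder: input -> latent code
    x_hat = matvec(W_dec, code)  # decoder: latent code -> reconstruction
    return code, x_hat

x = [0.2, 0.4, 0.9, 0.7]
code, x_hat = autoencode(x)
loss = l2_loss(x, x_hat)
```

Because the code has fewer dimensions than the input, the reconstruction is lossy and the loss is nonzero unless the input happens to lie in the subspace the decoder can reach.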

Autoencoders
❏ A random latent code won’t get us anywhere
❏ Pass an image to the encoder to get a “valid” code
[Figure: Encoder and Decoder]

Variational Autoencoders
❏ Encoder-decoder architecture
❏ Force the latent code to be Gaussian distributed
❏ To generate, sample a latent code from the Gaussian and pass
it to the decoder network

[Figure: Encoder, mean, sampled code, Decoder]
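
A sketch of how the sampled code is commonly drawn from the encoder's mean, assuming the encoder also outputs a log-variance per dimension (the figure only names the mean, so `log_var` is an assumption). The KL term is the usual penalty that pushes the code distribution towards a standard Gaussian.

```python
import math
import random

def sample_code(mean, log_var, rng):
    """Reparameterized sample: z = mean + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]

def kl_to_standard_normal(mean, log_var):
    """KL( N(mean, sigma^2) || N(0, 1) ), summed over code dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mean, log_var))

rng = random.Random(0)
mean, log_var = [0.5, -1.0], [0.0, -2.0]  # hypothetical encoder outputs
z = sample_code(mean, log_var, rng)        # code passed to the decoder
kl = kl_to_standard_normal(mean, log_var)  # added to the reconstruction loss
```

At generation time the encoder is skipped and z is drawn directly from N(0, 1).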

Variational Autoencoders - Samples
[Figure: input and output images]
❏ CIFAR-10
❏ Blurry images
❏ Good approximation of the likelihood of the input data

Deep Recurrent Attentive Writer
❏ Generates the image sequentially
❏ On each step the model decides where to focus and
draw
❏ Uses an attention mechanism to achieve it

Deep Recurrent Attentive Writer (DRAW)
❏ Street View House Numbers (SVHN)
❏ The red rectangle shows where the model is
attending at the current step
❏ Impressive, as DRAW was one of the first
successful models to generate images
sequentially

[Figure: VAEs vs. DRAW]
Image source: http://kvfrans.com/what-is-draw-deep-recurrent-attentive-writer/

Deep Recurrent Attentive Writer
Image source: http://kvfrans.com/what-is-draw-deep-recurrent-attentive-writer/

Fully Convolutional Model
❏ Typically uses a pre-trained classification network as the
encoder
❏ Most often VGG-16, because it’s fast and has fewer
parameters
❏ Uses transposed convolution layers as the decoder
until the desired output shape is reached
❏ Often the decoder architecture is the
transpose of the encoder’s

Transposed Convolution (Deconvolution)
Parameters
❏ Kernel size = 3
❏ Stride = 1
Input layer
Output layer
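
A minimal 1-D sketch of a transposed convolution with the slide's parameters (kernel size 3, stride 1), assuming no padding. Each input value stamps a scaled copy of the kernel onto the output, so the output is longer than the input. The kernel weights are made up.

```python
def conv_transpose_1d(x, kernel, stride=1):
    """Transposed convolution without padding.

    Output length is (len(x) - 1) * stride + len(kernel).
    """
    k = len(kernel)
    out = [0.0] * ((len(x) - 1) * stride + k)
    for i, xi in enumerate(x):
        for j, kj in enumerate(kernel):
            # Input position i writes into outputs i*stride .. i*stride + k - 1
            out[i * stride + j] += xi * kj
    return out

# 3 inputs, kernel size 3, stride 1 -> 5 outputs
y = conv_transpose_1d([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], stride=1)
```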

Properties of Transposed Convolution
❏ The backward pass of a convolutional layer is a
transposed convolution
❏ A checkerboard pattern might appear in the
generated image (sensitive to kernel and stride
sizes)
Odena, et al., "Deconvolution and Checkerboard
Artifacts", Distill, 2016. http://doi.org/10.23915
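
One way to see why kernel and stride sizes matter, sketched in 1-D with no padding: count how many input positions write into each output position. When the kernel size is not divisible by the stride the counts alternate, which is the uneven overlap associated with checkerboard artifacts. The helper name is hypothetical.

```python
def overlap_counts(n_inputs, kernel_size, stride):
    """How many input positions contribute to each output position."""
    counts = [0] * ((n_inputs - 1) * stride + kernel_size)
    for i in range(n_inputs):
        for j in range(kernel_size):
            counts[i * stride + j] += 1
    return counts

# Kernel 3, stride 2: interior counts alternate between 2 and 1.
uneven = overlap_counts(4, kernel_size=3, stride=2)
# Kernel 4, stride 2: interior counts are all 2.
even = overlap_counts(4, kernel_size=4, stride=2)
```

Uneven counts do not force artifacts, since the learned weights could compensate, but they make the pattern easy to produce.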

Pros of VAEs
❏ In practice, the latent code dimensions of a VAE are often
interpretable
❏ To achieve this, it collapses some latent dimensions
and doesn’t use them
❏ Able to generate samples close to the data distribution

Drawbacks of VAEs
❏ The L2 loss treats pixels as independent, which
leads to blurry images
❏ The exact probability of a generated image under the
model is intractable to compute
X - input image
Z - latent code
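
In standard VAE notation, the intractability comes from an integral over every possible latent code. The sketch below uses X and Z as defined above; the symbols p_θ (decoder), q_φ (encoder) and the prior p(Z) are assumptions not named on the slide.

```latex
% Marginal likelihood of an image X: an integral over all latent codes Z
p_\theta(X) = \int p_\theta(X \mid Z)\, p(Z)\, dZ

% Evidence lower bound (ELBO) that a VAE maximizes instead
\log p_\theta(X) \;\ge\;
  \mathbb{E}_{q_\phi(Z \mid X)}\big[\log p_\theta(X \mid Z)\big]
  - \mathrm{KL}\big(q_\phi(Z \mid X) \,\|\, p(Z)\big)
```

The first term of the bound is the reconstruction quality and the second keeps the encoder's code distribution close to the prior.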

Generative Adversarial Networks
❏ A generative model introduced by Ian Goodfellow et al. in
2014
❏ Already widely adopted and an area of massive
research
❏ A new GAN paper seems to be published every week
❏ Have many awesome applications. We’ll see some of
them later on.
❏ GANs define the generative problem as an
adversarial game between two networks

Generative Adversarial Networks (GANs)
[Figure: the Generator produces a sample; the Discriminator receives a sample from the Generator or from the real images and labels it real or fake; loss]

Discriminator Training of GANs
[Figure: a Generator sample and a real-images sample go into the Discriminator, which labels each real or fake; classification loss]
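
A sketch of the discriminator's classification loss, assuming the usual binary cross-entropy with real images labelled 1 and generator samples labelled 0. The inputs are the discriminator's output probabilities; the numbers are made up.

```python
import math

def bce(p, target):
    """Binary cross-entropy for one discriminator output p in (0, 1)."""
    eps = 1e-12  # guards against log(0)
    return -(target * math.log(p + eps)
             + (1 - target) * math.log(1 - p + eps))

def discriminator_loss(d_real, d_fake):
    """Average loss over a batch of real and a batch of generated samples."""
    real_term = sum(bce(p, 1) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A discriminator that separates real from fake well gets a low loss.
good = discriminator_loss(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
# One that is confused gets a higher loss.
bad = discriminator_loss(d_real=[0.4, 0.5], d_fake=[0.6, 0.7])
```

During this step only the discriminator's weights are updated; the generator is held fixed.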

Generator Training of GANs
[Figure: Generator, Discriminator, sample, real images, real/fake output; objective: maximize]
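
The figure does not say which quantity is maximized. A common choice, assumed here, is the non-saturating objective from Goodfellow et al. (2014): the generator maximizes log D(G(z)), the discriminator's log-probability that a generated sample is real.

```python
import math

def generator_objective(d_fake):
    """Average log D(G(z)): the quantity the generator tries to maximize."""
    return sum(math.log(p) for p in d_fake) / len(d_fake)

def generator_loss(d_fake):
    """The same objective written as a loss to minimize."""
    return -generator_objective(d_fake)

# d_fake holds hypothetical discriminator outputs on generated samples.
fooled = generator_loss([0.9, 0.8])  # discriminator believes the fakes
caught = generator_loss([0.1, 0.2])  # discriminator spots the fakes
```

During this step the discriminator is held fixed and gradients flow through it into the generator's weights.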

GANs Training is challenging
❏ Unstable during training
❏ Mode collapse
❏ Higher log-likelihood != better samples
However, GAN training is getting easier. Check out
Wasserstein GANs and LSGANs.

GAN Samples
Generative Adversarial Networks (Goodfellow et al., 2014)

Progressive Growing of GANs
Progressive Growing of GANs for Improved Quality, Stability, and Variation (Karras et al., 2018)

Conditional Generative Adversarial Networks
[Figure: real data]
❏ The real data consists of pairs (X, Y)
❏ P(Y | X): generate Y given X
pix2pix: Image-to-Image Translation with Conditional Adversarial Networks (Isola et al., 2016)

Image to Image Translation (pix2pix)

Image to Image Translation (Example)

CycleGAN
❏ X and Y are unpaired collections of images
❏ They show different domains of the same world
❏ Learn to translate an image from X into Y
[Figure: domains X and Y]

CycleGAN
Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks (Zhu et al., 2017)

CycleGAN (failure cases)

CycleGAN
❏ Cycle Consistency Loss
❏ ||F(G(X)) - X||
❏ ||G(F(Y)) - Y||
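
A toy sketch of the two cycle-consistency terms above, assuming an L1 norm (the slide leaves the norm unspecified) and using simple functions on lists in place of the translator networks G: X to Y and F: Y to X. All three toy translators are hypothetical.

```python
def l1(a, b):
    """Mean absolute difference between two equally sized lists."""
    return sum(abs(u - v) for u, v in zip(a, b)) / len(a)

def cycle_loss(G, F, x, y):
    """||F(G(x)) - x|| + ||G(F(y)) - y|| with an L1 norm."""
    return l1(F(G(x)), x) + l1(G(F(y)), y)

# Toy translators: G doubles values, F_inverse undoes G, F_off does not.
G = lambda v: [2.0 * t for t in v]
F_inverse = lambda v: [0.5 * t for t in v]
F_off = lambda v: [0.5 * t + 1.0 for t in v]

x, y = [1.0, 2.0], [4.0, 6.0]
perfect = cycle_loss(G, F_inverse, x, y)  # translating there and back recovers the input
imperfect = cycle_loss(G, F_off, x, y)    # the round trip drifts, so the loss is positive
```

In CycleGAN this term is added to the adversarial losses of both translators, so each must fool its discriminator while staying invertible by the other.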

Generating Images from Text
Generative Adversarial Text to Image Synthesis (Reed et al. 2016)

References
❏ https://blog.openai.com/generative-models
❏ http://distill.pub/2016/deconv-checkerboard/
❏ https://github.com/junyanz/CycleGAN
❏ https://github.com/phillipi/pix2pix
❏ http://videolectures.net/deeplearning2015_bengio_generative_models/
❏ https://www.youtube.com/watch?v=P78QYjWh5sM&spfreload=1
❏ http://image-net.org/explore
❏ http://kvfrans.com/what-is-draw-deep-recurrent-attentive-writer/

HyperScience wants You for...
• Machine Learning Engineer
• For more info: goo.gl/wqPoqU
• Or https://www.hyperscience.com/careers

So the DL course is over… What’s next?
We have one word for you - MEETUPS!
❏ Various engineering topics
❏ Different cool locations
❏ Smart and interesting speakers from HyperScience and
fellow companies
Coming up in the fall...
Follow our Facebook page for more details and updates.

Let’s celebrate the success of this course
together
❏ We’d love to demo to you our technologies and
products - we’ll be doing two different demo
sessions simultaneously in this same hall;
❏ We are here for all sorts of questions -
happy to answer them all;
❏ And last but not least - the bar is open :)
