Generative Adversarial Networks
Dai-Hai Nguyen
Bioinformatics Center
Kyoto University
28/04/2018
Outline
• Overview of generative models
• Variational AutoEncoder (VAE)
• Generative Adversarial Networks (GAN)
• GAN: applications
• Conclusion
Generative Models
Supervised vs. Unsupervised Learning
• Supervised Learning
Data: (x, y), where x is the data and y is the label
Goal: learn a function f that maps x -> y
Tasks: classification, regression, detection, etc.
Source: http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture13.pdf
Supervised vs. Unsupervised Learning
• Unsupervised Learning
Data: only data x, no labels y
Goal: learn some underlying hidden structure of the data
Tasks: clustering, dimensionality reduction, density estimation, etc.
Unsupervised Learning
• Taxonomy tree of unsupervised learning
Source: https://allenlu2007.wordpress.com/2018/01/10/variational-autoencoder-的原理/
Generative models
• Goal: given training samples, generate new samples from the same distribution
Training data ~ $p_{data}(x)$    Generated samples ~ $p_{model}(x)$
In other words, try to learn a model $p_{model}(x)$ similar to $p_{data}(x)$
Generative models
• Maximum Likelihood Estimation (MLE):
Given training samples $x_1, x_2, \ldots, x_N$, learn a model $p_{model}(x; \theta)$ from which the training samples are likely to have been generated:
$$\theta^* = \arg\max_\theta \sum_{i=1}^{N} \log p_{model}(x_i; \theta)$$
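As a concrete illustration (not part of the slides), a minimal MLE sketch assuming a one-dimensional Gaussian model family, for which the argmax has a closed form:

```python
# A minimal MLE sketch: fit a 1-d Gaussian p_model(x; theta) = N(mu, sigma^2)
# to toy samples by maximizing the summed log-likelihood.
import numpy as np

x = np.random.randn(1000) * 2.0 + 5.0        # toy "training samples"

mu_hat = x.mean()                            # closed-form argmax over mu
sigma_hat = x.std()                          # closed-form argmax over sigma

def log_likelihood(x, mu, sigma):
    # sum_i log N(x_i; mu, sigma^2), the quantity theta* maximizes
    return np.sum(-0.5 * np.log(2 * np.pi * sigma ** 2)
                  - (x - mu) ** 2 / (2 * sigma ** 2))

print(mu_hat, sigma_hat, log_likelihood(x, mu_hat, sigma_hat))
```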
Unsupervised Learning
• Recap: the taxonomy tree of unsupervised learning
Source: https://allenlu2007.wordpress.com/2018/01/10/variational-autoencoder-的原理/
Variational Auto-Encoder (VAE)
Variational Autoencoder
• A (probabilistic) generative model that generates samples from a latent variable.
• Assumption: the training data $\{x_1, x_2, \ldots, x_N\}$ is generated from a latent variable $z$:
  sample $z \sim p(z)$, then sample $x \sim p_\theta(x|z)$
Example: the samples x are face images and the latent z is a 2-d vector:
  z1: head orientation (varying z1 turns the head)
  z2: degree of smile (varying z2 changes the smile)
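A minimal sketch of this generative process, assuming a toy (untrained) decoder network and illustrative dimensions:

```python
# Sketch of the assumed generative process: z ~ p(z), then x ~ p_theta(x|z).
# The decoder below is an untrained toy network; the 2-d latent and 28x28
# image size are illustrative assumptions.
import torch
import torch.nn as nn

latent_dim, data_dim = 2, 784                # e.g. z = (orientation, smile), x = 28x28 image

decoder = nn.Sequential(                     # parameterizes p_theta(x|z)
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, data_dim), nn.Sigmoid(),
)

z = torch.randn(16, latent_dim)              # z ~ p(z) = N(0, I)
x = decoder(z)                               # mean of p_theta(x|z) for each z
print(x.shape)                               # torch.Size([16, 784])
```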
Variational Autoencoder
• How to learn the model? MLE again:
$$\theta^* = \arg\max_\theta \sum_{i=1}^{N} \log p_\theta(x_i)$$
where $p_\theta(x) = \int p_\theta(x|z)\, p(z)\, dz$ is intractable to compute.
• Solution: variational approximation
Variational Autoencoder
• Variational approximation
$\log p_\theta(x)$ can be written as:
$$\log p_\theta(x) = \mathbb{E}_{z \sim q_\phi(z|x)}\left[\log p_\theta(x|z)\right] - D_{KL}\left(q_\phi(z|x)\,\|\,p(z)\right) + D_{KL}\left(q_\phi(z|x)\,\|\,p_\theta(z|x)\right)$$
1. Likelihood term: quantifies how well the sample is reconstructed from z; it can be estimated by a network.
2. KL divergence term: measures the difference between two distributions; it has a closed form when both distributions are Gaussian, so it is easy to estimate.
3. Second KL divergence term: intractable, because $p_\theta(z|x)$ cannot be computed, but it is always >= 0.
Variational Autoencoder
• Variational approximation
Because the last KL term is >= 0, the first two terms give a tractable lower bound (ELBO) on $\log p_\theta(x)$:
$$\log p_\theta(x) \geq \mathbb{E}_{z \sim q_\phi(z|x)}\left[\log p_\theta(x|z)\right] - D_{KL}\left(q_\phi(z|x)\,\|\,p(z)\right)$$
Variational Autoencoder
• Variational approximation
Strategy:
• Maximize the tractable lower bound (ELBO) instead of the intractable $\log p_\theta(x)$
• What needs to be modeled:
1. $p_\theta(x|z)$ by a network (the decoder)
2. $q_\phi(z|x)$ by another network (the encoder)
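A minimal sketch of the ELBO as a training objective, assuming a Gaussian encoder that outputs (mu, logvar) and a Bernoulli decoder whose outputs lie in (0, 1):

```python
# Sketch of the ELBO maximized instead of the intractable log p(x).
import torch
import torch.nn.functional as F

def elbo(x, x_recon, mu, logvar):
    # reconstruction term E_q[log p_theta(x|z)], estimated with one z sample
    recon = -F.binary_cross_entropy(x_recon, x, reduction="sum")
    # KL(q_phi(z|x) || N(0, I)), closed form for two Gaussians
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon - kl                        # maximize this (minimize its negative)
```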
Variational Autoencoder: model
• Training: minimize the reconstruction error and make q(z|x) close to the prior p(z).
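A sketch of one training step built on the elbo() helper above; the encoder, decoder, and optimizer are assumed to exist and are not the slides' exact architecture:

```python
# One VAE training step: encode, sample z with the reparameterization trick,
# decode, and minimize the negative ELBO.
import torch

def train_step(x, encoder, decoder, optimizer):
    mu, logvar = encoder(x)                            # q_phi(z|x)
    eps = torch.randn_like(mu)                         # reparameterization trick
    z = mu + eps * torch.exp(0.5 * logvar)
    x_recon = decoder(z)
    loss = -elbo(x, x_recon, mu, logvar)               # minimize the negative ELBO
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```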
Variational Autoencoder: model
• Sampling: use only the decoder network; sample z from the prior, $z \sim N(0, I)$, and decode it.
Variational Autoencoder: generated samples
Reminder
• Taxonomy tree of unsupervised learning
Source: https://allenlu2007.wordpress.com/2018/01/10/variational-autoencoder-的原理/
Generative Adversarial Network (GAN)
Generative Adversarial Network: Idea
Key points:
• Belongs to the "implicit density" group; a "hot" method in ML, introduced by Goodfellow
• Motivated by game theory
• Two players:
1. The generator tries to generate "fake" samples from its model
2. The discriminator tries to distinguish "fake" from "real" samples
GAN: Two-player game
Model:
• Generator network: tries to fool the discriminator by generating real-looking images
• Discriminator network: tries to distinguish real from fake samples
GAN: Two-player game
Objective function:
$$\min_G \max_D \; \mathbb{E}_{x \sim p_{data}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p(z)}\left[\log\left(1 - D(G(z))\right)\right]$$
The first term is the loss for real data x; the second is the loss for fake data G(z).
How this works:
- D tries to maximize the cost, so that D(x) is close to 1 (for real x) and D(G(z)) is close to 0 (fake)
- G tries to minimize the cost, so that D(G(z)) is close to 1 (it tries to make the generated samples look as real as possible, to fool D)
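A sketch of this objective written as per-network losses, assuming D outputs a probability in (0, 1); the names d_loss and g_loss are illustrative:

```python
# Sketch of V(D, G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))] as two losses.
import torch

def d_loss(D, G, x_real, z):
    # D maximizes V: push D(x) toward 1 (real) and D(G(z)) toward 0 (fake)
    real_term = torch.log(D(x_real)).mean()
    fake_term = torch.log(1 - D(G(z).detach())).mean()
    return -(real_term + fake_term)          # minimize the negative of V

def g_loss(D, G, z):
    # G minimizes V, which drives D(G(z)) toward 1 (fooling D); in practice
    # the non-saturating variant maximizes log D(G(z)) instead
    return torch.log(1 - D(G(z))).mean()
```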
GAN: Two-player game
Objective function (as above): one loss term for real data x, one for fake data G(z).
How to train: alternate between the two players
- Fix G; update D to maximize the cost
- Fix D; update G to minimize the cost
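A sketch of this alternating scheme, reusing the d_loss/g_loss helpers above; dataloader, latent_dim, D, G, and the two optimizers are assumed to exist:

```python
# Alternating updates: one discriminator step, then one generator step.
import torch

for x_real in dataloader:
    # Fix G, update D to maximize the cost
    z = torch.randn(x_real.size(0), latent_dim)
    d_opt.zero_grad()
    d_loss(D, G, x_real, z).backward()
    d_opt.step()

    # Fix D, update G to minimize the cost
    z = torch.randn(x_real.size(0), latent_dim)
    g_opt.zero_grad()
    g_loss(D, G, z).backward()
    g_opt.step()
```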
GAN: density ratio estimation
Density estimation via density ratio estimation:
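As background, the standard result from the original GAN paper: for a fixed generator, the discriminator that maximizes the objective is
$$D^*(x) = \frac{p_{data}(x)}{p_{data}(x) + p_{model}(x)},$$
so a trained discriminator implicitly estimates the density ratio $p_{data}(x)/p_{model}(x) = D^*(x)/(1 - D^*(x))$.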
Generative Adversarial Network: Result
Some generated samples
GAN: Applications
GAN: applications
Image-to-Image translation
Goal: learn a mapping from an input image to an output image
Ref: Image-to-Image Translation with Conditional Adversarial Networks, CVPR 2017
GAN: applications
Image-to-Image translation
Generator:
• In: noise + input image; Out: generated output image
Discriminator:
• In: pairs of input/output images; Out: fake/real
Optimization:
$$G^* = \arg\min_G \max_D \; \mathcal{L}_{cGAN}(G, D) + \lambda\, \mathcal{L}_{L1}(G)$$
where $\mathcal{L}_{cGAN}$ is similar to the original GAN objective (conditioned on the input image) and the L1 term $\mathcal{L}_{L1}(G) = \mathbb{E}\left[\|y - G(x, z)\|_1\right]$ encourages less blurring.
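A minimal sketch of a pix2pix-style generator loss combining these two terms; D, G, and the weight lam are illustrative assumptions rather than the paper's exact implementation:

```python
# Conditional GAN term (fool D on the (input, output) pair) plus an L1 term.
import torch

def pix2pix_generator_loss(D, G, x_in, y_target, lam=100.0):
    y_fake = G(x_in)                                   # translated image
    gan_term = -torch.log(D(x_in, y_fake)).mean()      # conditional GAN term
    l1_term = torch.abs(y_target - y_fake).mean()      # L1 reconstruction, less blurring
    return gan_term + lam * l1_term
```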
GAN: applications
Image-to-Image translation
Result:
Ref: Image-to-Image Translation with Conditional Adversarial Networks, CVPR 2017
GAN: applications
Text-to-Image translation
Goal: learn a mapping from input text to an output image
Ref: Generative Adversarial Text to Image Synthesis, ICML 2016
GAN: applications
Text-to-Image translation
Generator:
• In: noise + text; Out: image
Discriminator:
• In: pairs of text and images; Out: fake/real
Model: (a toy sketch follows below)
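A toy sketch in the spirit of this model: a text embedding is concatenated with the noise vector and decoded into an image. The text-encoder stand-in, layer sizes, and image size are illustrative assumptions, not the paper's architecture:

```python
# Text-conditioned generator sketch: In = noise + text embedding, Out = image.
import torch
import torch.nn as nn

text_dim, noise_dim, img_dim = 128, 100, 64 * 64 * 3

G = nn.Sequential(nn.Linear(noise_dim + text_dim, 512), nn.ReLU(),
                  nn.Linear(512, img_dim), nn.Tanh())

z = torch.randn(8, noise_dim)                    # noise
phi_t = torch.randn(8, text_dim)                 # stand-in for a learned text encoder output
fake_images = G(torch.cat([z, phi_t], dim=1))    # In: noise + text; Out: image
print(fake_images.shape)                         # torch.Size([8, 12288])
```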
GAN: applications
Text-to-Image translation
Result:
Ref: Generative Adversarial Text to Image Synthesis, ICML 2016
Conclusion
• Taxonomy tree of unsupervised learning, again!
Source: https://allenlu2007.wordpress.com/2018/01/10/variational-autoencoder-的原理/
