Project Report
Anime Face Generation through DCGAN
B.Tech and M.Tech in Mathematics and Data Science
Presented by: Lavkesh Sharma (214104025), Yuvraj Singh (214104011)
Under the Supervision of: Dr. Phuspendra Kumar
• So what are GANs?
• What makes them so "interesting"?
Use cases of GANs
Generative Adversarial Networks
• GAN - Generative Adversarial Network
• GANs belong to the set of generative models.
• It means that they are able to produce / generate new
content.
• How to train this Generative Network?
• One Solution – GAN
• Generative - Generates new content
• Adversarial Networks - Two networks with opposing
objectives
• It consists of two deep networks:
1. Discriminator network
2. Generator network
• The discriminator is supplied with both real and fake
images and tries to tell them apart, outputting a
probability between 0 and 1 that the image is real.
• The generator tries to fool the discriminator into believing
that the fake images it generates are real images.
How DCGAN is different from GAN
GAN is a more general idea. It can be applied to various data types like text, music, or
even code. DCGAN, on the other hand, is specifically designed for working with images.
• Building Blocks: Traditional GANs use fully connected layers, which treat all data
points equally. DCGANs leverage convolutional layers, which are adept at capturing
spatial relationships between data points in images. This is crucial for tasks like image
recognition and generation.
• Training: DCGANs often incorporate techniques like batch normalization and
transposed convolutional layers to make the training process more stable and improve
the quality of the generated images.
Discriminator
• The discriminator in a DCGAN is simply a classifier. It tries to distinguish
real data from the data created by the generator.
• The discriminator uses convolutional layers with a stride greater than 1 to
down-sample the input image. This process reduces the spatial dimensions of the
input while increasing the number of channels.
• It helps extract features at different scales and levels of abstraction.
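The strided down-sampling described above can be sketched in PyTorch. This is a hypothetical layout for 64×64 RGB images; the channel counts and layer sizes here are illustrative assumptions, not the project's exact architecture:

```python
import torch
import torch.nn as nn

# Hypothetical DCGAN discriminator for 64x64 RGB inputs.
# Each stride-2 Conv2d halves the spatial size while increasing channels.
discriminator = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1),    # 64x64 -> 32x32
    nn.LeakyReLU(0.2),
    nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1),  # 32x32 -> 16x16
    nn.BatchNorm2d(128),
    nn.LeakyReLU(0.2),
    nn.Conv2d(128, 256, kernel_size=4, stride=2, padding=1), # 16x16 -> 8x8
    nn.BatchNorm2d(256),
    nn.LeakyReLU(0.2),
    nn.Conv2d(256, 1, kernel_size=8),                        # 8x8 -> 1x1 score
    nn.Flatten(),
    nn.Sigmoid(),                                            # probability in [0, 1]
)

out = discriminator(torch.randn(4, 3, 64, 64))  # a batch of 4 images
print(out.shape)  # torch.Size([4, 1])
```

Each image collapses to a single probability that it is real, which is exactly the classifier behavior the slide describes.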
Generator
• The input to the generator is typically a vector or a matrix of random numbers
(referred to as a latent tensor) which is used as a seed for generating an image.
• The generator converts a latent tensor of a given shape into an image tensor of
the target shape.
• To achieve this, we used the ConvTranspose2d layer, which performs a
transposed convolution.
• Transposed convolutional layers increase the spatial dimensions of the data
while decreasing the number of channels. They are essential for generating
high-resolution images from low-dimensional noise vectors.
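A minimal sketch of this up-sampling path, mirroring the discriminator in reverse. The latent size of 100 and the layer widths are assumptions for illustration, not the project's exact configuration:

```python
import torch
import torch.nn as nn

latent_dim = 100  # assumed latent size; the slide does not fix a value

# Hypothetical DCGAN generator: each ConvTranspose2d grows the spatial size
# while decreasing channels, turning a (latent_dim, 1, 1) noise tensor into
# a (3, 64, 64) image.
generator = nn.Sequential(
    nn.ConvTranspose2d(latent_dim, 256, kernel_size=8),                # 1x1 -> 8x8
    nn.BatchNorm2d(256),
    nn.ReLU(),
    nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1),  # 8x8 -> 16x16
    nn.BatchNorm2d(128),
    nn.ReLU(),
    nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),   # 16x16 -> 32x32
    nn.BatchNorm2d(64),
    nn.ReLU(),
    nn.ConvTranspose2d(64, 3, kernel_size=4, stride=2, padding=1),     # 32x32 -> 64x64
    nn.Tanh(),  # pixel values in [-1, 1]
)

z = torch.randn(4, latent_dim, 1, 1)  # latent "seed" tensors
img = generator(z)
print(img.shape)  # torch.Size([4, 3, 64, 64])
```

The Tanh output range of [-1, 1] matches the usual DCGAN convention of normalizing training images to the same range.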
Discriminator training
The discriminator's training data comes from two sources:
• Real data instances, such as real pictures of people. The discriminator uses
these instances as positive examples during training.
• Fake data instances created by the generator. The discriminator uses these
instances as negative examples during training.
The discriminator connects to two loss functions. During discriminator
training, the discriminator ignores the generator loss and uses only the
discriminator loss; the generator loss is used during generator training.
• The discriminator classifies both real data and fake data from the generator.
• The discriminator loss penalizes the discriminator for misclassifying a real instance
as fake or a fake instance as real.
• The discriminator updates its weights through backpropagation from the
discriminator loss through the discriminator network.
Backpropagation in discriminator training.
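One discriminator update can be written out as follows. The models here are tiny 1-D stand-ins just to show the update logic (the real project uses the convolutional networks); the Adam hyperparameters are the common DCGAN defaults, assumed rather than taken from the slides:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy stand-ins for D and G, operating on flat 4-D "data".
D = nn.Sequential(nn.Linear(4, 8), nn.LeakyReLU(0.2), nn.Linear(8, 1), nn.Sigmoid())
G = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 4))
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce = nn.BCELoss()

real = torch.randn(16, 4)                       # stand-in batch of real data
opt_D.zero_grad()
loss_real = bce(D(real), torch.ones(16, 1))     # real instances -> label 1
fake = G(torch.randn(16, 2)).detach()           # detach: generator stays frozen
loss_fake = bce(D(fake), torch.zeros(16, 1))    # fake instances -> label 0
d_loss = loss_real + loss_fake                  # discriminator loss only
d_loss.backward()                               # backpropagate through D alone
opt_D.step()
```

The `.detach()` call is what makes this a pure discriminator step: no gradients reach the generator's weights.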
Generator training
• The generator part of a GAN learns to create fake data by incorporating feedback
from the discriminator. It learns to make the discriminator classify its output as real.
• Generator training requires tighter integration between the generator and the
discriminator than discriminator training requires. The portion of the GAN that
trains the generator includes:
Random input
Generator network, which transforms the random input into a data instance
Discriminator network, which classifies the generated data
Discriminator output
Generator loss, which penalizes the generator for failing to fool the discriminator
Backpropagation in generator training.
Full training loop
• The generator and the discriminator have different training processes.
• GAN training proceeds in alternating periods:
The discriminator trains for one or more epochs.
The generator trains for one or more epochs.
• We keep the generator constant during the discriminator training phase. As discriminator
training tries to figure out how to distinguish real data from fake, it has to learn how to
recognize the generator's flaws.
• Similarly, we keep the discriminator constant during the generator training phase.
Otherwise the generator would be trying to hit a moving target and might never converge.
• It's this back and forth that allows GANs to tackle otherwise intractable generative problems.
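The alternating schedule above can be condensed into one loop. As before, D and G are hypothetical toy stand-ins and the hyperparameters are assumed common defaults; the structure of the two phases is the point:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy stand-ins for the two networks.
D = nn.Sequential(nn.Linear(4, 8), nn.LeakyReLU(0.2), nn.Linear(8, 1), nn.Sigmoid())
G = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 4))
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce = nn.BCELoss()
ones, zeros = torch.ones(16, 1), torch.zeros(16, 1)

for step in range(5):
    real = torch.randn(16, 4)  # stand-in real batch
    # Phase 1: train D; G is held constant (detach cuts its gradients).
    opt_D.zero_grad()
    d_loss = bce(D(real), ones) + bce(D(G(torch.randn(16, 2)).detach()), zeros)
    d_loss.backward()
    opt_D.step()
    # Phase 2: train G; D is held constant (only opt_G takes a step).
    opt_G.zero_grad()
    g_loss = bce(D(G(torch.randn(16, 2))), ones)
    g_loss.backward()
    opt_G.step()
```

Each phase freezes the other network in the sense that matters: detaching blocks gradients to G in phase 1, and stepping only `opt_G` leaves D's weights untouched in phase 2.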
Loss function
Minimax Loss
• The generator tries to minimize the following function while the discriminator tries to
maximize it:
E_x[log D(x)] + E_z[log(1 − D(G(z)))]
In this function:
• D(x) is the discriminator's estimate of the probability that real data instance x is real.
• E_x is the expected value over all real data instances.
• G(z) is the generator's output when given noise z.
• D(G(z)) is the discriminator's estimate of the probability that a fake instance is real.
• E_z is the expected value over all random inputs to the generator
(i.e., the expected value over all generated fake instances G(z)).
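Approximating the two expectations with sample averages gives a simple estimator of the minimax value V(D, G). This small, hypothetical helper just illustrates the formula:

```python
import math

def minimax_value(d_real, d_fake):
    """Sample estimate of V(D, G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))].

    d_real: discriminator outputs D(x) on real instances,
    d_fake: discriminator outputs D(G(z)) on generated instances.
    """
    e_real = sum(math.log(p) for p in d_real) / len(d_real)
    e_fake = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return e_real + e_fake

# An undecided discriminator (D = 0.5 everywhere) gives log(1/2) + log(1/2):
print(minimax_value([0.5, 0.5], [0.5, 0.5]))  # -1.3862943611198906
```

Both log terms are non-positive, so V is at most 0; the discriminator pushes it up toward 0 while the generator pushes it back down.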
Results
Output at 5 epochs (input size: 10,000 images)
Output at 1 epoch (input size: 60,000 images)
References
• https://machinelearningmastery.com/impressive-applications-of-generative-adversarial-networks/
• https://developers.google.com/machine-learning/gan/generator
• https://medium.com/@dnyaneshwalwadkar/generative-adversarial-network-gan-simplegan-dcgan-wgan-progan-c92389a3c454