16-726 Assignment 3

Canbo Ye (canboy)

Overview

The goal of this project is to get hands-on experience coding and training GANs. I first implement a Deep Convolutional GAN (DCGAN) to generate grumpy cats from samples of random noise. I then implement a more complex GAN architecture, CycleGAN, to translate between images of two kinds of cats (Grumpy and Russian Blue). Finally, I evaluate the performance of CycleGAN on additional datasets and on higher-resolution images.


Part 1: Deep Convolutional GAN

For the first part of this assignment, I implement a Deep Convolutional GAN (DCGAN).

Implement Data Augmentation

See function get_data_loader() in data_loader.py for implementation details. I compose [Resize, RandomCrop, RandomHorizontalFlip, ToTensor, Normalize] into a transform object.

Implement the Discriminator and Generator of the DCGAN and the Training Loop

See models.py for the implementation details of the DCGAN model and vanilla_gan.py for the training loop.
The padding value follows from the convolution output-size formula:
$Output = \left\lfloor \frac{Input - Kernel + 2 \times Padding}{Stride} \right\rfloor + 1$
Setting $Input = 2 \times Output$, $Kernel = 4$ and $Stride = 2$, this simplifies to $Output = Output + Padding - 1$, which gives:
$Padding = 1$
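The derivation above can be checked numerically with a small helper implementing the output-size formula:

```python
def conv_output_size(input_size, kernel=4, stride=2, padding=1):
    # Output = floor((Input - Kernel + 2*Padding) / Stride) + 1
    return (input_size - kernel + 2 * padding) // stride + 1

# With kernel=4, stride=2, padding=1, every conv layer halves the spatial size,
# which is exactly the Input = 2*Output condition solved above.
halved = [conv_output_size(s) for s in (64, 32, 16, 8)]
```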

Experiments and Results

Curves of training loss:

Discriminator and Generator training loss with basic data augmentation.

Discriminator and Generator training loss with deluxe data augmentation.

If the GAN trains successfully, the losses of both the Discriminator and the Generator should decrease to small values. However, a small loss does not guarantee the quality of the generated images, so we still need to inspect the samples ourselves.
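As a concrete illustration of the losses being tracked, here is a sketch assuming a least-squares GAN objective (one common choice for DCGAN training; the actual objective used in vanilla_gan.py may differ):

```python
import torch

def d_loss_ls(d_real, d_fake):
    # Least-squares discriminator loss: push D(real) -> 1 and D(fake) -> 0.
    return 0.5 * torch.mean((d_real - 1) ** 2) + 0.5 * torch.mean(d_fake ** 2)

def g_loss_ls(d_fake):
    # Least-squares generator loss: push D(G(z)) -> 1 to fool the discriminator.
    return torch.mean((d_fake - 1) ** 2)

# A perfect discriminator (1 on real, 0 on fake) drives d_loss_ls to zero,
# while a generator that fully fools D drives g_loss_ls to zero.
d_perfect = d_loss_ls(torch.ones(8), torch.zeros(8))
g_perfect = g_loss_ls(torch.ones(8))
```

Both losses bottoming out near zero at the same time is the equilibrium behavior described above, but as noted, low losses alone do not certify sample quality.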
Generated images from different iterations:
Left: sample from iteration 200; Center: sample from iteration 1200; Right: sample from iteration 10000

During the early stage of training (iteration 200), the generated images are essentially random noise. With more training, the samples begin to capture the structure and regional colors of the target (iteration 1200). Further training (iteration 10000) improves details of the images such as the cat's mouth and eyes.

Part 2: CycleGAN

For the second part of this assignment, I implement the CycleGAN architecture.

Implement the Generator of the CycleGAN and the Training Loop

See models.py for the implementation details of the CycleGAN model and cycle_gan.py for the training loop.
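The key addition over a plain GAN is the cycle-consistency term, which penalizes the difference between an input and its round-trip reconstruction (X -> Y -> X). A minimal sketch, where the L1 distance and the weight of 10 are assumptions rather than necessarily the values in cycle_gan.py:

```python
import torch

def cycle_consistency_loss(real, reconstructed, lambda_cycle=10.0):
    # L1 penalty between x and G_YtoX(G_XtoY(x)); lambda_cycle weights this
    # term against the adversarial losses (10.0 is an assumed default).
    return lambda_cycle * torch.mean(torch.abs(real - reconstructed))

# Perfect reconstruction incurs zero penalty; any deviation is penalized.
x = torch.randn(2, 3, 64, 64)
zero_penalty = cycle_consistency_loss(x, x)
```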

CycleGAN Experiments and Results

Early training stage - 600 iterations:

Generated samples without the cycle-consistency loss from 600 iterations.

Generated samples with the cycle-consistency loss from 600 iterations.


Later training stage - 10000 iterations:
Generated samples without the cycle-consistency loss from 10000 iterations.

Generated samples with the cycle-consistency loss from 10000 iterations.

From the results above, we can see that the samples generated with the cycle-consistency loss have better quality than those without it, especially in the right column, where we want to convert the brown cat to the grey cat. This is because the cycle-consistency loss acts as a regularizer on the training procedure, encouraging the generated images to be more realistic and to preserve low-frequency features of the input.

Bells & Whistles

I also evaluate the performance of CycleGAN on the provided Pokemon dataset and on another dataset called Summer2Winter.

Pokemon Dataset
Generated samples with the cycle-consistency loss from 10000 iterations.


Summer2Winter Dataset
Generated samples with the cycle-consistency loss from 10000 iterations.