In this project, we implemented two types of GANs. The first is a Deep Convolutional GAN (DCGAN) that generates grumpy cat faces from random noise. The second is a CycleGAN that translates between Grumpy cat and Russian Blue cat images.
To increase the variety of the data and reduce overfitting, the data augmentation randomly crops and flips the images.
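A minimal sketch of such an augmentation pipeline, assuming torchvision; the resize and crop sizes below are illustrative assumptions, not the project's exact values:

```python
import torchvision.transforms as T

# Hypothetical augmentation pipeline: enlarge slightly so the random crop
# has room to vary, then take a random 64x64 crop and flip horizontally
# with probability 0.5.
deluxe_transform = T.Compose([
    T.Resize(70),                  # resize shorter side to 70 (assumed size)
    T.RandomCrop(64),              # random 64x64 crop (assumed size)
    T.RandomHorizontalFlip(p=0.5), # random left-right flip
    T.ToTensor(),
    T.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),  # map to [-1, 1]
])
```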
Padding is calculated from the output-size equation $\frac{W}{2}=\frac{W-K+2P}{S}+1$, where each conv layer halves the spatial size $W$. Plugging in a stride of $S=2$ and a kernel size of $K=4$, solving gives a padding of $P=1$. The same padding applies to the generator's deconv (transposed convolution) layers.
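As a quick sanity check (a sketch assuming PyTorch and an illustrative 64×64 input), a convolution with $K=4$, $S=2$, $P=1$ halves the spatial resolution, and a transposed convolution with the same hyperparameters doubles it:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 64, 64)  # illustrative input size

# Discriminator-style conv: K=4, S=2, P=1 halves the resolution.
down = nn.Conv2d(3, 32, kernel_size=4, stride=2, padding=1)
print(down(x).shape)  # torch.Size([1, 32, 32, 32])

# Generator-style deconv: the same K, S, P doubles the resolution,
# since W_out = (W - 1) * S - 2P + K = 2W.
up = nn.ConvTranspose2d(3, 32, kernel_size=4, stride=2, padding=1)
print(up(x).shape)    # torch.Size([1, 32, 128, 128])
```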
[Figure: training loss for DCGAN without data augmentation]
[Figure: training loss for DCGAN with data augmentation]
For a GAN to converge, the discriminator loss should not become too small; otherwise the discriminator overpowers the generator, which then fails to produce any output that gets past the discriminator. The losses should also decrease smoothly, since oscillation suggests that training is unstable and that problems such as mode collapse may have occurred.
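To make this concrete, below is a minimal sketch of the standard GAN objective, assuming PyTorch and a discriminator `D` that outputs raw logits (the names `D`, `G`, `real`, and `noise` are illustrative, not the project's exact code). When `d_loss` collapses toward zero, the generator's term saturates and its gradients vanish:

```python
import torch
import torch.nn.functional as F

def gan_losses(D, G, real, noise):
    # Generate fake images from noise.
    fake = G(noise)

    # Discriminator loss: push real logits toward 1 and fake logits toward 0.
    real_logits = D(real)
    fake_logits = D(fake.detach())  # detach: do not backprop into G here
    d_loss = 0.5 * (
        F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
        + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits))
    )

    # Generator loss: try to make D classify fakes as real. If D becomes too
    # strong (d_loss near 0), D(fake) saturates and G receives little gradient.
    gen_logits = D(fake)
    g_loss = F.binary_cross_entropy_with_logits(gen_logits, torch.ones_like(gen_logits))
    return d_loss, g_loss
```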
[Figure: samples from the basic method at epochs 400, 1400, and 6400]
[Figure: samples from the deluxe method at epochs 600, 1400, and 6400]
For both methods, the output shows only colored patches at early epochs (around 400). By around epoch 1400, the output starts to show the outline of the cat's face. At epoch 6400, the basic method still has some problems with the cat's eyes and the white-colored regions, while the deluxe method looks very promising, with only minor defects.
Training result without cycle consistency at epoch 600.
Training result with cycle consistency at epoch 600.
From the above results it is hard to see the benefit of cycle consistency: at epoch 600 the outputs are still very blurry, so the difference between the two methods is not obvious.
Comparing the training losses of the two methods shows that adding the cycle-consistency term yields a smaller loss.
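For reference, the cycle-consistency term is an L1 penalty on round-trip translations. Below is a minimal sketch, assuming PyTorch; the generator names and the weight `lambda_cyc` are illustrative (10.0 is the weight used in the original CycleGAN paper):

```python
import torch
import torch.nn.functional as F

def cycle_consistency_loss(G_XtoY, G_YtoX, x, y, lambda_cyc=10.0):
    # Round-trip translations: X -> Y -> X and Y -> X -> Y.
    x_reconstructed = G_YtoX(G_XtoY(x))
    y_reconstructed = G_XtoY(G_YtoX(y))
    # The L1 penalty encourages each round trip to reproduce its input,
    # constraining the mapping between the two domains.
    return lambda_cyc * (F.l1_loss(x_reconstructed, x) + F.l1_loss(y_reconstructed, y))
```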
The result at epoch 10000 is much clearer, with few visible artifacts. It is also obvious that the generated face preserves certain features of the original image, such as face shape and orientation.