This assignment provides hands-on experience coding and training GANs. It has two parts. In the first part, we implement a type of GAN designed to process images, called a Deep Convolutional GAN (DCGAN), and train it to generate grumpy cats from samples of random noise. In the second part, we implement a more complex GAN architecture called CycleGAN for the task of image-to-image translation, and train it to convert between two kinds of cats (Grumpy and Russian Blue).
$$\frac{\text{input\_size} + 2P - K}{S} + 1 = \text{output\_size}$$

With kernel size K = 4 and stride S = 2, the padding should be P = 1 for the first four conv layers and P = 0 for the last conv layer.
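The padding choice can be checked numerically with the formula above. The following is a minimal sketch, not the assignment's code: the function name `conv_output_size` and the 64x64 input resolution are assumptions made for illustration, and floor division is used so the helper also behaves sensibly when the division is not exact.

```python
def conv_output_size(input_size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a conv layer: (input + 2P - K) / S + 1, rounded down."""
    return (input_size + 2 * padding - kernel) // stride + 1


# Assumed 64x64 input (hypothetical; the text does not state the resolution).
size = 64
sizes = [size]
# P = 1 for the first four conv layers, P = 0 for the last one, with K = 4, S = 2.
for padding in (1, 1, 1, 1, 0):
    size = conv_output_size(size, kernel=4, stride=2, padding=padding)
    sizes.append(size)

print(sizes)  # [64, 32, 16, 8, 4, 1]
```

Under these assumptions, each of the first four layers halves the spatial size, and the last layer collapses the remaining 4x4 map to a single value.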
Briefly explain what the curves should look like if GAN manages to train. If the GAN manages to train, both the D and G losses should decrease to a certain level and then oscillate around their “equilibrium” values. This is because the generator and discriminator compete against each other in a minimax game, so improving one leads to a higher loss for the other. Once the model approaches an optimum, neither network can keep lowering its loss at the other's expense, and both losses fluctuate around those values instead of decreasing further.
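For reference, the minimax game mentioned above is commonly written as the standard GAN objective below. This is the textbook formulation and is given only as an illustration; the loss actually used in this assignment may differ (for example, a least-squares variant), and the symbols $p_{\text{data}}$ and $p_z$ (the data and noise distributions) are notation introduced here.

$$\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]$$

The discriminator D tries to increase this value while the generator G tries to decrease it, which is why a gain for one shows up as a loss for the other.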
The samples from early in training are essentially random noise, and they gradually start to resemble the cats in the training set. During training, the samples improve by first showing the correct outlines and colors, then getting better at details such as eyes and noses.
In general, the results with cycle consistency loss are slightly better than those without it. This observation matches the intuition that both the XtoY and YtoX generators should be trained so that their generated images can be mapped back to the original domain and remain valid after passing through the generator loop. Adding cycle consistency loss therefore helps generate more realistic images. (Though the improvement is more apparent in )
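The idea of cycling an image through both generators can be sketched as follows. This is a toy illustration under stated assumptions, not the assignment's implementation: images are plain lists of floats rather than tensors, the generators are stand-in functions, the names `l1_distance` and `cycle_consistency_loss` are hypothetical, and the L1 reconstruction distance is an assumed choice (the assignment may use a different norm or weighting).

```python
from typing import Callable, List

Image = List[float]  # flattened toy "image"; real code would use tensors
Generator = Callable[[Image], Image]


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute difference between two equally sized images."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def cycle_consistency_loss(x: Image, y: Image,
                           g_xtoy: Generator, g_ytox: Generator) -> float:
    """Penalize X -> Y -> X and Y -> X -> Y round trips that fail to
    reconstruct the original image (L1 distance is an assumption)."""
    x_reconstructed = g_ytox(g_xtoy(x))
    y_reconstructed = g_xtoy(g_ytox(y))
    return l1_distance(x, x_reconstructed) + l1_distance(y, y_reconstructed)


# Stand-in generators (hypothetical): doubling and halving invert each other.
def double(img: Image) -> Image:
    return [2.0 * p for p in img]


def halve(img: Image) -> Image:
    return [0.5 * p for p in img]


def identity(img: Image) -> Image:
    return list(img)


x = [0.25, 0.5, 1.0]
y = [1.0, 2.0, 4.0]

perfect = cycle_consistency_loss(x, y, double, halve)    # round trips are exact
lossy = cycle_consistency_loss(x, y, double, identity)   # round trips drift
print(perfect, lossy)
```

When the two generators invert each other, the loss is zero; when they do not, the loss grows with the reconstruction error, which is the signal that pushes both generators toward translations that can be cycled back.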
Pokemon images appear to contain more detail than the cat images, so the model seems to need more training on them.