The goal of the assignment is to learn the distribution of the given images and generate new samples with a DCGAN. We then look at transferring images from one domain to another with CycleGAN, mapping between two different sets of cat images.
Since the dataset is small, the discriminator can easily overfit to it, so data augmentation techniques such as random crop and random horizontal flip were applied. The starter code provides a script to build on.
elif opts.data_aug == "deluxe":
    transform = transforms.Compose(
        [
            transforms.Resize([load_size, load_size], Image.BICUBIC),
            transforms.RandomResizedCrop(opts.image_size),
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
        ]
    )
The DCGAN has the following architecture.
The training loop is implemented as shown in the following pseudocode and follows the standard GAN recipe. The implementation uses torch.mean, torch.sum, and torch.square for the various mathematical terms.
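A minimal sketch of the per-batch loss terms is given below. It assumes the least-squares (LSGAN-style) formulation suggested by the use of torch.mean and torch.square; the exact reductions in the assignment code may differ.

```python
import torch

def d_loss_fn(D_real, D_fake):
    # discriminator: push scores on real images toward 1 and on fakes toward 0
    return 0.5 * torch.mean(torch.square(D_real - 1)) + 0.5 * torch.mean(torch.square(D_fake))

def g_loss_fn(D_fake):
    # generator: make the discriminator score fakes as real (close to 1)
    return torch.mean(torch.square(D_fake - 1))

# sanity check: a discriminator that is perfectly fooled in the generator's
# favor gives zero generator loss, and perfect discrimination gives zero D loss
d_loss = d_loss_fn(torch.ones(4), torch.zeros(4))
g_loss = g_loss_fn(torch.ones(4))
print(d_loss.item(), g_loss.item())
```

In each iteration the discriminator loss is backpropagated first on a batch of real and generated images, then the generator loss on freshly generated samples.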
The DCGAN can be trained with the command:
python vanilla_gan.py --num_epochs=100
The losses were visualized with TensorBoard; screenshots are shown below together with some sample results.
After 800 iterations -
After 8000 iterations -
CycleGAN takes an image from one domain and generates a corresponding image in the other. The architecture used is shown below -
An L2 loss was added as a cycle-consistency term to match the original X or Y image with its reconstruction, obtained by passing it through both generators sequentially, as shown in the equation below.
The lambda value in equations 7 and 8 above was set to 100. The cycle-consistency loss for Y->X->Y can be written as -
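The cycle-consistency term can be sketched in code as follows. The reduction (a mean over all elements) and the generator names G_XtoY and G_YtoX are assumptions of this sketch, not the assignment's exact implementation; identity functions stand in for the trained generators so the snippet is self-contained.

```python
import torch

def cycle_consistency_loss(real, reconstructed, lam=100.0):
    # L2 penalty between an original image and its reconstruction after a
    # round trip through both generators; lam is the lambda weight (100 here)
    return lam * torch.mean(torch.square(real - reconstructed))

# identity mappings standing in for the two trained generators (hypothetical)
G_XtoY = lambda x: x
G_YtoX = lambda y: y

real_Y = torch.ones(2, 3, 4, 4)
reconstructed_Y = G_XtoY(G_YtoX(real_Y))        # Y -> X -> Y round trip
loss = cycle_consistency_loss(real_Y, reconstructed_Y)
print(loss.item())
```

With identity generators the reconstruction is exact and the loss is zero; during training this term penalizes any information the round trip fails to preserve.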
Generated Images after 70k iterations -
After 100k iterations-
The X-to-Y direction was a bit harder for the generator, and the results were not completely accurate when generating the Russian Blue cats in the Y set.
Using cycle consistency results in a slight improvement, as it forces the generator to produce images that can be mapped back to the original dataset. It also constrains the output to the given domains, which can be both good and bad, since it limits generalizability; choosing a proper value of the cycle-consistency weight lambda is therefore crucial. In our case the results are somewhat better, which can be seen most clearly in the generated images of the Russian Blue cats: they are better with the cycle-consistency loss than without it. For the Grumpy cats too, cycle consistency leads to more consistent facial features around the eyes than training without it.