Assignment 5 GAN Photo Editing
Name : Divam Gupta
Andrew ID : divamg
Part 1: Inverting the Generator
Trying different latent spaces: z, w, w+
Latent z
Latent w
Latent w+
In a few examples, the reconstructions from latent z are very noisy. Overall, the latent z reconstructions are not as close to the input images as those from latent w and w+.
For most of the examples, w and w+ yield similar reconstructions. In the 2nd example, the w+ reconstruction has artifacts in the eyes. In the 3rd example, the wood is not reconstructed very well with latent w.
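To make the difference between the three latent spaces concrete, here is a minimal shape-level sketch. It assumes a StyleGAN-style generator; the mapping function, the latent width of 512, and the layer count of 14 are illustrative assumptions, not values taken from the assignment code.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM = 512   # assumed latent width
NUM_LAYERS = 14    # assumed number of synthesis layers


def toy_mapping(z, weight):
    """Hypothetical stand-in for a mapping network (z -> w)."""
    return np.tanh(z @ weight)


weight = rng.standard_normal((LATENT_DIM, LATENT_DIM)) / np.sqrt(LATENT_DIM)

z = rng.standard_normal((1, LATENT_DIM))               # z: sampled from N(0, I)
w = toy_mapping(z, weight)                             # w: one style vector shared by all layers
w_plus = np.repeat(w[:, None, :], NUM_LAYERS, axis=1)  # w+: one style vector per layer

print(z.shape, w.shape, w_plus.shape)
```

When optimizing in w+, each of the per-layer vectors is free to move independently. That gives w+ more degrees of freedom than w, which may be why it can fit an image closely but also produce local artifacts.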
Trying different losses
No perceptual loss
perceptual loss weight=1
Only perceptual loss at conv1
The perceptual loss does not have a significant effect on the reconstructions. Using only the perceptual loss at conv1 makes the reconstructions a bit blurrier. Small details such as the eyes are sometimes not reconstructed properly when the perceptual loss is used.
Hence, in my opinion, latent w with no perceptual loss yielded the best results, and it was used for the rest of the experiments.
The method takes about 30 seconds to execute, which is pretty reasonable.
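The inversion objective compared above (pixel loss plus a weighted perceptual loss) can be sketched on a toy problem. This is not the assignment's implementation: the generator and the feature extractor are replaced by hypothetical random linear maps `A` and `B`, and plain gradient descent stands in for whatever optimizer the real code uses.

```python
import numpy as np

rng = np.random.default_rng(0)
IMAGE_DIM, LATENT_DIM, FEATURE_DIM = 64, 16, 32  # toy sizes, not the real ones

# Hypothetical linear stand-ins: G(w) = A @ w, features(x) = B @ x.
A = rng.standard_normal((IMAGE_DIM, LATENT_DIM)) / np.sqrt(IMAGE_DIM)
B = rng.standard_normal((FEATURE_DIM, IMAGE_DIM)) / np.sqrt(IMAGE_DIM)


def reconstruction_loss(w, target, perc_weight):
    """Pixel L2 loss plus perc_weight times the L2 loss in feature space."""
    residual = A @ w - target
    pixel_term = np.sum(residual ** 2)
    perceptual_term = np.sum((B @ residual) ** 2)
    return pixel_term + perc_weight * perceptual_term


def invert(target, perc_weight=0.0, steps=500):
    """Find a latent w whose generated image matches the target."""
    # The loss is quadratic in w; the largest Hessian eigenvalue gives a safe step size.
    hessian = 2.0 * A.T @ (np.eye(IMAGE_DIM) + perc_weight * B.T @ B) @ A
    step_size = 1.0 / np.linalg.eigvalsh(hessian)[-1]
    w = np.zeros(LATENT_DIM)
    for _ in range(steps):
        residual = A @ w - target
        grad = 2.0 * A.T @ (residual + perc_weight * (B.T @ (B @ residual)))
        w = w - step_size * grad
    return w


target = A @ rng.standard_normal(LATENT_DIM)  # an image the toy generator can produce
for perc_weight in (0.0, 1.0):                # no perceptual loss vs. weight 1
    w_hat = invert(target, perc_weight)
    print(perc_weight, reconstruction_loss(w_hat, target, perc_weight))
```

Setting `perc_weight=0.0` corresponds to the "no perceptual loss" setting, and `perc_weight=1.0` to the weight of 1 tried above. With a real deep generator the loss is not quadratic, so the step size would need to be tuned or handled by an autograd optimizer.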
Part 2: Interpolate your Cats
Command :
python main.py --latent w --mode interpolate --input "data/cat/*.png"
Output gifs:
Example 1 :
Example 2 :
Using z latent instead of w latent
python main.py --latent z --mode interpolate --input "data/cat/*.png"
Latent w gives better interpolations than latent z. This is because the w latent encodes the image better, as the Part 1 reconstructions showed.
The interpolations are very smooth. We can see the structural movements of the cat in the interpolations: both eyes move smoothly, and the color also changes smoothly.
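The interpolation itself can be sketched as a linear blend between two inverted latent codes, with each blended code then passed through the generator to render one GIF frame. The function name and the frame count here are illustrative assumptions, not the interface of `main.py`.

```python
import numpy as np


def interpolate_latents(w_start, w_end, num_frames):
    """Return num_frames latent codes moving linearly from w_start to w_end."""
    alphas = np.linspace(0.0, 1.0, num_frames)
    return [(1.0 - alpha) * w_start + alpha * w_end for alpha in alphas]


# Two hypothetical inverted latents standing in for two cat images.
rng = np.random.default_rng(0)
w_a = rng.standard_normal(512)
w_b = rng.standard_normal(512)

frames = interpolate_latents(w_a, w_b, num_frames=30)
print(len(frames), frames[0].shape)
```

The first and last codes equal the two endpoints, so the GIF starts and ends on the two reconstructed images.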
Part 3: Scribble to Image
Command :
python main.py --latent w --mode draw --input ./cat_drawings/d2.png
Example 1 :
Example 2 :
Example 3 :
Example 4 :
The algorithm is able to generate reasonable-looking cats from the input drawings.
For the denser sketches, the outputs look much better.
For the sparse sketches, the model hallucinates more, as in example 4.
The model is able to map the correct skin color from the input sketches to the output.
In example 2, I purposely added a third eye on the forehead of the cat. The model still produces a realistic-looking cat, although the skin color is a bit lighter around the forehead.
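One common way to formulate the scribble constraint is a masked loss that compares the generated image to the sketch only at pixels where the user drew. The sketch below assumes the drawing is an RGBA array whose alpha channel marks the drawn pixels; that convention and the function names are assumptions, not details from the assignment code.

```python
import numpy as np


def sketch_mask(sketch_rgba):
    """1.0 where the user drew something (alpha > 0), 0.0 elsewhere."""
    return (sketch_rgba[..., 3:4] > 0).astype(np.float64)


def masked_sketch_loss(generated_rgb, sketch_rgba):
    """Squared color error, counted only at the drawn pixels."""
    mask = sketch_mask(sketch_rgba)
    diff = mask * (generated_rgb - sketch_rgba[..., :3])
    return float(np.sum(diff ** 2))


# Tiny 2x2 example: only the top-left pixel is drawn (opaque red).
sketch = np.zeros((2, 2, 4))
sketch[0, 0] = [1.0, 0.0, 0.0, 1.0]

matches_stroke = np.full((2, 2, 3), 0.5)
matches_stroke[0, 0] = [1.0, 0.0, 0.0]
print(masked_sketch_loss(matches_stroke, sketch))       # 0.0
print(masked_sketch_loss(np.zeros((2, 2, 3)), sketch))  # 1.0
```

Under this formulation, a sparse sketch constrains few pixels and leaves the rest to the generator. That is consistent with the observation that sparse drawings lead to more hallucinated content.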