16-726: Assignment #5 - GAN Photo Editing
Rawal Khirodkar
Part 1: Inverting the Generator.

The projection results are shown for this image.
DCGAN: The differences across loss weights are minimal. I found that an L2 loss weight of 1 with a content loss weight of 1 worked best across images.


Original Image

Iter: 250 Iter: 500 Iter: 750 Iter: 1000
L2 Loss = 1
Content Loss (conv4) = 0




L2 Loss = 1
Content Loss (conv4) = 0.1




L2 Loss = 1
Content Loss (conv4) = 1




L2 Loss = 1
Content Loss (conv4) = 10




StyleGAN: StyleGAN works much better than DCGAN and captures high-fidelity features such as the nose and whiskers. Again, the differences across loss weights are minimal. I found that an L2 loss weight of 1 with a content loss weight of 1 worked best across images.


Original Image

z, Iter: 250 z, Iter: 500 z, Iter: 750 z, Iter: 1000         w, Iter: 250 w, Iter: 500 w, Iter: 750 w, Iter: 1000         w+, Iter: 250 w+, Iter: 500 w+, Iter: 750 w+, Iter: 1000
L2 Loss = 1
Content Loss (conv4) = 0




       



       



L2 Loss = 1
Content Loss (conv4) = 0.1




       



       



L2 Loss = 1
Content Loss (conv4) = 1




       



       



L2 Loss = 1
Content Loss (conv4) = 10




       



       



Comments. StyleGAN works better than DCGAN. For StyleGAN, the w+ latent space produces the highest-quality results with a content loss weight of 1. Interestingly, even using only the L2 loss gives good results in my implementation. StyleGAN (200 secs) is 5 times slower than DCGAN (40 secs). The z space is not effective for StyleGAN, as is evident from the second row of the results: the generated cat looks very different from the original cat.
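
As a rough illustration of the objective compared in the tables above, the projection loss can be written as a weighted sum of a pixel-space L2 term and a feature-space content term. This is only a minimal sketch, not the code used for the results: images are flat lists of floats, and `toy_features` is a hypothetical stand-in for the conv4 activations of a pretrained network.

```python
def mse(a, b):
    """Mean squared error between two equal-length flat sequences."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def projection_loss(generated, target, features, l2_weight=1.0, content_weight=1.0):
    """Weighted sum of the pixel L2 loss and the content (feature) loss.

    `features` is an assumed callable that maps a flat image to a flat
    feature list; in the report this role is played by conv4 features.
    """
    pixel_term = mse(generated, target)
    content_term = mse(features(generated), features(target))
    return l2_weight * pixel_term + content_weight * content_term


def toy_features(image):
    """Hypothetical feature extractor: averages of neighbouring pixels."""
    return [(image[i] + image[i + 1]) / 2 for i in range(len(image) - 1)]


target = [0.0, 1.0, 0.0, 1.0]
generated = [0.0, 0.0, 0.0, 0.0]

# Evaluate the same pair under the four content loss weights from the tables.
for content_weight in (0, 0.1, 1, 10):
    loss = projection_loss(generated, target, toy_features,
                           l2_weight=1.0, content_weight=content_weight)
    print(f"content weight {content_weight}: loss = {loss:.3f}")
```

In an actual projection run, `generated` would be the generator output for the current latent code, and the latent code would be updated by gradient descent on this loss.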

Part 2: Interpolate your Cats. I used StyleGAN in the w+ space with a content loss weight of 1 to generate the following interpolation results.


Start
                         
Interpolation
                         
End

Start
                         
Interpolation
                         
End

Start
                         
Interpolation
                         
End
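
The interpolation step in Part 2 can be sketched as a linear blend between the two projected latent codes, with each blended code then passed through the generator to render a frame. The sketch below assumes plain linear interpolation and represents latent codes as flat lists of floats; the function name `lerp_latents` and the number of steps are illustrative choices, not taken from the original implementation.

```python
def lerp_latents(start, end, num_steps):
    """Return `num_steps` latent codes blending linearly from start to end."""
    if len(start) != len(end):
        raise ValueError("latent codes must have the same length")
    if num_steps < 2:
        raise ValueError("need at least two steps to include both endpoints")
    frames = []
    for step in range(num_steps):
        t = step / (num_steps - 1)  # t runs from 0.0 (start) to 1.0 (end)
        frames.append([(1 - t) * s + t * e for s, e in zip(start, end)])
    return frames


# Toy example with two-dimensional latent codes.
frames = lerp_latents([0.0, 2.0], [1.0, 4.0], num_steps=3)
print(frames)  # [[0.0, 2.0], [0.5, 3.0], [1.0, 4.0]]
```

The first and last frames reproduce the two projected codes exactly, so the sequence begins and ends on the reconstructions of the start and end images.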

Part 3: Scribble to Image.

First, I tested how DCGAN performs on this task with the optimal hyperparameters. As can be seen, DCGAN is not able to capture high-fidelity details of the cat. Dense sketches are also challenging.

Sketch
                         
Mask
                         
DCGAN Output

I therefore switched to StyleGAN. I sampled 1000 different latent vectors and picked the one with the minimum loss against the sketch image as the initial guess. This initial guess acts as an anchor to regularize the latent vector during optimization. I used a masked L2 loss, a content loss (conv4), and the distance from the anchor as my losses. The weights that worked for me after a grid search were [l2 loss: 10, content loss: 0.1, style loss: 0, distance from anchor: 10]. The regularization is absolutely critical for generating realistic cats; otherwise we overfit to the sketch. Here are some results on the provided sketches:

Sketch
                         
Mask
                         
StyleGAN Output

Sketch
                         
Mask
                         
StyleGAN Output

Sketch
                         
Mask
                         
StyleGAN Output

Sketch
                         
Mask
                         
StyleGAN Output

Sketch
                         
Mask
                         
StyleGAN Output
Results on some custom sketches:

Sketch
                         
Mask
                         
StyleGAN Output

Sketch
                         
Mask
                         
StyleGAN Output

Sketch
                         
Mask
                         
StyleGAN Output
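
The anchor initialization and regularized objective described in Part 3 could be sketched roughly as follows. This is a simplified illustration under several assumptions: latent codes and images are flat lists of floats, `generator` is any callable from latent code to image, the distance from the anchor is taken to be a mean squared distance, and the content loss term is left out for brevity. All function names are hypothetical.

```python
import random


def masked_l2(generated, sketch, mask):
    """Mean squared error over the pixels where the mask is 1."""
    active = sum(mask)
    if active == 0:
        return 0.0
    total = sum(m * (g - s) ** 2 for g, s, m in zip(generated, sketch, mask))
    return total / active


def pick_anchor(generator, sketch, mask, latent_dim, num_samples=1000, seed=0):
    """Sample random latent codes and keep the one with the lowest masked loss."""
    rng = random.Random(seed)
    best_latent, best_loss = None, float("inf")
    for _ in range(num_samples):
        latent = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        loss = masked_l2(generator(latent), sketch, mask)
        if loss < best_loss:
            best_latent, best_loss = latent, loss
    return best_latent, best_loss


def regularized_loss(latent, anchor, generator, sketch, mask,
                     l2_weight=10.0, anchor_weight=10.0):
    """Masked L2 term plus a penalty for drifting away from the anchor.

    The squared-distance form of the penalty is an assumption; the content
    loss term (weight 0.1 in the report) is omitted from this sketch.
    """
    data_term = masked_l2(generator(latent), sketch, mask)
    anchor_term = sum((z - a) ** 2 for z, a in zip(latent, anchor)) / len(latent)
    return l2_weight * data_term + anchor_weight * anchor_term


def toy_generator(latent):
    """Hypothetical generator: scales each latent entry by 2."""
    return [2.0 * v for v in latent]


sketch = [1.0, -1.0, 0.0]
mask = [1, 1, 0]  # the third pixel carries no scribble and is ignored
anchor, anchor_loss = pick_anchor(toy_generator, sketch, mask, latent_dim=3)
start_loss = regularized_loss(anchor, anchor, toy_generator, sketch, mask)
```

At the anchor itself the penalty term is zero, so optimization starts from the best sampled code and is penalized for moving away from it, which is what discourages overfitting to the sketch.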

Part 4: Bells and Whistles: HighResGAN

Here are the results on the high resolution cats!

Initial Latent Vector Image
                         
Target
                         
Output

Initial Latent Vector Image
                         
Target
                         
Output

Start
                         
Interpolation
                         
End

Start
                         
Interpolation
                         
End