Overview
In this assignment, we implement several techniques for manipulating images on the manifold of natural images. First, we invert a pre-trained generator to find a latent variable that closely reconstructs a given real image. In the second part of the assignment, we take a hand-drawn sketch and generate an image that fits the sketch. Finally, we also add a style loss as a texture constraint.
Part 1: Inverting the Generator
For the first part of the assignment, we solve an optimization problem to find a latent code whose generated image reconstructs a given real image. Some example outputs using different generative models, latent spaces, and combinations of losses are shown below.
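The projection step described above can be sketched as a small PyTorch optimization loop. This is an illustrative sketch, not the exact assignment code: the `generator` and optional `feat_extractor` arguments stand in for whatever pre-trained networks are used, and the hyperparameters are placeholders.

```python
import torch
import torch.nn.functional as F

def invert(generator, target, latent_dim=64, steps=500, lr=0.1,
           perc_weight=0.1, feat_extractor=None):
    """Optimize a latent code z so that generator(z) reconstructs `target`.

    generator      : maps a (1, latent_dim) code to an image tensor
    target         : the real image to reconstruct
    feat_extractor : optional feature network for a perceptual term
    """
    z = torch.randn(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        recon = generator(z)
        loss = F.mse_loss(recon, target)          # pixel-wise L2 term
        if feat_extractor is not None:            # optional perceptual term
            loss = loss + perc_weight * F.mse_loss(feat_extractor(recon),
                                                   feat_extractor(target))
        loss.backward()
        opt.step()
    return z.detach(), loss.item()
```

The same loop works for the z, w, or w+ spaces: only the shape of the optimized variable and the entry point into the generator change.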
VanillaGAN
Using z latent space:
We can see from the results that the reconstructed images look nice and similar to the original image. The one with perceptual loss does better on details like the eye area. However, both reconstructed images are a little blurry, which might be because a stronger generator could be chosen.
StyleGAN
Using z latent space:
Using w latent space:
Using w+ latent space:
We can see from the results above that all the reconstructed images look nice and similar to the original image. There are some minor differences across latent spaces and losses. For each latent space, the ones with perceptual loss do better on details like the jaw or eye area, but the effect is very subtle. Among the three latent spaces, the w and w+ spaces have smoother colorization than the z space. Compared to the earlier VanillaGAN results, the results here have much higher image quality.
Part 2: Interpolate your Cats
Now that we have a technique for inverting the cat images, we can do arithmetic with the latent vectors we have just found. One simple example is interpolating between two images via a convex combination of their inverted latent codes.
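The convex combination above is a one-liner per step; a minimal sketch (the function name and step count are illustrative) looks like this:

```python
import torch

def interpolate_latents(z1, z2, num_steps=8):
    """Convex combinations (1 - t) * z1 + t * z2 for t evenly spaced in [0, 1].

    Feeding each combination through the generator yields the
    intermediate frames of the interpolation.
    """
    ts = torch.linspace(0.0, 1.0, num_steps)
    return torch.stack([(1 - t) * z1 + t * z2 for t in ts])
```

The endpoints of the returned path are exactly the two inverted codes, so the first and last generated frames match the two reconstructions.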
Some interpolation results between grumpy cats
Since reconstruction is better with the perceptual loss, we keep this loss in the following experiments.
In the interpolation visualizations, the facial details and head orientation change smoothly across all results, while the coat color (from white to brown here) changes in a coarser and more abrupt way.
Part 3: Scribble to Image
Next, we would like to constrain our image in some way while keeping it realistic. We first develop this method in general and then discuss color scribble constraints in particular. We use the w latent space with an L2 loss and a perceptual loss, both restricted by a mask.
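The masked objective can be sketched as follows. This is a simplified illustration, assuming the mask is 1 where the user scribbled and 0 elsewhere; the `feat_extractor` argument and the weight are placeholders rather than the exact assignment setup.

```python
import torch
import torch.nn.functional as F

def masked_loss(generated, scribble, mask, feat_extractor=None, perc_weight=0.1):
    """L2 (and optionally perceptual) loss restricted to scribbled pixels.

    Pixels outside the mask contribute nothing, so the generator is free
    to fill them in with whatever looks most realistic.
    """
    loss = F.mse_loss(generated * mask, scribble * mask)
    if feat_extractor is not None:
        loss = loss + perc_weight * F.mse_loss(feat_extractor(generated * mask),
                                               feat_extractor(scribble * mask))
    return loss
```

Plugging this loss into the same optimization loop as in Part 1 turns inversion into scribble-constrained synthesis.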
Some scribble to image results
We can see that some generated images are very realistic (1-1, 3-1, 3-2), while others are more blurry or strange. We find that sparser sketches tend to perform better than denser ones, because they impose fewer color constraints on the generated images. Some generated images are blurry (1-2, 2-2, 2-3), which might need more optimization iterations. One option here is to add regularization to the latent code so that it stays close to the original distribution.
Bells & Whistles
I tried adding a style loss as an additional texture constraint.
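A standard way to implement such a texture constraint is the Gram-matrix style loss of Gatys et al.; a minimal sketch (assuming feature maps are already extracted from some network) is:

```python
import torch

def gram_matrix(features):
    """Gram matrix of an (N, C, H, W) feature map, normalized by C*H*W.

    Correlations between channels capture texture while discarding
    spatial layout, since spatial positions are summed out.
    """
    n, c, h, w = features.shape
    f = features.view(n, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def style_loss(gen_feats, target_feats):
    """Mean squared difference of Gram matrices across feature layers."""
    return sum(torch.mean((gram_matrix(g) - gram_matrix(t)) ** 2)
               for g, t in zip(gen_feats, target_feats))
```

Because the Gram matrix sums over spatial positions, it is invariant to where a texture appears in the image, which is exactly why it acts as a texture rather than a layout constraint.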
We can see that the result with style loss does better on texture details and overall colorization, making it more realistic.