16-726 Learning-Based Image Synthesis

Assignment 4: Neural Style Transfer

Trung Nguyen

Content Reconstruction

Optimizing content loss at different layers

Effect of optimizing the content loss at different layers. The reconstructions get worse as the content loss layer is placed deeper in the network.

[Figure panels: Original, Conv_1, Conv_2, Conv_3, Conv_4, Conv_5]
Favorite

From the previous test, we can see that placing the loss at convolution layer 1, 2, or 3 gives better reconstructions than placing it at convolution layer 4 or 5, so I picked convolution layer 1 as the best layer for the content loss. The images below are reconstructed from two different random noise inputs with the content loss at convolution layer 1. Both results are mostly similar to the original despite the different noise inputs.

[Figure panels: Original, Random Noise 1, Random Noise 2]
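The content loss optimized in this section can be sketched as follows. This is a minimal, framework-free illustration that assumes the loss is the usual mean squared error between the feature maps of the generated image and the content image at the chosen layer; the report does not state the exact formula, and the toy feature values are hypothetical.

```python
def content_loss(generated, target):
    """Mean squared error between two flattened feature maps of equal size.

    Assumption: the content loss is plain MSE on layer activations.
    """
    if len(generated) != len(target):
        raise ValueError("feature maps must have the same number of elements")
    squared_diffs = ((g - t) ** 2 for g, t in zip(generated, target))
    return sum(squared_diffs) / len(generated)


# Hypothetical toy activations standing in for the Conv_1 feature maps.
target_features = [0.0, 1.0, 2.0, 3.0]
generated_features = [0.0, 1.0, 2.0, 5.0]

loss = content_loss(generated_features, target_features)
print(loss)
```

In an actual reconstruction run, the generated image starts as random noise and its pixels are updated to drive this value toward zero.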

Texture Synthesis

Optimizing texture loss at different layers

Effect of optimizing the texture loss at different layers. The placement of the style loss layer has a stronger effect than the placement of the content loss layer. The following images show experiments with different selections of layers. The general pattern seems to be that the deeper layers capture finer style details such as brush strokes, while the shallow layers capture the colors.

[Figure panels: Original, Conv_1, Conv_2, Conv_3, Conv_4, Conv_5, Conv_1+Conv_2, Conv_1+Conv_2+Conv_3, Conv_1+Conv_2+Conv_3+Conv_4, Conv_1+Conv_2+Conv_3+Conv_4+Conv_5]
Favorite

After the previous tests, we see that different layers capture different aspects of the style image, so it seems most beneficial to place style loss layers at all five convolution layers, 1 through 5. With that selection, the images below are generated from two different random noise inputs. Even though the pictures are different, the styles are mostly the same.

[Figure panels: Original, Random Noise 1, Random Noise 2]
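One way to see why two different noise inputs give different pictures with the same style is to look at a Gram-matrix style loss. The sketch below assumes the texture loss compares Gram matrices of the feature maps, as is common in neural style transfer; the normalization by channels times positions and the toy features are assumptions, not details taken from this report.

```python
def gram_matrix(features):
    """Gram matrix of a feature map.

    features: list of C channels, each a flat list of H*W activations.
    Assumption: entries are normalized by C * H * W.
    """
    n_channels = len(features)
    n_positions = len(features[0])
    norm = n_channels * n_positions
    return [
        [sum(a * b for a, b in zip(features[i], features[j])) / norm
         for j in range(n_channels)]
        for i in range(n_channels)
    ]


def style_loss(generated, style):
    """Mean squared difference between the two Gram matrices."""
    g_gen = gram_matrix(generated)
    g_sty = gram_matrix(style)
    n = len(g_gen)
    total = sum((g_gen[i][j] - g_sty[i][j]) ** 2
                for i in range(n) for j in range(n))
    return total / (n * n)


# Hypothetical 2-channel feature map with 2 spatial positions.
style_features = [[1.0, 0.0], [0.0, 1.0]]
# The same activations with the two spatial positions swapped.
shuffled_features = [[0.0, 1.0], [1.0, 0.0]]

print(gram_matrix(style_features))
print(style_loss(shuffled_features, style_features))
```

The Gram matrix sums over spatial positions, so rearranging where activations occur leaves the loss unchanged. That is consistent with the observation that the two results differ in layout but share the same style.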

Style Transfer

Hyperparameter tuning

After experimenting with different parameters and selections of layers, I am most satisfied with convolution layer 1 for the content loss layer and convolution layers 1 to 5 for the style loss layers. The weight for the style loss is 10^6, and the weight for the content loss is 1. The results on two images with two different styles are shown below.
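The way these two weights combine into a single objective can be sketched as follows. The weights are the ones stated above; the function name and the per-layer loss values are hypothetical placeholders, and summing the per-layer losses before weighting is an assumption about how the layers are combined.

```python
STYLE_WEIGHT = 1e6    # style loss weight from the report
CONTENT_WEIGHT = 1.0  # content loss weight from the report


def total_loss(content_losses, style_losses,
               content_weight=CONTENT_WEIGHT, style_weight=STYLE_WEIGHT):
    """Weighted sum of per-layer content and style losses.

    Assumption: per-layer losses are summed, then scaled by their weights.
    """
    return (content_weight * sum(content_losses)
            + style_weight * sum(style_losses))


# Hypothetical per-layer values: one content layer (Conv_1),
# five style layers (Conv_1 through Conv_5).
content_losses = [0.5]
style_losses = [1e-6, 2e-6, 3e-6, 2e-6, 2e-6]

print(total_loss(content_losses, style_losses))
```

With values on this scale, the style term ends up comparable to or larger than the content term despite the raw style losses being tiny, which is presumably why such a large style weight is needed.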

Results
Performance comparison

A comparison of quality and running time between a random noise input and a content image input is shown below. There is not much difference in running time between the two inputs. However, the random noise input does not preserve the content as well as the content image input does. As a result, we will only use the content image as the input for style transfer to get better results.

[Table: the Content Image, Style Image, and Result columns contain images]

Input            Running Time
Random Noise     47.65 seconds
Content Image    49.13 seconds

Favorite Images

The following are the style transfers on some of my favorite images with the two style images that I like the most.

Grumpy cats with styles

The following are the results of applying the styles to our grumpy cats from the previous homework.