16-726 Learning-Based Image Synthesis
Assignment 4: Neural Style Transfer
Trung Nguyen
Content Reconstruction
Optimizing content loss at different layers
Effect of optimizing content loss at different layers. We can see that the reconstructions get worse as the content loss layer is placed deeper in the network.
Original
Conv_1
Conv_2
Conv_3
Conv_4
Conv_5
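As a minimal sketch of what is being optimized here, assuming the content loss is the mean squared error between the feature maps of the generated image and those of the content image at the chosen layer (the usual formulation, not confirmed by this report): the function name `content_loss` and the random arrays standing in for network activations are hypothetical.

```python
import numpy as np

def content_loss(generated_features, target_features):
    """Mean squared error between two feature maps of identical shape."""
    generated = np.asarray(generated_features, dtype=np.float64)
    target = np.asarray(target_features, dtype=np.float64)
    if generated.shape != target.shape:
        raise ValueError("feature maps must have the same shape")
    return float(np.mean((generated - target) ** 2))

# Hypothetical feature maps with shape (channels, height, width);
# in practice these would come from the chosen convolution layer.
rng = np.random.default_rng(0)
target = rng.standard_normal((4, 8, 8))
noise_start = rng.standard_normal((4, 8, 8))

loss_identical = content_loss(target, target)      # zero when features match
loss_noise = content_loss(noise_start, target)     # positive for a noise start
```

Under this assumption, reconstruction amounts to driving the loss toward zero by updating the input image, and a shallow layer constrains the pixels more directly than a deep one.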
Favorite
From the previous test, we can see that placing the loss at convolution layer 1, 2, or 3 gives better reconstructions than placing it at convolution layer 4 or 5, so I picked convolution layer 1 as the best layer for the content loss. The images below are reconstructed from two different random noise inputs with the content loss at convolution layer 1. The results are mostly similar to the original even with the two different noise inputs.
Original
Random Noise 1
Random Noise 2
Texture Synthesis
Optimizing texture loss at different layers
Effect of optimizing texture loss at different layers. The choice of layers for the style loss has a larger effect than it does for the content loss. The following images show several experiments with different selections of layers. The general pattern seems to be that the deeper layers capture finer style details such as brush strokes, while the shallow layers capture the colors.
Original
Conv_1
Conv_2
Conv_3
Conv_4
Conv_5
Conv_1+Conv_2
Conv_1+Conv_2+Conv_3
Conv_1+Conv_2+Conv_3+Conv_4
Conv_1+Conv_2+Conv_3+Conv_4+Conv_5
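The layer combinations above can be sketched as a sum of per-layer texture losses. This is a minimal sketch that assumes the texture (style) loss compares Gram matrices of feature maps, as in the standard formulation; the report does not state the exact loss or normalization, and the names `gram_matrix` and `style_loss` and the random feature maps are hypothetical.

```python
import numpy as np

def gram_matrix(features):
    """Channel-by-channel correlations of a (channels, height, width) feature map."""
    feats = np.asarray(features, dtype=np.float64)
    channels, height, width = feats.shape
    flat = feats.reshape(channels, height * width)
    # Normalize by the element count so layers of different sizes are comparable
    # (an assumed normalization).
    return flat @ flat.T / (channels * height * width)

def style_loss(generated_by_layer, style_by_layer):
    """Sum over the selected layers of the MSE between Gram matrices."""
    total = 0.0
    for generated, style in zip(generated_by_layer, style_by_layer):
        diff = gram_matrix(generated) - gram_matrix(style)
        total += float(np.mean(diff ** 2))
    return total

# Hypothetical activations from two selected layers of the style image.
rng = np.random.default_rng(0)
style_feats = [rng.standard_normal((3, 8, 8)), rng.standard_normal((6, 4, 4))]

# Shuffling spatial positions changes the layout but not the Gram matrix.
shuffled_feats = []
for feats in style_feats:
    flat = feats.reshape(feats.shape[0], -1)
    order = rng.permutation(flat.shape[1])
    shuffled_feats.append(flat[:, order].reshape(feats.shape))

loss_shuffled = style_loss(shuffled_feats, style_feats)
loss_other = style_loss([rng.standard_normal(f.shape) for f in style_feats], style_feats)
```

Because the Gram matrix discards spatial positions, `loss_shuffled` is zero up to floating-point error while `loss_other` is not, which is consistent with the loss capturing texture rather than image layout.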
Favorite
After the previous tests, we see that different layers capture different aspects of the style image, so it seems more beneficial to place style losses at all five convolution layers, 1 through 5. With that selection, the images below are generated from two different random noise inputs. Even though the pictures are different, the styles are mostly the same.
Original
Random Noise 1
Random Noise 2
Style Transfer
Hyper-parameter tuning
After experimenting with different parameters and selections of layers, I am most satisfied with convolution layer 1 for the content loss and convolution layers 1 to 5 for the style loss. The weight for the style loss is 10^6, and the weight for the content loss is 1. The results on two images with two different styles are shown below.
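The weighting described above can be sketched as follows. This assumes the objective is a weighted sum of the per-layer content and style losses, which is the usual formulation; the function name `total_loss` and the example loss values are hypothetical, and only the two weights come from this report.

```python
STYLE_WEIGHT = 1e6    # style loss weight reported above
CONTENT_WEIGHT = 1.0  # content loss weight reported above

def total_loss(content_losses, style_losses,
               content_weight=CONTENT_WEIGHT, style_weight=STYLE_WEIGHT):
    """Weighted sum of the per-layer content and style losses."""
    return content_weight * sum(content_losses) + style_weight * sum(style_losses)

# Hypothetical values: one content layer and five style layers.
example_content = [0.5]
example_style = [1e-6, 2e-6, 1e-6, 3e-6, 1e-6]
combined = total_loss(example_content, example_style)
```

If the Gram matrices are normalized by the number of feature elements, the raw style losses tend to be very small, which would be one reason a large style weight is needed for the two terms to have comparable magnitude.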
Results
Performance comparison
A comparison of quality and running time between a random noise input and a content image input is shown below. There is not much difference in running time between the two inputs. However, the random noise input does not preserve the content as well as the content image input does. As a result, we will only use the content image as the input for style transfer to get better results.
Columns: Content Image, Style Image, Input, Result, Running Time
Input: Random Noise, Running Time: 47.65 seconds
Input: Content Image, Running Time: 49.13 seconds
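The two kinds of input compared above can be sketched as follows. This is a minimal illustration, assuming images are arrays with values in [0, 1) and that the noise is uniform; the report does not specify the noise distribution, and the helper name `make_input` is hypothetical.

```python
import numpy as np

def make_input(content_image, mode, seed=0):
    """Return the starting image for optimization.

    mode="content" starts from a copy of the content image;
    mode="noise" starts from uniform noise of the same shape (assumed distribution).
    """
    content = np.asarray(content_image, dtype=np.float64)
    if mode == "content":
        return content.copy()
    if mode == "noise":
        rng = np.random.default_rng(seed)
        return rng.uniform(0.0, 1.0, size=content.shape)
    raise ValueError("mode must be 'content' or 'noise'")

# Hypothetical 3-channel image filled with mid-gray.
content = np.full((3, 4, 4), 0.5)
from_content = make_input(content, "content")
noise_a = make_input(content, "noise", seed=1)
noise_b = make_input(content, "noise", seed=2)
```

Starting from a copy of the content image means the optimizer begins with the content already in place, which fits the observation that this input preserves content better at a similar running time.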
Favorite Images
The following are style transfers on some of my favorite images, using the two style images that I like the most.
Grumpy cats with styles
The following are the results applied to our grumpy cats from the previous homework.