Style Transfer

16-726 Learning-Based Image Synthesis

Assignment 4: Neural Style Transfer

Trung Nguyen

Content Reconstruction

Optimizing content loss at different layers

Effect of optimizing the content loss at different layers. The reconstruction quality appears to degrade as the content loss is placed deeper in the network.

Original

Conv_1

Conv_2

Conv_3

Conv_4

Conv_5
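As a rough illustration of what is being optimized in this experiment, the sketch below computes a content loss between two feature maps. It assumes the content loss is the mean squared error between the activations of the generated image and those of the content image at one chosen layer; the report does not state the exact formula, and the function name `content_loss` and the sample values are made up for the example.

```python
def content_loss(generated_features, target_features):
    """Mean squared error between two flattened feature maps.

    Assumption: the content loss is the MSE between activations of the
    generated image and the content image at one chosen layer.
    """
    if len(generated_features) != len(target_features):
        raise ValueError("feature maps must have the same number of elements")
    squared_diffs = [
        (g - t) ** 2 for g, t in zip(generated_features, target_features)
    ]
    return sum(squared_diffs) / len(squared_diffs)


# Made-up activations standing in for the output of one layer.
target = [0.0, 1.0, 2.0, 3.0]
generated = [0.0, 1.0, 2.0, 5.0]

print(content_loss(generated, target))  # one element is off by 2 -> 4 / 4 = 1.0
print(content_loss(target, target))     # identical features -> 0.0
```

Moving the loss to a different layer only changes which activations are passed in; the comparison itself stays the same.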

Favorite

In the previous test, placing the content loss at convolution layer 1, 2, or 3 gives better reconstructions than placing it at convolution layer 4 or 5, so I picked convolution layer 1 as the best layer for the content loss. The images below are reconstructed from two different random noise inputs with the content loss at convolution layer 1. Both results are very close to the original despite the different noise inputs.

Original

Random Noise 1

Random Noise 2
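The reconstruction procedure can be pictured as gradient descent on the pixels of a noise image. The following is a toy sketch only: it assumes the "features" are the pixels themselves (a stand-in for a very shallow layer) and uses plain gradient descent, which may differ from the optimizer actually used in the assignment. The names `reconstruct` and `random_noise` and the 4-pixel target are hypothetical.

```python
import random


def reconstruct(target, init, steps=100, lr=1.0):
    """Gradient descent on the pixels of `init` to match `target`.

    Toy assumption: the "features" are the pixels themselves, so the loss is
    mean((x - target)^2) and its gradient w.r.t. x_i is 2 * (x_i - t_i) / n.
    """
    x = list(init)
    n = len(target)
    for _ in range(steps):
        x = [xi - lr * 2.0 * (xi - ti) / n for xi, ti in zip(x, target)]
    return x


def random_noise(size, seed):
    """Uniform noise in [0, 1), seeded so each start is reproducible."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(size)]


target = [0.2, 0.4, 0.6, 0.8]  # hypothetical 4-pixel "content image"
result_1 = reconstruct(target, random_noise(len(target), seed=1))
result_2 = reconstruct(target, random_noise(len(target), seed=2))

# Both noise starts end up at essentially the same image.
max_gap = max(abs(a - b) for a, b in zip(result_1, result_2))
print(max_gap < 1e-6)
```

In this toy setting the target is recovered almost exactly from either start. With real network features the match is only approximate, which is in line with the small differences between the two reconstructions above.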

Texture Synthesis

Optimizing texture loss at different layers

Effect of optimizing the texture loss at different layers. The choice of layer has a larger effect for the style loss than for the content loss. The following images show experiments with different selections of layers. The general pattern seems to be that the deeper layers capture finer style details such as brush strokes, while the shallow layers capture the colors.

Original

Conv_1

Conv_2

Conv_3

Conv_4

Conv_5

Conv_1+Conv_2

Conv_1+Conv_2+Conv_3

Conv_1+Conv_2+Conv_3+Conv_4

Conv_1+Conv_2+Conv_3+Conv_4+Conv_5
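For readers unfamiliar with how a texture loss is usually defined, here is a minimal sketch based on Gram matrices, the common choice in neural style transfer. The report does not spell out its formula, so the normalization by (channels x positions), the use of a mean squared error between Gram matrices, and the function names are all assumptions made for illustration.

```python
def gram_matrix(features):
    """Channel-by-channel inner products of a feature map.

    `features` is a list of C channels, each a flat list of N activations.
    Assumption: the result is normalized by C * N.
    """
    num_channels = len(features)
    num_positions = len(features[0])
    scale = num_channels * num_positions
    return [
        [
            sum(a * b for a, b in zip(features[i], features[j])) / scale
            for j in range(num_channels)
        ]
        for i in range(num_channels)
    ]


def style_loss(generated_features, style_features):
    """Mean squared error between the two Gram matrices."""
    gram_generated = gram_matrix(generated_features)
    gram_style = gram_matrix(style_features)
    squared_diffs = [
        (g - s) ** 2
        for row_g, row_s in zip(gram_generated, gram_style)
        for g, s in zip(row_g, row_s)
    ]
    return sum(squared_diffs) / len(squared_diffs)


# Made-up feature map with 2 channels and 4 spatial positions.
style = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
# Same activations with the spatial positions reversed.
shuffled = [[4.0, 3.0, 2.0, 1.0], [1.0, 0.0, 1.0, 0.0]]

print(gram_matrix(style))           # [[3.75, 0.75], [0.75, 0.25]]
print(style_loss(shuffled, style))  # 0.0: the Gram matrix ignores layout
```

The second print shows why this kind of loss captures texture rather than content: rearranging where the activations occur leaves the Gram matrix unchanged. When several layers are combined, as in the experiments above, one such loss per layer would simply be summed.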

Favorite

The previous tests show that different layers capture different aspects of the style image, so it seems beneficial to place style losses at all five convolution layers, 1 to 5. With that selection, the images below are generated from two different random noise inputs. Even though the pictures differ, their styles are mostly the same.

Original

Random Noise 1

Random Noise 2

Hyper-parameter tuning

After experimenting with different parameters and layer selections, I am most satisfied with placing the content loss at convolution layer 1 and the style losses at convolution layers 1 to 5. The weight for the style loss is 10^6, and the weight for the content loss is 1. The results on two images with two different styles are shown below.
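The way these weights combine the individual losses can be sketched as a weighted sum. This assumes the total objective is the content weight times the summed content losses plus the style weight times the summed style losses, which is the usual formulation; the function name `total_loss` and the sample loss values are hypothetical.

```python
STYLE_WEIGHT = 1e6    # style loss weight reported above
CONTENT_WEIGHT = 1.0  # content loss weight reported above


def total_loss(content_losses, style_losses,
               content_weight=CONTENT_WEIGHT, style_weight=STYLE_WEIGHT):
    """Weighted sum of per-layer content and style losses (assumed form)."""
    return (content_weight * sum(content_losses)
            + style_weight * sum(style_losses))


# Made-up values: one content loss (conv 1) and five style losses (conv 1-5).
content_losses = [0.5]
style_losses = [1e-6, 1e-6, 1e-6, 1e-6, 1e-6]

loss = total_loss(content_losses, style_losses)
# The style term (about 5) outweighs the content term (0.5) even though
# each raw style loss is far smaller than the content loss.
print(loss > 5.0)
```

With these sample numbers, the large style weight is what lets very small raw style losses still influence the result.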

Results

Performance comparison

A comparison of quality and running time between a random noise input and a content image input is shown below. There is not much difference in terms of performance between the two inputs. However, the random noise input cannot reproduce the content as faithfully as starting from the original content image. As a result, we will only use the content image as the input for style transfer to get better results.