16726 HW4 Neural Style Transfer

In [1]:
from PIL import Image
import matplotlib.pyplot as plt

1. Content Reconstruction

Q1.1 Effect of optimizing different layers

I optimized the content loss at each layer from conv1 to conv5. The deeper the layer, the more fine detail the reconstructed image loses and the more noise it contains. The visualizations are shown below.

In [10]:
im1 = Image.open('results/conv1_content.jpg')
im2 = Image.open('results/conv2_content.jpg')
im3 = Image.open('results/conv3_content.jpg')
im4 = Image.open('results/conv4_content.jpg')
im5 = Image.open('results/conv5_content.jpg')

plt.figure(figsize=(20,10))
plt.axis('off')
plt.subplot(2,3,1)
plt.title('conv1 content')
plt.imshow(im1)
plt.subplot(2,3,2)
plt.title('conv2 content')
plt.imshow(im2)
plt.subplot(2,3,3)
plt.title('conv3 content')
plt.imshow(im3)
plt.subplot(2,3,4)
plt.title('conv4 content')
plt.imshow(im4)
plt.subplot(2,3,5)
plt.title('conv5 content')
plt.imshow(im5)
Out[10]:
<matplotlib.image.AxesImage at 0x15222ae71f0>
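
The content loss behind the Q1.1 reconstructions can be sketched as a mean squared error between the feature maps of the optimized image and those of the content image at the chosen layer. The snippet below is a minimal NumPy illustration under stated assumptions, not the assignment code: `content_loss` is a hypothetical helper name, and the random arrays only stand in for VGG activations.

```python
import numpy as np

def content_loss(input_feat, target_feat):
    """Mean squared error between two feature maps of identical shape."""
    input_feat = np.asarray(input_feat, dtype=np.float64)
    target_feat = np.asarray(target_feat, dtype=np.float64)
    if input_feat.shape != target_feat.shape:
        raise ValueError("feature maps must have the same shape")
    return float(np.mean((input_feat - target_feat) ** 2))

# Stand-in activations with shape (channels, height, width).
rng = np.random.default_rng(0)
target = rng.standard_normal((8, 4, 4))
noisy = target + 0.1 * rng.standard_normal((8, 4, 4))

loss_same = content_loss(target, target)   # identical features give zero loss
loss_noisy = content_loss(noisy, target)   # perturbed features give a positive loss
```

One plausible reason deeper layers lose detail is that their features have passed through more pooling, so many different images map to nearly the same activations and the loss constrains fine detail less tightly.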

Q1.2 Two random noise inputs

The outputs for two random noise inputs are shown below. I optimized the content loss at the conv4 layer. The two optimization results look very similar to each other, but neither is as sharp as the original content image.

In [17]:
im1 = Image.open('results/conv4_content_input1.jpg')
im2 = Image.open('results/conv4_content_input2.jpg')
im3 = Image.open('results/content_img.jpg')


plt.figure(figsize=(20,10))
plt.axis('off')
plt.subplot(1,3,1)
plt.title('conv4 content input1')
plt.imshow(im1)
plt.subplot(1,3,2)
plt.title('conv4 content input2')
plt.imshow(im2)
plt.subplot(1,3,3)
plt.title('original content image')
plt.imshow(im3)
Out[17]:
<matplotlib.image.AxesImage at 0x15223151e50>

2. Texture Synthesis

Q2.1 Effect of optimizing different layers

I optimized the style loss at each layer from conv1 to conv5, as well as over all conv1-conv5 features together. The deeper the layer, the less blurry and the more detailed the generated texture becomes. When optimizing over all conv1-conv5 layers, the output contains the colors captured by the earlier layers while also maintaining decent texture detail from the later layers. The visualizations are shown below.

In [14]:
im1 = Image.open('results/conv1_style.jpg')
im2 = Image.open('results/conv2_style.jpg')
im3 = Image.open('results/conv3_style.jpg')
im4 = Image.open('results/conv4_style.jpg')
im5 = Image.open('results/conv5_style.jpg')
im6 = Image.open('results/style_all.jpg')

plt.figure(figsize=(20,10))
plt.axis('off')
plt.subplot(2,3,1)
plt.title('conv1 style')
plt.imshow(im1)
plt.subplot(2,3,2)
plt.title('conv2 style')
plt.imshow(im2)
plt.subplot(2,3,3)
plt.title('conv3 style')
plt.imshow(im3)
plt.subplot(2,3,4)
plt.title('conv4 style')
plt.imshow(im4)
plt.subplot(2,3,5)
plt.title('conv5 style')
plt.imshow(im5)
plt.subplot(2,3,6)
plt.title('all conv style')
plt.imshow(im6)
Out[14]:
<matplotlib.image.AxesImage at 0x15222a6ccd0>
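
The style loss used for these textures is commonly defined on Gram matrices of the feature maps, summed over the chosen layers. The snippet below is a minimal NumPy sketch of that idea. The helper names `gram_matrix` and `style_loss`, the normalization by `c * h * w`, and the equal per-layer weighting are assumptions made for illustration, and the random arrays stand in for VGG activations.

```python
import numpy as np

def gram_matrix(feat):
    """Normalized Gram matrix of a (channels, height, width) feature map."""
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)          # one row per channel
    return flat @ flat.T / (c * h * w)     # channel-by-channel correlations

def style_loss(input_feats, target_feats):
    """Sum over layers of the MSE between Gram matrices."""
    total = 0.0
    for f_in, f_tgt in zip(input_feats, target_feats):
        diff = gram_matrix(f_in) - gram_matrix(f_tgt)
        total += float(np.mean(diff ** 2))
    return total

rng = np.random.default_rng(0)
# Two stand-in "layers" with different channel counts and resolutions.
style_feats = [rng.standard_normal((4, 8, 8)), rng.standard_normal((6, 4, 4))]
other_feats = [rng.standard_normal((4, 8, 8)), rng.standard_normal((6, 4, 4))]

g = gram_matrix(style_feats[0])                     # shape (4, 4), symmetric
loss_self = style_loss(style_feats, style_feats)    # identical features give zero
loss_other = style_loss(other_feats, style_feats)   # different features give a positive loss
```

Passing a single-element list reproduces the per-layer experiments, while passing all five layers corresponds to the "all conv" setting.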

Q2.2 Two random noise inputs

The outputs for two random noise inputs are shown below. I used all conv1-conv5 layers to optimize the style loss. Although the spatial layout of the textures differs between the two outputs, the styles they capture are quite similar. Compared with the original style image, both outputs capture its style quite well.

In [20]:
im1 = Image.open('results/style_all_input1.jpg')
im2 = Image.open('results/style_all_input2.jpg')
im3 = Image.open('results/style_img.jpg')


plt.figure(figsize=(20,10))
plt.axis('off')
plt.subplot(1,3,1)
plt.title('all conv style input1')
plt.imshow(im1)
plt.subplot(1,3,2)
plt.title('all conv style input2')
plt.imshow(im2)
plt.subplot(1,3,3)
plt.title('original style image')
plt.imshow(im3)
Out[20]:
<matplotlib.image.AxesImage at 0x15225b8ec10>
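
One way to see why the two outputs can share a style while differing in layout is that a Gram matrix sums over spatial positions, so it does not record where a feature occurs. The sketch below checks this on a stand-in feature map: shuffling the spatial positions leaves the Gram matrix unchanged. `gram_matrix` is a hypothetical helper with an assumed normalization, and the invariance is exact only at the feature level, not for pixels of the image itself.

```python
import numpy as np

def gram_matrix(feat):
    """Normalized Gram matrix of a (channels, height, width) feature map."""
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)
    return flat @ flat.T / (c * h * w)

rng = np.random.default_rng(1)
feat = rng.standard_normal((5, 6, 6))

# Shuffle the spatial positions the same way in every channel.
perm = rng.permutation(6 * 6)
shuffled = feat.reshape(5, -1)[:, perm].reshape(5, 6, 6)

same_style = np.allclose(gram_matrix(feat), gram_matrix(shuffled))
same_layout = np.array_equal(feat, shuffled)
```

Here `same_style` holds while `same_layout` does not, which mirrors the observation above: different noise seeds settle into different arrangements with matching feature statistics.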

3. Style Transfer

Q3.1 Implementation detail

I used the VGG network pretrained on ImageNet as the feature extractor. I chose the conv4 layer for the content loss and all conv1-conv5 layers for the style loss. The weight for the content loss is 1 and the weight for the style loss is 100000. I used the LBFGS optimizer and optimized the loss for 300 steps.
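
The setup described above can be sketched in PyTorch as follows. This is a toy illustration under several assumptions, not the assignment code: a single frozen convolution stands in for the pretrained VGG, random tensors stand in for the images, "300 steps" is interpreted as 300 loss evaluations, and `line_search_fn="strong_wolfe"` is my own choice to keep the toy run stable. Only the loss weights, the optimizer, and the step count come from the text.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy stand-in for the VGG feature extractor: one frozen conv layer.
extractor = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1)
for p in extractor.parameters():
    p.requires_grad_(False)

def gram(feat):
    """Normalized Gram matrix of a (batch, channels, height, width) tensor."""
    b, c, h, w = feat.shape
    flat = feat.reshape(b * c, h * w)
    return flat @ flat.t() / (b * c * h * w)

content_img = torch.rand(1, 3, 16, 16)   # placeholder content image
style_img = torch.rand(1, 3, 16, 16)     # placeholder style image
content_target = extractor(content_img)
style_target = gram(extractor(style_img))

content_weight, style_weight = 1.0, 100000.0   # weights quoted in the text

def total_loss(img):
    feat = extractor(img)
    c_loss = F.mse_loss(feat, content_target)
    s_loss = F.mse_loss(gram(feat), style_target)
    return content_weight * c_loss + style_weight * s_loss

img = torch.rand(1, 3, 16, 16).requires_grad_(True)   # random-noise start
optimizer = torch.optim.LBFGS([img], line_search_fn="strong_wolfe")
initial_loss = total_loss(img).item()

steps = [0]   # counts loss evaluations made by the optimizer

def closure():
    # LBFGS re-evaluates the loss several times per step, so it needs a closure.
    optimizer.zero_grad()
    loss = total_loss(img)
    loss.backward()
    steps[0] += 1
    return loss

while steps[0] < 300:
    optimizer.step(closure)

final_loss = total_loss(img).item()
```

The large style weight compensates for the Gram-matrix loss being numerically much smaller than the content loss after normalization. A real pipeline would typically also clamp the image to the valid pixel range before saving it.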

Q3.2 Style Transfer results

In [27]:
im1 = Image.open('results/content_img.jpg')
im2 = Image.open('results/dance_content_img.jpg')
im3 = Image.open('results/style_img.jpg')
im4 = Image.open('results/night_style_img.jpg')
im5 = Image.open('results/dance_pica.jpg')
im6 = Image.open('results/dance_starry.jpg')
im7 = Image.open('results/water_pica.jpg')
im8 = Image.open('results/water_starry.jpg')

# 3x3 grid: content images across the top row, style images down the left
# column. Judging by the file names, each remaining cell shows the content
# image of its column rendered in the style of its row. The top-left cell
# is left empty.
plt.figure(figsize=(20,10))
plt.axis('off')
plt.subplot(3,3,2)
plt.title('original content')
plt.imshow(im1)
plt.subplot(3,3,3)
plt.title('original content')
plt.imshow(im2)
plt.subplot(3,3,4)
plt.title('original style')
plt.imshow(im3)
plt.subplot(3,3,5)
plt.imshow(im7)
plt.subplot(3,3,6)
plt.imshow(im5)
plt.subplot(3,3,7)
plt.title('original style')
plt.imshow(im4)
plt.subplot(3,3,8)
plt.imshow(im8)
plt.subplot(3,3,9)
plt.imshow(im6)
Out[27]:
<matplotlib.image.AxesImage at 0x1522a877a90>

Q3.3 Random noise input and content image input

Below are the style transfer outputs obtained from a random noise input and from a content image input. The running times of the two setups are approximately the same, about 14.9 seconds. The output initialized from random noise blends more fully into the style, while the output initialized from the content image better preserves the colors of the person in the original content image.

In [40]:
im1 = Image.open('results/random_input_transfer.jpg')
im2 = Image.open('results/content_input_transfer.jpg')
im3 = Image.open('results/dance_content_img.jpg')


plt.figure(figsize=(20,10))
plt.axis('off')
plt.subplot(1,3,1)
plt.title('random noise input')
plt.imshow(im1)
plt.subplot(1,3,2)
plt.title('content image input')
plt.imshow(im2)
plt.subplot(1,3,3)
plt.title('original content image')
plt.imshow(im3)
Out[40]:
<matplotlib.image.AxesImage at 0x1522c864940>
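
The two initializations compared in Q3.3 differ only in the starting image, which the sketch below makes explicit together with a simple way to time a run. `make_initial_image` and `timed` are hypothetical helpers, and the constant array is a placeholder for the content image. Here the timer wraps only the initialization, whereas the 14.9 seconds reported above would come from timing the full optimization.

```python
import time
import numpy as np

def make_initial_image(content, mode, seed=0):
    """Return the starting image for the optimization.

    mode="content" starts from a copy of the content image;
    mode="noise" starts from uniform noise of the same shape.
    """
    if mode == "content":
        return content.copy()
    if mode == "noise":
        rng = np.random.default_rng(seed)
        return rng.uniform(0.0, 1.0, size=content.shape)
    raise ValueError(f"unknown mode: {mode!r}")

def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

content = np.full((3, 4, 4), 0.5)   # stand-in content image in [0, 1]
from_content, t_content = timed(make_initial_image, content, "content")
from_noise, t_noise = timed(make_initial_image, content, "noise")
```

Because both setups run the same number of optimizer steps on images of the same size, similar running times are what one would expect.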

Q3.4 Style Transfer on my own images

In [39]:
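im1 = Image.open('results/my_img1_content_img.jpg')
im1 = im1.rotate(-90, expand=True)  # expand so a non-square image is not cropped
im2 = Image.open('results/my_img1_starry.jpg')
im2 = im2.rotate(-90, expand=True)  # same rotation as its content image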
im3 = Image.open('results/my_img2_content_img.jpg')
im4 = Image.open('results/my_img2_starry.jpg')


plt.figure(figsize=(20,10))
plt.axis('off')
plt.subplot(2,2,1)
plt.title('my content image 1')
plt.imshow(im1)
plt.subplot(2,2,2)
plt.title('my content image 2')
plt.imshow(im3)
plt.subplot(2,2,3)
plt.title('transferred image 1')
plt.imshow(im2)
plt.subplot(2,2,4)
plt.title('transferred image 2')
plt.imshow(im4)
Out[39]:
<matplotlib.image.AxesImage at 0x1522e874df0>