Overview
We use Neural Style Transfer, which re-renders the content of one image
in the artistic style of another.
The algorithm takes in a content image, a style image, and
an input image. The input image is optimized to match
the two target images in content and style distance space.
We experiment with initializing the input as random noise and optimizing in content space only,
then style space only, then both together to perform full neural style transfer.
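Concretely, the objective we minimize is a weighted sum of the two distances, where the weights are the content_weight and style_weight parameters reported below:

total_loss = content_weight * content_loss + style_weight * style_loss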
Part 1: Content Reconstruction
Content Reconstruction from Content Image Clone
When the input image is a clone of the content image, the content loss
remains at 0, which is expected because the input already matches
the target features exactly. Thus, the conv layer at which the content loss
is inserted doesn't matter, and the two outputs below are the same.
Content loss applied at conv 2
Content loss applied at conv 11
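For reference, the content loss is just a mean squared error between the input's feature maps and the content target's feature maps at the chosen conv layer. A minimal PyTorch-style sketch (illustrative, not the exact module used to produce these results):

```python
import torch.nn as nn
import torch.nn.functional as F

class ContentLoss(nn.Module):
    """Records the MSE between the current features and a fixed content target."""
    def __init__(self, target):
        super().__init__()
        # Detach so the target is treated as a constant, not part of the graph.
        self.target = target.detach()

    def forward(self, x):
        # Zero when x equals the target features, which is why a clone of the
        # content image never moves regardless of which layer is chosen.
        self.loss = F.mse_loss(x, self.target)
        return x  # pass the features through unchanged
```

Because a clone of the content image produces features identical to the target, this loss starts (and stays) at zero, matching the behavior described above.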
Content Reconstruction from Random Noise: Applying at Different Layers
Parameters: [content_weight=10, num_steps=300]
We note that applying the content loss at earlier layers lets
the model retain a more accurate, sharper reconstruction.
However, at later layers we observe
a few artifacts in the resulting output.
This is because as the model performs operations like max pooling,
fine details are lost, so the reconstructions are grainier.
Content loss applied at conv 4
Content loss applied at conv 5
Content loss applied at conv 7
Two Inputs from Noise
Content Image: Dancing
Parameters: [applied at conv5, content_weight=10, num_steps=300]
Original:

Output A:

Output B:

The synthesized content is the same between the two images. However,
the synthesized images have a greenish tint and
do not have as much contrast as the original image.
This is likely because the "style" isn't being enforced,
so as long as the content (the dancer) is preserved, the loss will be low.
Part 2: Texture Synthesis
Texture Synthesis at Different Layers
We now optimize with respect to style loss only,
to reproduce the texture of the style image.
Parameters: [style_weight=500K, num_steps=300]
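The style loss compares Gram matrices of the feature maps rather than the raw features. A minimal PyTorch-style sketch of that computation (the normalization shown is one common choice, not necessarily the exact one used here):

```python
import torch.nn as nn
import torch.nn.functional as F

def gram_matrix(features):
    """Channel-by-channel correlations of a feature map, normalized by its size."""
    b, c, h, w = features.size()
    flat = features.view(b * c, h * w)
    return (flat @ flat.t()) / (b * c * h * w)

class StyleLoss(nn.Module):
    """Records the MSE between the input's Gram matrix and the style target's."""
    def __init__(self, target_features):
        super().__init__()
        self.target = gram_matrix(target_features).detach()

    def forward(self, x):
        self.loss = F.mse_loss(gram_matrix(x), self.target)
        return x
```

Which conv layers these modules are inserted after determines the scale of the synthesized texture, as the results below show.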
Applied to conv 1 through 5:

Applied to conv 4 through 8:

Applied to conv 7 through 11:

We note that applying the style loss at later layers
creates a denser, grainier reconstruction, whereas applying it at
earlier layers yields "larger" patterns. However, in both cases
the general color spectrum of the original image is preserved.
Two Inputs from Noise
Style Image: Picasso
Parameters: [style_weight=500K, content_weight=1, num_steps=300]
Synthesizing from two different noise initializations yields
two very similar outputs with the same texture and color
scheme as the original Picasso input, but neither captures
the figure represented in the original image.
This makes sense, since we're only optimizing with respect to
style, so the content does not need to be preserved.
Original Image:

Output A:

Output B:
Part 3: Style Transfer
Implementation Details
For style transfer, we must optimize with respect to
style loss and content loss.
For the model, we inject style/content loss layers after certain
conv layers in VGG-19.
Then, we add both losses to create
a total loss which we backpropagate.
We call optimizer.step(), which uses a L-BFGS
optimizer that approximates the
second derivative (Hessian) for optimization.
The style and content weights can be adjusted
accordingly depending on how intensely we want to
either transfer the style, or retain the original
input image content.
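A condensed sketch of this optimization loop, assuming the ContentLoss/StyleLoss modules sketched above have already been inserted into a truncated VGG-19 and collected into content_losses and style_losses (names and defaults here are illustrative, not the exact code behind these results):

```python
import torch
import torch.optim as optim

def run_style_transfer(model, content_losses, style_losses, input_img,
                       content_weight=1, style_weight=1e5, num_steps=300):
    # Only the pixels of the input image are optimized; the network is frozen.
    model.requires_grad_(False)
    input_img.requires_grad_(True)
    optimizer = optim.LBFGS([input_img])

    step = [0]
    while step[0] < num_steps:
        def closure():
            with torch.no_grad():
                input_img.clamp_(0, 1)   # keep pixels in a valid range
            optimizer.zero_grad()
            model(input_img)             # forward pass records each .loss
            style_score = sum(sl.loss for sl in style_losses)
            content_score = sum(cl.loss for cl in content_losses)
            loss = style_weight * style_score + content_weight * content_score
            loss.backward()
            step[0] += 1
            return loss
        optimizer.step(closure)

    with torch.no_grad():
        input_img.clamp_(0, 1)
    return input_img
```

L-BFGS calls the closure (possibly several times per step) to re-evaluate the loss and gradients it needs for its curvature estimate.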
2x2 Grid of Results
Style Image 1: Frida Kahlo
Style Image 2: Starry Night
Content Image 1: Wally
Content Image 2: Dancing
Parameters: [style_weight=100K, content_weight=1, num_steps=300]
Input as Random Noise vs. Content Image Clone
Style Image: The Scream
Content Image: Tubingen
Random Noise (Time: 10.31s):

Parameters: [style_weight=100K, content_weight=1, num_steps=300]
With the input initialized as random noise, the output looks great: it is a good
mixture that picks up the texture of "The Scream" while also preserving
the buildings' structure.
Clone (Time: 10.28s):

Parameters: [style_weight=600K, content_weight=100, num_steps=300]
With the input as a clone of the content image, we had to increase the style
weight; otherwise, the output is essentially just the content image.
This doesn't work as well, since the transferred
style is not as apparent. However, the time it takes
to generate the output is about the same.
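The only difference between the two runs is how the input image is initialized; a minimal sketch of the two options (content_img here is a placeholder for the loaded content tensor):

```python
import torch

content_img = torch.rand(1, 3, 512, 512)  # stand-in for the loaded content image

# Option 1: start from random noise shaped like the content image.
input_from_noise = torch.randn_like(content_img)

# Option 2: start from a clone of the content image. The content loss starts
# near zero, so a much larger style weight is needed to visibly change the image.
input_from_clone = content_img.clone()
```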
Try on favorite images
Parameters: [style_weight=100K, content_weight=1, num_steps=300]
Example 1
Style Image 1: David Hockney 1

Content Image 1: Guinea Pig

Style Transfer Outputs:
Example 2
Style Image 2: David Hockney 2

Content Image 2: Chateau

Style Transfer Outputs:
Bells & Whistles
Styling Poisson-blended images/Grumpy Cats
Example 1: GAN-Generated Grumpy Cat Stylized with David Hockney's Painting
Original Image:

Output Image:
Example 2: Poisson-Blended GuineaPig2 Stylized with David Hockney's Painting
Original Image:

Output Image: