Assignment #2 - Gradient Domain Fusion

by Zijie Li (zijieli@andrew.cmu.edu)

Overview

In this project, the goal is to use gradient-domain methods to tackle the image blending problem. In particular, instead of changing pixel intensities directly, we manipulate the gradient field of the images.

Effect Preview (From Course Assignment webpage)

Penguin "seamlessly" blended into climber

Part1.1 Toy Problem

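For the toy problem, we solve a least squares problem that minimizes the difference between the gradients of the synthesized image v and those of the source image s, in both the x and y directions. Assuming the source image has size H*W, we get about 2*H*W+1 equations in total (slightly fewer if gradients that would cross the image border are skipped). Note that we also need one extra constraint that pins the intensity of the synthesized image: gradients alone determine the image only up to an additive constant, so without it the values can drift arbitrarily. The equations comprise the following parts: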
1. For each pixel, we match the gradient in the x direction: \(((v[x+1,y]-v[x,y]) - (s[x+1,y]-s[x,y]))^2\), which gives H*W equations;
2. For each pixel, we match the gradient in the y direction: \(((v[x,y+1]-v[x,y]) - (s[x,y+1]-s[x,y]))^2\), which gives H*W equations;
3. For the first pixel (top left corner), we match its value to the source: \((v[0,0]-s[0,0])^2\), which gives 1 equation.

To solve this least squares problem, I stack the above equations into a linear system \(Av=b\), where b holds the gradients of the source image (plus the value s[0,0] for the last equation) and A is a sparse matrix with one row per equation. I use the scipy.sparse.linalg.lsqr function to solve for v.
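The construction above can be sketched roughly as follows. This is a minimal illustration rather than the code used for the results: the function name `reconstruct_from_gradients` is hypothetical, arrays are indexed as (row, column) instead of [x, y], and gradients that would cross the image border are simply skipped, so the system has H*(W-1) + (H-1)*W + 1 rows.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import lsqr


def reconstruct_from_gradients(s):
    """Rebuild a 2D image from its x/y gradients plus one anchored pixel."""
    s = np.asarray(s, dtype=float)
    h, w = s.shape
    idx = np.arange(h * w).reshape(h, w)  # (row, col) -> unknown index

    # Neighbor pairs (p, q): q is the right neighbor, then the lower neighbor.
    p = np.concatenate([idx[:, :-1].ravel(), idx[:-1, :].ravel()])
    q = np.concatenate([idx[:, 1:].ravel(), idx[1:, :].ravel()])
    n = p.size  # h*(w-1) + (h-1)*w gradient equations

    # Row e encodes v[q] - v[p] = s[q] - s[p]; the last row pins v[0, 0].
    rows = np.concatenate([np.arange(n), np.arange(n), [n]])
    cols = np.concatenate([q, p, [0]])
    vals = np.concatenate([np.ones(n), -np.ones(n), [1.0]])
    A = sparse.csr_matrix((vals, (rows, cols)), shape=(n + 1, h * w))

    flat = s.ravel()
    b = np.concatenate([flat[q] - flat[p], [flat[0]]])

    # lsqr returns a tuple; the first entry is the least squares solution.
    v = lsqr(A, b, atol=1e-10, btol=1e-10, iter_lim=5000)[0]
    return v.reshape(h, w)
```

Because the right-hand side comes from the source image itself, the system is consistent and the recovered image should match the source up to solver tolerance, which makes this a convenient sanity check before moving on to blending.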

Source image

Synthesized image

Part1.2 Poisson Blending

This part is similar to the toy problem, but now we deal with two different images: the source image and the target image. The task is to crop a region from the source image, modify it, and clone it into the target image seamlessly. To make the result "seamless", we adopt a strategy similar to the toy problem.
We use the gradient of the source image as the guidance field, but on the boundary we force the pixels to be consistent with the target image. In this way, we obtain an image whose texture and color are consistent with the target image while retaining the shape from the original source image.
Mathematically, the problem can be defined as an optimization problem:
$$\min_f\iint_{\Omega}|\nabla f - \mathbf{v}|^2, $$ $$f|_{\partial \Omega} = f^*|_{\partial \Omega}$$ where \(\Omega\) is the region being cloned, \(f\) is the image we want to create, \(f^*\) is the target image, and \(\mathbf{v}\) is the guidance field. A reasonable choice here is the gradient of the source image: \(\mathbf{v}=\nabla s\).
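For images, the discretized version of the above problem is: $$\boldsymbol{v} = \text{argmin}_{\boldsymbol{v}} \sum_{i\in S, j\in N_i \cap S}((v_i - v_j) - (s_i - s_j))^2 + \sum_{i \in S, j \in N_i \cap\neg S}((v_i - t_j) - (s_i - s_j))^2$$ where \(S\) is the set of pixels inside the cloned region, \(N_i\) is the set of neighbors of pixel \(i\), and \(t\) is the target image. The unique minimizer of the continuous problem satisfies the Poisson equation with Dirichlet boundary conditions: $$\Delta f = \text{div} \mathbf{v} = \Delta s \text{ over } \Omega, \quad f|_{\partial \Omega} = f^*|_{\partial \Omega},$$ where \(\Delta\) denotes the Laplacian operator.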
Therefore, the blending process is quite straightforward: we build a discretized Laplacian operator \(A\) (refer to the code for implementation details), then solve the linear system \(Av=b, (b=As)\), where v is the blended image and s is the source image; pixels on the boundary are constrained to the target values.
Some results are shown below.
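As a rough illustration of this linear system, the sketch below blends a single channel. It is not the assignment code: the function name `poisson_blend` is hypothetical, and it assumes that the source and target are already aligned and of the same size, that the mask is a boolean array that does not touch the image border, and that a 4-connected neighborhood is used. Unknowns exist only inside the mask, and known target values next to the mask are moved to the right-hand side.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve


def poisson_blend(source, target, mask):
    """Blend one channel of `source` into `target` inside boolean `mask`."""
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    mask = np.asarray(mask, dtype=bool)

    ys, xs = np.nonzero(mask)
    n = ys.size
    index = -np.ones(mask.shape, dtype=int)
    index[ys, xs] = np.arange(n)  # pixel -> unknown index inside the mask

    # Diagonal of the discrete Laplacian: each pixel has 4 neighbors.
    rows = [np.arange(n)]
    cols = [np.arange(n)]
    vals = [np.full(n, 4.0)]
    b = np.zeros(n)

    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ny, nx = ys + dy, xs + dx
        b += source[ys, xs] - source[ny, nx]  # guidance: source gradient
        inside = mask[ny, nx]
        # Neighbor is unknown: off-diagonal entry -1.
        rows.append(np.arange(n)[inside])
        cols.append(index[ny, nx][inside])
        vals.append(-np.ones(int(inside.sum())))
        # Neighbor is a known target pixel: move it to the right-hand side.
        b[~inside] += target[ny[~inside], nx[~inside]]

    A = sparse.csr_matrix(
        (np.concatenate(vals), (np.concatenate(rows), np.concatenate(cols))),
        shape=(n, n),
    )
    out = target.copy()
    out[ys, xs] = spsolve(A, b)
    return out
```

For color images, one simple option is to call such a function once per channel. A quick check is to blend a source that equals the target plus a constant offset: the gradients agree, so the result should reproduce the target.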

"Welcome to CMU, Leo!" (Favorite one)

Leonardo DiCaprio
CMU Lawn

Example data

Source image 1
Target image 1

"Huge cat and climber"

Cat
Climber

"Statue of Liberty moved to Arctic"

Statue of Liberty
Mystery iceland

Although Poisson blending is fast and efficient for simple blending, it is likely to fail in some extreme cases. In general, for images that have very different textures (e.g. wood floor vs. grassland, or very different lighting) or color styles (e.g. a cute cartoon vs. a dark, foggy scene), Poisson blending cannot give satisfactory results (see below).

SpongeBob in Star Wars. There are noticeable seams and artifacts.

Leo on the NBA Finals court. The surroundings of the source image do not blend into the target well.

Bells and Whistles

1. Mixed Gradients

Mixed gradients are useful when we clone an image with a lot of holes (i.e. the structure in the image is sparse) into another image. In this case, we want to preserve the gradients of the background (target image). A simple way to achieve this is to tweak the guidance field \(\mathbf{v}\) a bit:
$$v_{pq}=\begin{cases} s_p-s_q, & \text{if } |s_p - s_q| > |t_p - t_q| \\ t_p - t_q, & \text{otherwise} \end{cases}$$ for all neighboring pixel pairs \([p, q]\), where \(t\) is the target image. Below are some examples. Compared to the original Poisson blending, mixed gradients make the source image look more transparent, because the gradient of the background is used wherever it is stronger.
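The selection rule can be written in a couple of lines with NumPy. This is a small sketch with a hypothetical helper name, `mixed_gradient`; it takes the source and target values at the two pixels of each pair and returns whichever difference has the larger magnitude.

```python
import numpy as np


def mixed_gradient(s_p, s_q, t_p, t_q):
    """Pick, per pixel pair, the source or target difference with larger magnitude."""
    ds = np.asarray(s_p, dtype=float) - np.asarray(s_q, dtype=float)
    dt = np.asarray(t_p, dtype=float) - np.asarray(t_q, dtype=float)
    # Ties go to the target gradient, matching the strict ">" in the formula.
    return np.where(np.abs(ds) > np.abs(dt), ds, dt)


# First pair: source difference 4 beats target difference 1.
# Second pair: target difference 7 beats source difference 0.
print(mixed_gradient([5, 1], [1, 1], [2, 9], [1, 2]))
```

In a blending solver, this value would take the place of \(s_p - s_q\) when building the right-hand side, while the matrix \(A\) stays the same.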



2. Color2gray

The Color2gray problem can be described as follows: we would like to convert RGB images into grayscale images while taking both brightness and color changes into account. Since the naive rgb2gray method uses only brightness information, color changes are not preserved. Here we can adopt the strategy from mixed gradients. First, we convert the image from RGB color space to HSV color space and take the saturation channel S and the value (brightness) channel V. The saturation channel provides information about color changes. Then we use the mixed gradient of S and V as the guidance field to synthesize a "colorful" gray image.
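A minimal sketch of this idea is shown below, under several assumptions that are not taken from the assignment code: the input is an RGB array with values in [0, 1], S and V are computed directly with NumPy instead of an image library, the top-left pixel is anchored to its V value, and the output is not clipped or rescaled. The function names `saturation_and_value` and `color2gray` are hypothetical.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import lsqr


def saturation_and_value(rgb):
    """Return the HSV saturation and value channels of an RGB image in [0, 1]."""
    v = rgb.max(axis=2)
    chroma = v - rgb.min(axis=2)
    s = np.divide(chroma, v, out=np.zeros_like(v), where=v > 0)
    return s, v


def color2gray(rgb):
    """Synthesize a gray image whose gradients mix those of S and V."""
    rgb = np.asarray(rgb, dtype=float)
    h, w, _ = rgb.shape
    s, v = saturation_and_value(rgb)
    idx = np.arange(h * w).reshape(h, w)

    # Neighbor pairs (p, q): q is the right neighbor, then the lower neighbor.
    p = np.concatenate([idx[:, :-1].ravel(), idx[:-1, :].ravel()])
    q = np.concatenate([idx[:, 1:].ravel(), idx[1:, :].ravel()])
    n = p.size

    # Mixed guidance field: keep whichever of the S or V gradients is stronger.
    ds = s.ravel()[q] - s.ravel()[p]
    dv = v.ravel()[q] - v.ravel()[p]
    guide = np.where(np.abs(ds) > np.abs(dv), ds, dv)

    # Gradient equations plus one row anchoring the top-left pixel to V.
    rows = np.concatenate([np.arange(n), np.arange(n), [n]])
    cols = np.concatenate([q, p, [0]])
    vals = np.concatenate([np.ones(n), -np.ones(n), [1.0]])
    A = sparse.csr_matrix((vals, (rows, cols)), shape=(n + 1, h * w))
    b = np.concatenate([guide, [v[0, 0]]])

    gray = lsqr(A, b, atol=1e-10, btol=1e-10, iter_lim=5000)[0]
    return gray.reshape(h, w)
```

With this setup, a pure red region next to a white region, which share the same V, ends up with different gray levels because the saturation gradient at their border is kept. For display, the result may still need to be clipped or rescaled to the valid intensity range.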

Raw image (From course website)

cv2 grayscale conversion (cv2.cvtColor)

Color2gray based on mixed gradients

References

[1] https://www.cs.jhu.edu/~misha/Fall07/Papers/Perez03.pdf