Overview
In this project, the goal is to use gradient-domain methods to tackle the image blending problem. In particular, instead of changing pixel intensities directly,
we manipulate the gradient field of the images.
Effect Preview (From Course Assignment webpage)
Penguin "seamlessly" blended into climber
Part1.1 Toy Problem
For the toy problem, we solve a least squares problem that minimizes the difference between the gradients of the synthesized image v and those of the
source image s in both the x and y directions. Assuming the source image has size H*W, we get 2*H*W+1 equations in total (slightly fewer if gradients that would reach past the image border are skipped). Note that
we also need to anchor the intensity of the synthesized image, because the gradients alone determine it only up to an additive constant. The equations comprise the following parts:
1. For each pixel, we match the gradient in the x direction: \(((v[x+1,y]-v[x,y]) - (s[x+1,y]-s[x,y]))^2\), which gives H*W equations;
2. For each pixel, we match the gradient in the y direction: \(((v[x,y+1]-v[x,y]) - (s[x,y+1]-s[x,y]))^2\), which gives H*W equations;
3. For the first pixel (top-left corner), we match its value to the source: \((v[0,0]-s[0,0])^2\), which gives 1 equation.
To solve this least squares problem, I rewrite the above equations as \(Av=b\), where b holds the gradients of the source image (plus the value \(s[0,0]\) for the last equation) and A is a sparse matrix
assembled from the equations above. I use the scipy.sparse.linalg.lsqr function to get v.
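The toy problem can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `reconstruct_from_gradients`, the solver tolerances and the iteration limit are assumptions, and gradients that would cross the image border are skipped, so the system has H*(W-1) + (H-1)*W + 1 rows.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.linalg import lsqr


def reconstruct_from_gradients(s):
    """Rebuild a grayscale image from its x/y gradients and one pinned pixel.

    Hypothetical helper for illustration; `s` is a 2-D array.
    """
    s = np.asarray(s, dtype=float)
    H, W = s.shape
    idx = np.arange(H * W).reshape(H, W)  # pixel (row, col) -> unknown index

    # Pixel pairs (p, q): q is the right neighbor, then the lower neighbor.
    p = np.concatenate([idx[:, :-1].ravel(), idx[:-1, :].ravel()])
    q = np.concatenate([idx[:, 1:].ravel(), idx[1:, :].ravel()])
    n_grad = p.size

    # Each gradient row encodes v[q] - v[p]; the last row pins v[0, 0].
    rows = np.concatenate([np.arange(n_grad), np.arange(n_grad), [n_grad]])
    cols = np.concatenate([q, p, [idx[0, 0]]])
    vals = np.concatenate([np.ones(n_grad), -np.ones(n_grad), [1.0]])
    A = coo_matrix((vals, (rows, cols)), shape=(n_grad + 1, H * W)).tocsr()

    # Right-hand side: source gradients, then the pinned intensity s[0, 0].
    flat = s.ravel()
    b = np.concatenate([flat[q] - flat[p], [flat[0]]])

    # Tolerances and iteration limit are assumed values, not from the report.
    v = lsqr(A, b, atol=1e-10, btol=1e-10, iter_lim=10_000)[0]
    return v.reshape(H, W)
```

Because the right-hand side comes from the source image itself, the system is consistent and the reconstruction should match the source up to solver tolerance.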
Source image
Synthesized image
Part1.2 Poisson Blending
This part is similar to the toy problem, but now we are dealing with two different images: the source image
and the target image. The task is to crop a region from the source image, modify it,
and clone it into the target image seamlessly. To make the result seamless, we adopt a strategy similar to the toy problem.
We use the gradient of the source image as the guidance field, but on the boundary we force the pixels to be consistent with
the target image. In this way, we obtain an image whose texture and color are consistent with the target image while
retaining the shape from the original source image.
Mathematically, the problem can be defined as an optimization problem:
$$\min_{f}\iint_{\Omega}|\nabla f - \mathbf{v}|^2, $$
$$f|_{\partial \Omega} = f^*|_{\partial \Omega}$$
where \(f\) is the image we want to create over the region \(\Omega\), \(f^*\) is the target image, and \(\mathbf{v}\) is the guidance field. A reasonable choice here is the gradient of the source image:
\(\mathbf{v}=\nabla s\).
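For images, the discretized version of the above problem is:
$$\boldsymbol{v} = \text{argmin}_{\boldsymbol{v}} \sum_{i\in S, j\in N_i \cap S}((v_i - v_j) - (s_i - s_j))^2 + \sum_{i \in S, j \in N_i \cap\neg S}((v_i - t_j) - (s_i - s_j))^2$$
where \(S\) is the set of pixels inside the cloned region, \(N_i\) is the 4-neighborhood of pixel \(i\), \(s\) is the source image and \(t\) is the target image.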
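The minimizer of the continuous problem is the unique solution of the following Poisson equation with Dirichlet boundary conditions:
$$\Delta f = \text{div} \mathbf{v} = \Delta s \text{ over } \Omega, \quad f|_{\partial \Omega} = f^*|_{\partial \Omega},$$
where \(\Delta\) denotes the Laplacian operator.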
Therefore, the blending process is quite straightforward: we build a discretized Laplacian operator \(A\) (refer to the code for implementation
details), then solve the linear system \(Av=b\) with \(b=As\), where v is the blended image and s is the source image; pixels on the boundary are fixed to the target values.
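The linear system can be sketched for a single channel as below. This is an illustration under stated assumptions, not the project's implementation: the source and target are assumed to be already aligned and of equal size, the mask is assumed not to touch the image border, and the name `poisson_blend` and the choice of `scipy.sparse.linalg.spsolve` as the solver are hypothetical.

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import spsolve


def poisson_blend(source, target, mask):
    """Blend one channel of `source` into `target` inside boolean `mask`.

    Assumes equal shapes and a mask that does not touch the image border.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    ys, xs = np.nonzero(mask)
    n = ys.size

    # Map each masked pixel to the index of its unknown.
    index = -np.ones(mask.shape, dtype=int)
    index[ys, xs] = np.arange(n)

    A = lil_matrix((n, n))
    b = np.zeros(n)
    for k, (y, x) in enumerate(zip(ys, xs)):
        A[k, k] = 4.0  # discrete (negative) Laplacian: 4 on the diagonal
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            # Guidance term: Laplacian of the source image.
            b[k] += source[y, x] - source[ny, nx]
            if mask[ny, nx]:
                A[k, index[ny, nx]] = -1.0   # unknown neighbor
            else:
                b[k] += target[ny, nx]       # known boundary value

    v = spsolve(A.tocsc(), b)
    result = target.copy()
    result[ys, xs] = v
    return result
```

For color images, one option is to call the function once per channel. A quick sanity check: if the source equals the target plus a constant offset, the gradients agree and the blended result should reproduce the target.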
Some results are listed below.
"Welcome to CMU, Leo!" (Favorite one)
Leonardo DiCaprio
CMU Lawn
Example data
Source image 1
Target image 1
"Huge cat and climber"
Cat
Climber
"Statue of Liberty moved to Arctic"
Statue of Liberty
Mystery iceland
Although Poisson blending is fast and efficient for simple blending, it is likely to fail in some extreme cases. In general, for images that have
very different textures (e.g. wood floor vs. grassland), very different lighting, or very different color styles (e.g. a cute cartoon vs. a dark, foggy scene),
Poisson blending cannot give satisfactory results (see below).
SpongeBob in Star Wars. There are a noticeable seam and artifacts.
Leo on the NBA Finals court. The surroundings of the source image do not blend into the target well.
Mixed gradients are useful when we clone an image with a lot of holes (i.e. the structure in the image is sparse) into another image. In this case, we want
to preserve the gradients of the background (target image). A simple way to achieve this is to tweak the guidance field \(\mathbf{v}\) slightly:
$$v_{pq}=\begin{cases} s_p-s_q, & \text{if } |s_p - s_q| > |t_p - t_q| \\ t_p - t_q, & \text{otherwise} \end{cases}$$
for all pairs \([p, q]\) of neighboring pixels. Below are some examples. Compared to the original Poisson blending, mixed gradients make the source image look more transparent,
because gradients from the background are used to construct the image wherever they are stronger.
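The selection rule for the guidance field can be sketched with NumPy as below. The function name `mixed_guidance` is hypothetical, and the sketch only covers pairs where q is the next pixel after p along one axis; the resulting values would replace the plain source differences on the right-hand side of the blending system.

```python
import numpy as np


def mixed_guidance(s, t, axis=1):
    """Return v_pq for each pixel p and its next neighbor q along `axis`.

    Picks the source difference where it has the larger magnitude,
    otherwise the target difference (hypothetical helper).
    """
    s = np.asarray(s, dtype=float)
    t = np.asarray(t, dtype=float)
    ds = -np.diff(s, axis=axis)  # s_p - s_q
    dt = -np.diff(t, axis=axis)  # t_p - t_q
    return np.where(np.abs(ds) > np.abs(dt), ds, dt)


# Small example: the first pair keeps the source gradient (-5),
# the second pair falls back to the stronger target gradient (-3).
s = np.array([[0.0, 5.0, 5.0]])
t = np.array([[0.0, 1.0, 4.0]])
v = mixed_guidance(s, t)
```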
The Color2gray problem can be described as follows: we would like to convert RGB images into grayscale images while taking into account both brightness and color changes.
Since the naive rgb2gray method uses only brightness information, color changes are not preserved. Here we can adopt the strategy from mixed gradients. First, we
convert the image from RGB color space to HSV color space and take the saturation channel S and the value (brightness) channel V. The saturation channel provides information
about color changes. Then we use the mixed gradient of S and V as the guidance field to synthesize a "colorful" gray image.
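One way to sketch this pipeline is shown below. Several details are assumptions rather than facts from the report: the function name `color2gray`, computing S and V directly from the RGB maximum and minimum, pinning the top-left pixel to its V value to fix the overall brightness, and the solver tolerances.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.linalg import lsqr


def color2gray(rgb):
    """Convert an (H, W, 3) float RGB image in [0, 1] to an (H, W) gray image."""
    rgb = np.asarray(rgb, dtype=float)
    H, W, _ = rgb.shape

    # HSV value and saturation computed from the RGB channels.
    V = rgb.max(axis=2)
    chroma = V - rgb.min(axis=2)
    S = np.divide(chroma, V, out=np.zeros_like(V), where=V > 0)

    # Pixel pairs (p, q): q is the right neighbor, then the lower neighbor.
    idx = np.arange(H * W).reshape(H, W)
    p = np.concatenate([idx[:, :-1].ravel(), idx[:-1, :].ravel()])
    q = np.concatenate([idx[:, 1:].ravel(), idx[1:, :].ravel()])

    # Mixed gradient: keep whichever of the S and V differences is stronger.
    dS = S.ravel()[q] - S.ravel()[p]
    dV = V.ravel()[q] - V.ravel()[p]
    g = np.where(np.abs(dS) > np.abs(dV), dS, dV)

    # Least squares system: gradient rows plus one row pinning pixel (0, 0).
    n = p.size
    rows = np.concatenate([np.arange(n), np.arange(n), [n]])
    cols = np.concatenate([q, p, [0]])
    vals = np.concatenate([np.ones(n), -np.ones(n), [1.0]])
    A = coo_matrix((vals, (rows, cols)), shape=(n + 1, H * W)).tocsr()
    b = np.concatenate([g, [V[0, 0]]])  # assumed anchor: V at the top-left

    gray = lsqr(A, b, atol=1e-10, btol=1e-10, iter_lim=10_000)[0]
    return gray.reshape(H, W)
```

The output is not guaranteed to stay within [0, 1], so it may need clipping or rescaling before display. For an input that is already gray (R = G = B), the saturation is zero everywhere and the result should reproduce the input.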
[1] https://www.cs.jhu.edu/~misha/Fall07/Papers/Perez03.pdf