**CMU 16-726 Learning-based Image Synthesis** **Assignment #2** *Title: "Gradient Domain Fusion"* *Name: Soyong Shin (soyongs@andrew.cmu.edu)* (##) Contents * Part 1.1 Toy Problem * Part 1.2 (1) Poisson Blending * Part 1.2 (2) Mixed Blending * Other Results (##) Part 1.1 Toy Problem To start with, I implemented optimization-based image reconstruction using gradient domain with examplar toy image.

This method is implemented on `toy_recon` function at **`main.py`**. This is done by build a linear equation of gradient domain and solve it using least-square-error optimization. The objective functions are consist of following three term: $$ f_{obj, 1} = ((v(x+1, y) - v(x, y)) - (s(x+1, y) - s(x, y)))^2 $$ $$ f_{obj, 2} = ((v(x, y+1) - v(x, y)) - (s(x, y+1) - s(x, y)))^2 $$ $$ f_{obj, 3} = (v(1, 1) - s(1, 1))^2 $$ where $f_{obj, 1}$ and $f_{obj, 2}$ are terms to minimize the difference of gradient domain, and $f_{obj, 3}$ is the constraint for the top left corner to be same with the source image. Since the given source image has size of $119 \times 110$, each number of $f_{obj, 1}$ and $f_{obj, 2}$ is 11,863 ($118 \times 109 + 1$) where the last 1 is for the bottom right corner. Therefore, the number of objective terms for this task is $2 \times 11,863 + 1 = 25,727$ and A has shape of $25,727 \times 13090$. For this reason, it is too computationally costy to solve this using conventional least-square-error solver (e.g., numpy.linalg.lstsq). Hence, I utilized sparse matrix (scipy.sparse.csr_matrix) and solver (scipy.sparse.linalg.lsqr) to obtain the optimal solution. The code is implemented as following.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
def toy_recon(image):
    
    imh, imw = image.shape
    im2var = np.arange(imh * imw).reshape((imh, imw)).astype(int)
 
    rows, cols, datas, Bs, e = [], [], [], [], 0
    for y in range(imh - 1):
        for x in range(imw - 1):
            # Objective function 1 & 2            
            rows += [e, e, e+1, e+1]; e += 2
            cols += [im2var[y, x+1], im2var[y, x]]
            cols += [im2var[y+1, x], im2var[y, x]]
            datas += [1-11-1]
 
            Bs += [image[y, x+1- image[y, x]]
            Bs += [image[y+1, x] - image[y, x]]
 
    # Objective 1 & 2 for the bottom right corner
    rows += [e, e, e+1, e+1]; e += 2
    cols += [im2var[y+1, x+1], im2var[y+1, x]]
    cols += [im2var[y+1, x+1], im2var[y, x+1]]
    datas += [1-11-1]
    Bs += [image[y+1, x+1- image[y+1, x]]
    Bs += [image[y+1, x+1- image[y, x+1]]
    
    # Objective function 3
    rows += [e]
    cols += [0]
    datas += [1]
    Bs += [image[00]]
    A = csr_matrix((datas, (rows, cols)), shape=(e+1, imh * imw))
    B = np.stack(Bs)
 
    image_hat = (lsqr(A, B)[0]).reshape(imh, imw)
            
    return image_hat
cs

**Results of the problem**
By running the code, I can get the output image as below. ![figure [toy_problem]: Toy Problem Result](data/toy_problem_result.png) (##) Part 1.2 (1) Poisson Blending Then I implemented gradient domain fusion using poisson blending method. This method is similar to the first toy problem, but the composition of the objective function is different. Suppose that we have source image $s$ and target image $t$. Then the objective terms are: $$ f_{obj, 1} = ((v(x+1, y) - v(x, y)) - (s(x+1, y) - s(x, y)))^2 $$ $$ f_{obj, 2} = ((v(x, y+1) - v(x, y)) - (s(x, y+1) - s(x, y)))^2 $$ $$ f_{obj, 3} = ((v(x+1, y) - t(x, y)) - (s(x+1, y) - s(x, y)))^2 $$ $$ f_{obj, 4} = ((v(x, y+1) - t(x, y)) - (s(x, y+1) - s(x, y)))^2 $$ The first two terms ($f_{obj, 1}$ and $f_{obj, 2}$) are to conserve gradient domain of the source image so that its characteristic to be remained in the background image. These terms are applied over all pixels in the mask. The last two terms ($f_{obj, 3}$ and $f_{obj, 4}$) are designed to fuse source image naturally with the target image by seetting similar gradient domain around the boundary. These terms are only applied the pixels **around the boundary**, and I defined the boundary if the pixel has both `True` and `False` mask within two neighbor pixels. The poisson blending code is implemented as below.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
for _c in range(ch):
    rows, cols, datas, Bs, e, p = [], [], [], [], 03
    for y, y_val in enumerate(ys[:-1]):
        for x, x_val in enumerate(xs[:-1]):
            if not mask[y_val:y_val+p, x_val:x_val+p].any():
                # This pixel does not belong to the mask
                continue
 
            # Check if this pixel belongs to mask perimeter (edge)
            is_perm = False if mask[y_val:y_val+p, x_val:x_val+p].all() else True
            
            # Objective function 1: Source image gradient
            rows += [e, e, e+1, e+1]; e += 2
            cols += [im2var[y+1, x], im2var[y, x]]
            cols += [im2var[y, x+1], im2var[y, x]]
            datas += [1-11-1]
            Bs += [fg_crop[y+1, x, _c] - fg_crop[y, x, _c]]
            Bs += [fg_crop[y, x+1, _c] - fg_crop[y, x, _c]]
 
            # Objective function 2: Perimeter background image gradient
            if is_perm:
                rows+= [e, e+1]; e += 2
                cols += [im2var[y, x], im2var[y, x]]
                datas += [11]
                Bs += [bg_crop[y+1, x, _c] + fg_crop[y, x, _c] - fg_crop[y+1, x, _c]]
                Bs += [bg_crop[y, x+1, _c] + fg_crop[y, x, _c] - fg_crop[y, x+1, _c]]
 
    # Optimize blending
    A = csr_matrix((datas, (rows, cols)), shape=(e, mask_hat.shape[0]))
    B = np.stack(Bs)
    mask_hat[:, _c] = (lsqr(A, B)[0])
 
# Fuse blended mask with foreground
fg_ = np.zeros_like(fg)
mask_hat = mask_hat.reshape(*fg_crop.shape[:-1], 3)
fg_[ys[0]:ys[-1], xs[0]:xs[-1]] = mask_hat[:-1, :-1]
    
return fg_ * mask + bg * (1 - mask)
cs
Since the most regions in the source image are not used for fusion, we do not necessarily calculate gradient domain and optimize for those regions. To do this, I made a bounding box around the mask and only consider pixels within the bbox. By doing this, the size of A matrix is $15,866 \times 9,270$, while it was $16,026 \times 83,250$
**Results of the problem**
By running the code, I can get the output image as below. ![figure [poisson_example]: Source Image](data/source_01.jpg) ![figure [poisson_example]: Mask](data/mask_01.jpg) ![figure [poisson_example]: Target Image](data/target_01.jpg) ![figure [poisson_example]: Naive Blending Result](data/naive_01.png) ![figure [poisson_example]: Poisson Blending Result](data/poisson_01.png) (##) Part 1.2 (2) Mixed Blending (Bell & Whistle) I also implemented Mixed Blending method for an extra credit. This blending method is mostly same with poisson blending, however, it compares the graident domain between source and target images and take the strong one to solve the problem. Therefore, for every pixels, I obtain $d_{ij} = max((s_i - s_j), (t_i - t_j), \hspace{3mm} key=abs)$. Here, $i$ and $j$ are neighbor pixels and $key=abs$ is to compare the absolute values of two terms and take the one which has bigger absolute value. The objective terms are as below: $$ f_{obj, 1} = ((v(x+1, y) - v(x, y)) - d((x+1, y), (x, y)))^2 $$ $$ f_{obj, 2} = ((v(x, y+1) - v(x, y)) - d((x, y+1), (x, y)))^2 $$ $$ f_{obj, 3} = ((v(x+1, y) - t(x, y)) - d((x+1, y), (x, y)))^2 $$ $$ f_{obj, 4} = ((v(x, y+1) - t(x, y)) - d((x, y+1), (x, y)))^2 $$ Same as Poisson Blending, the last two terms are applied for the pixels at the boundary of the mask. The code implementation is as below:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
for _c in range(ch):
    rows, cols, datas, Bs, e, p = [], [], [], [], 03
    for y, y_val in enumerate(ys[:-1]):
        for x, x_val in enumerate(xs[:-1]):
            if not mask[y_val:y_val+p, x_val:x_val+p].any():
                # This pixel does not belong to the mask
                continue
 
            # Check if this pixel belongs to mask perimeter (edge)
            is_perm = False if mask[y_val:y_val+p, x_val:x_val+p].all() else True
 
            # Find dij by comparing gradient of fore/back-ground images
            dy = max(fg_crop[y+1, x, _c] - fg_crop[y, x, _c], bg_crop[y+1, x, _c] - bg_crop[y, x, _c], key=abs)
            dx = max(fg_crop[y, x+1, _c] - fg_crop[y, x, _c], bg_crop[y, x+1, _c] - bg_crop[y, x, _c], key=abs)
 
            # Objective function 1
            rows += [e, e, e+1, e+1]; e += 2
            cols += [im2var[y+1, x], im2var[y, x]]
            cols += [im2var[y, x+1], im2var[y, x]]
            datas += [1-11-1]
            Bs += [dy, dx]
 
            # Objective function 2
            if is_perm:
                rows+= [e, e+1]; e += 2
                cols += [im2var[y, x], im2var[y, x]]
                datas += [11]
                Bs += [bg_crop[y+1, x, _c] + dy]
                Bs += [bg_crop[y, x+1, _c] + dx]
 
    # Optimize blending
    A = csr_matrix((datas, (rows, cols)), shape=(e, mask_hat.shape[0]))
    B = np.stack(Bs)
    mask_hat[:, _c] = (lsqr(A, B)[0])
 
# Fuse blended mask with foreground
fg_ = np.zeros_like(fg)
mask_hat = mask_hat.reshape(*fg_crop.shape[:-1], 3)
fg_[ys[0]:ys[-1], xs[0]:xs[-1]] = mask_hat[:-1, :-1]
 
return fg_ * mask + bg * (1 - mask)
cs

**Results of the problem**
Using same examplar image with Poisson Blending, the result is shown below. *Examplary image* ![figure [mixed_example]: Naive Blending](data/naive_01.png) ![figure [mixed_example]: Poisson Blending](data/poisson_01.png) ![figure [mixed_example]: Mixed Blending](data/mixed_01.png) Here are some more examples. *Me and Einstein* ![figure [mixed_example]: Naive Blending](data/naive_03.png) ![figure [mixed_example]: Poisson Blending](data/poisson_03.png) ![figure [mixed_example]: Mixed Blending](data/mixed_03.png) *Fiance at a Luxury Home* ![figure [mixed_example]: Naive Blending](data/naive_08.png) ![figure [mixed_example]: Poisson Blending](data/poisson_08.png) ![figure [mixed_example]: Mixed Blending](data/mixed_08.png) *Me and Friends at Mt. Everest* ![figure [mixed_example]: Naive Blending](data/naive_12.png) ![figure [mixed_example]: Poisson Blending](data/poisson_12.png) ![figure [mixed_example]: Mixed Blending](data/mixed_12.png) (##) Other Results In this section, I report more results that I have tried myself. *Gull on the Sky* ![figure [mixed_example]: Naive Blending](data/naive_02.png) ![figure [mixed_example]: Poisson Blending](data/poisson_02.png) *Iron Man at Schenley Park* ![figure [mixed_example]: Naive Blending](data/naive_04.png) ![figure [mixed_example]: Poisson Blending](data/poisson_04.png) *Picachu at CMU* ![figure [mixed_example]: Naive Blending](data/naive_05.png) ![figure [mixed_example]: Poisson Blending](data/poisson_05.png) *Family at Windows-Background* ![figure [mixed_example]: Naive Blending](data/naive_06.png) ![figure [mixed_example]: Poisson Blending](data/poisson_06.png) *Parents at NYC Street* ![figure [mixed_example]: Naive Blending](data/naive_07.png) ![figure [mixed_example]: Poisson Blending](data/poisson_07.png) *Yaught, Gull, and Crab in the Beach* ![figure [mixed_example]: Naive Blending](data/naive_09.png) ![figure [mixed_example]: Poisson Blending](data/poisson_09.png)