**CMU 16-726 Learning-based Image Synthesis**
**Assignment #2**
*Title: "Gradient Domain Fusion"*
*Name: Soyong Shin (soyongs@andrew.cmu.edu)*
(##) Contents
* Part 1.1 Toy Problem
* Part 1.2 (1) Poisson Blending
* Part 1.2 (2) Mixed Blending
* Other Results
(##) Part 1.1 Toy Problem
To start with, I implemented optimization-based image reconstruction using gradient domain with examplar toy image.
This method is implemented on `toy_recon` function at **`main.py`**. This is done by build a linear equation of gradient domain and solve it using least-square-error optimization.
The objective functions are consist of following three term:
$$
f_{obj, 1} = ((v(x+1, y) - v(x, y)) - (s(x+1, y) - s(x, y)))^2
$$
$$
f_{obj, 2} = ((v(x, y+1) - v(x, y)) - (s(x, y+1) - s(x, y)))^2
$$
$$
f_{obj, 3} = (v(1, 1) - s(1, 1))^2
$$
where $f_{obj, 1}$ and $f_{obj, 2}$ are terms to minimize the difference of gradient domain, and $f_{obj, 3}$ is the constraint for the top left corner to be same with the source image.
Since the given source image has size of $119 \times 110$, each number of $f_{obj, 1}$ and $f_{obj, 2}$ is 11,863 ($118 \times 109 + 1$) where the last 1 is for the bottom right corner.
Therefore, the number of objective terms for this task is $2 \times 11,863 + 1 = 25,727$ and A has shape of $25,727 \times 13090$.
For this reason, it is too computationally costy to solve this using conventional least-square-error solver (e.g., numpy.linalg.lstsq).
Hence, I utilized sparse matrix (scipy.sparse.csr_matrix) and solver (scipy.sparse.linalg.lsqr) to obtain the optimal solution.
The code is implemented as following.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 | def toy_recon(image): imh, imw = image.shape im2var = np.arange(imh * imw).reshape((imh, imw)).astype(int) rows, cols, datas, Bs, e = [], [], [], [], 0 for y in range(imh - 1): for x in range(imw - 1): # Objective function 1 & 2 rows += [e, e, e+1, e+1]; e += 2 cols += [im2var[y, x+1], im2var[y, x]] cols += [im2var[y+1, x], im2var[y, x]] datas += [1, -1, 1, -1] Bs += [image[y, x+1] - image[y, x]] Bs += [image[y+1, x] - image[y, x]] # Objective 1 & 2 for the bottom right corner rows += [e, e, e+1, e+1]; e += 2 cols += [im2var[y+1, x+1], im2var[y+1, x]] cols += [im2var[y+1, x+1], im2var[y, x+1]] datas += [1, -1, 1, -1] Bs += [image[y+1, x+1] - image[y+1, x]] Bs += [image[y+1, x+1] - image[y, x+1]] # Objective function 3 rows += [e] cols += [0] datas += [1] Bs += [image[0, 0]] A = csr_matrix((datas, (rows, cols)), shape=(e+1, imh * imw)) B = np.stack(Bs) image_hat = (lsqr(A, B)[0]).reshape(imh, imw) return image_hat | cs |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 | for _c in range(ch): rows, cols, datas, Bs, e, p = [], [], [], [], 0, 3 for y, y_val in enumerate(ys[:-1]): for x, x_val in enumerate(xs[:-1]): if not mask[y_val:y_val+p, x_val:x_val+p].any(): # This pixel does not belong to the mask continue # Check if this pixel belongs to mask perimeter (edge) is_perm = False if mask[y_val:y_val+p, x_val:x_val+p].all() else True # Objective function 1: Source image gradient rows += [e, e, e+1, e+1]; e += 2 cols += [im2var[y+1, x], im2var[y, x]] cols += [im2var[y, x+1], im2var[y, x]] datas += [1, -1, 1, -1] Bs += [fg_crop[y+1, x, _c] - fg_crop[y, x, _c]] Bs += [fg_crop[y, x+1, _c] - fg_crop[y, x, _c]] # Objective function 2: Perimeter background image gradient if is_perm: rows+= [e, e+1]; e += 2 cols += [im2var[y, x], im2var[y, x]] datas += [1, 1] Bs += [bg_crop[y+1, x, _c] + fg_crop[y, x, _c] - fg_crop[y+1, x, _c]] Bs += [bg_crop[y, x+1, _c] + fg_crop[y, x, _c] - fg_crop[y, x+1, _c]] # Optimize blending A = csr_matrix((datas, (rows, cols)), shape=(e, mask_hat.shape[0])) B = np.stack(Bs) mask_hat[:, _c] = (lsqr(A, B)[0]) # Fuse blended mask with foreground fg_ = np.zeros_like(fg) mask_hat = mask_hat.reshape(*fg_crop.shape[:-1], 3) fg_[ys[0]:ys[-1], xs[0]:xs[-1]] = mask_hat[:-1, :-1] return fg_ * mask + bg * (1 - mask) | cs |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 | for _c in range(ch): rows, cols, datas, Bs, e, p = [], [], [], [], 0, 3 for y, y_val in enumerate(ys[:-1]): for x, x_val in enumerate(xs[:-1]): if not mask[y_val:y_val+p, x_val:x_val+p].any(): # This pixel does not belong to the mask continue # Check if this pixel belongs to mask perimeter (edge) is_perm = False if mask[y_val:y_val+p, x_val:x_val+p].all() else True # Find dij by comparing gradient of fore/back-ground images dy = max(fg_crop[y+1, x, _c] - fg_crop[y, x, _c], bg_crop[y+1, x, _c] - bg_crop[y, x, _c], key=abs) dx = max(fg_crop[y, x+1, _c] - fg_crop[y, x, _c], bg_crop[y, x+1, _c] - bg_crop[y, x, _c], key=abs) # Objective function 1 rows += [e, e, e+1, e+1]; e += 2 cols += [im2var[y+1, x], im2var[y, x]] cols += [im2var[y, x+1], im2var[y, x]] datas += [1, -1, 1, -1] Bs += [dy, dx] # Objective function 2 if is_perm: rows+= [e, e+1]; e += 2 cols += [im2var[y, x], im2var[y, x]] datas += [1, 1] Bs += [bg_crop[y+1, x, _c] + dy] Bs += [bg_crop[y, x+1, _c] + dx] # Optimize blending A = csr_matrix((datas, (rows, cols)), shape=(e, mask_hat.shape[0])) B = np.stack(Bs) mask_hat[:, _c] = (lsqr(A, B)[0]) # Fuse blended mask with foreground fg_ = np.zeros_like(fg) mask_hat = mask_hat.reshape(*fg_crop.shape[:-1], 3) fg_[ys[0]:ys[-1], xs[0]:xs[-1]] = mask_hat[:-1, :-1] return fg_ * mask + bg * (1 - mask) | cs |