16-726 Learning-Based Image Synthesis, 2021 Spring
Project 2: Gradient Domain Fusion
Teddy Zhang (wentaiz)
Gradient-domain processing is a classic technique in computational photography with a variety of applications, including image blending, tone mapping, and non-photorealistic rendering. In this project, the main task is to blend two images seamlessly with Poisson blending. A variant that uses mixed gradients is also implemented for comparison.
The background images in Fig.3, Fig.4 and Fig.5 are all from my personal photo album!
In the toy problem, we try to restore an image from its gradient information. For the gradient calculation, we simply take the forward difference along both axes.
The details of my implementation are:
- Objectives
- $L_1 = \left((v(x+1,y) - v(x,y)) - (s(x+1,y) - s(x,y))\right)^2$
- $L_2 = \left((v(x,y+1) - v(x,y)) - (s(x,y+1) - s(x,y))\right)^2$
- Conditions
- v(0,0)=s(0,0). It can be any pixel from the source image.
- Solver: The lsqr solver from the scipy package. It is an iterative solver for least-squares problems.
- Efficiency: To speed up the computation, an alternative version with a sparse matrix representation is also created with csc_matrix from scipy. The runtime comparison is shown in the table near the end of this report.
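The setup listed above can be sketched roughly as follows. This is a minimal illustration, not the report's actual code: the function name `reconstruct_from_gradients`, the explicit Python loops, and the tight `atol`/`btol` tolerances are assumptions made for the example.

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import lsqr


def reconstruct_from_gradients(s):
    """Rebuild a 2D image from its forward differences plus one anchor pixel."""
    s = np.asarray(s, dtype=float)
    h, w = s.shape
    idx = np.arange(h * w).reshape(h, w)  # maps pixel (y, x) to an unknown index

    rows, cols, vals, b = [], [], [], []

    def add_difference(plus, minus, target):
        # One least-squares row: v[plus] - v[minus] = target
        r = len(b)
        rows.extend([r, r])
        cols.extend([plus, minus])
        vals.extend([1.0, -1.0])
        b.append(target)

    # Forward differences along x: v(x+1, y) - v(x, y)
    for y in range(h):
        for x in range(w - 1):
            add_difference(idx[y, x + 1], idx[y, x], s[y, x + 1] - s[y, x])

    # Forward differences along y: v(x, y+1) - v(x, y)
    for y in range(h - 1):
        for x in range(w):
            add_difference(idx[y + 1, x], idx[y, x], s[y + 1, x] - s[y, x])

    # Anchor condition: v(0, 0) = s(0, 0)
    rows.append(len(b))
    cols.append(idx[0, 0])
    vals.append(1.0)
    b.append(s[0, 0])

    A = csc_matrix((vals, (rows, cols)), shape=(len(b), h * w))
    v = lsqr(A, np.asarray(b), atol=1e-10, btol=1e-10)[0]
    return v.reshape(h, w)
```

Without the anchor row the system only determines the image up to an additive constant, which is why a single pixel value from the source is pinned.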
Here are the test results of this algorithm:
Fig.1 Left: Original image. Right: Restored image.
No. of iterations: 594
Residuals (1-Norm): 0.0067
We can tell that our implementation successfully restored the image.
Given a source image and a target image T, we want to blend the images within a predefined region S and solve for the new image V. Here, we are calculating the gradients for each pixel in all 4 directions. The objectives are minimized for each channel separately. The obtained new images are stitched together for the final RGB output.
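The per-channel processing described above could be wrapped along these lines. This is only a sketch: `blend_rgb` and the `blend_channel` callable are hypothetical names, and clipping to [0, 1] assumes float images in that range.

```python
import numpy as np


def blend_rgb(source, target, mask, blend_channel):
    """Blend each colour channel independently and stack the results.

    blend_channel is any (hypothetical) function that takes
    (source_channel, target_channel, mask) and returns the blended channel.
    """
    channels = [
        blend_channel(source[..., c], target[..., c], mask)
        for c in range(source.shape[-1])
    ]
    # Assumption: images are floats in [0, 1]; the solver may overshoot slightly.
    return np.clip(np.dstack(channels), 0.0, 1.0)
```

Any single-channel solver with that signature can be passed in as `blend_channel`.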
The details of my implementation are:
- Objectives
- $L = \sum_{i \in S,\, j \in N_i \cap S} \left((v(i) - v(j)) - (s(i) - s(j))\right)^2 + \sum_{i \in S,\, j \in N_i \cap \neg S} \left((v(i) - t(j)) - (s(i) - s(j))\right)^2$
- Conditions
- v(i)=t(i), for i∈¬S. We just directly copy the values from the target image if it is outside of region S.
- Solver: The lsqr solver from the scipy package. It is an iterative solver for least-squares problems.
- Gradient Calculation: Finite differences toward all 4 neighbors: v(i+1,j)−v(i,j), v(i−1,j)−v(i,j), v(i,j+1)−v(i,j), v(i,j−1)−v(i,j).
- Efficiency: To speed up the computation, an alternative version with a sparse matrix representation is also created with csc_matrix from scipy. The runtime comparison is shown in the table near the end of this report.
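For one colour channel, the objective and boundary condition above might be assembled as in the sketch below. It is an illustration under stated assumptions rather than the report's code: the name `poisson_blend_channel` is hypothetical, the source and target are assumed to be already aligned and of equal shape, and the mask is assumed not to touch the image border.

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import lsqr


def poisson_blend_channel(s, t, mask):
    """Blend channel s into t inside the boolean mask (region S)."""
    s = np.asarray(s, dtype=float)
    t = np.asarray(t, dtype=float)
    ys, xs = np.nonzero(mask)
    index = -np.ones(mask.shape, dtype=int)
    index[ys, xs] = np.arange(len(ys))  # unknowns exist only inside S

    rows, cols, vals, b = [], [], [], []
    for y, x in zip(ys, xs):
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            grad = s[y, x] - s[ny, nx]  # s(i) - s(j)
            r = len(b)
            rows.append(r)
            cols.append(index[y, x])
            vals.append(1.0)
            if mask[ny, nx]:
                # Neighbour inside S: v(i) - v(j) = s(i) - s(j)
                rows.append(r)
                cols.append(index[ny, nx])
                vals.append(-1.0)
                b.append(grad)
            else:
                # Neighbour outside S: v(i) = s(i) - s(j) + t(j)
                b.append(grad + t[ny, nx])

    A = csc_matrix((vals, (rows, cols)), shape=(len(b), len(ys)))
    v = lsqr(A, np.asarray(b), atol=1e-10, btol=1e-10)[0]

    out = t.copy()  # v(i) = t(i) outside S
    out[ys, xs] = v
    return out
```

A quick sanity check for such a routine: if the source equals the target plus a constant, the source gradients match the target gradients, so the blended result should reproduce the target.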
Here are the test results of this algorithm for the given images:
Fig.2 (with its Source and Target images)
No. of iterations: R:282,G:285,B:286
Residuals (1-Norm): R:2.50,G:2.42,B:2.39
We can tell from Fig.2 that the two images are blended well, without any evident defects. More results on other downloaded images are shown below:
Fig.3 (with its Source and Target images)
No. of iterations: R:237,G:237,B:237
Residuals (1-Norm): R:2.83,G:1.91,B:1.78
Fig.4 (with its Source and Target images)
No. of iterations: R:509,G:514,B:515
Residuals (1-Norm): R:0.53,G:0.31,B:0.27
Fig.5 (with its Source and Target images)
No. of iterations: R:496,G:497,B:496
Residuals (1-Norm): R:3.74,G:3.01,B:2.74
Fig.6 (with its Source and Target images)
No. of iterations: R:315,G:312,B:306
Residuals (1-Norm): R:5.16,G:5.18,B:5.09
Fig.3 and Fig.4 show decent blending results. However, the results in Fig.5 and Fig.6 are not as good: the background in region S looks blurred. The main cause is that the gradients (textures) in the target image are mostly larger than those in the source image. Poisson blending preserves only the source gradients inside S, so the target's texture is lost there and the result looks blurred. We can conclude that Poisson blending is not suitable when the gradients in the source image are smaller than those in the target image. One possible way to resolve this issue is shown in Bells & Whistles.
The test results above reveal a limitation of Poisson blending. A better option is to keep, for each pixel pair, whichever of the source and target gradients has the larger magnitude. In this way, we keep the prominent textures of both images and avoid blurred results.
The details of my implementation are:
- Objectives
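- $L = \sum_{i \in S,\, j \in N_i \cap S} \left((v(i) - v(j)) - d_{ij}\right)^2 + \sum_{i \in S,\, j \in N_i \cap \neg S} \left((v(i) - t(j)) - d_{ij}\right)^2$
here, $d_{ij} = s(i) - s(j)$ if $|s(i) - s(j)| \ge |t(i) - t(j)|$
$d_{ij} = t(i) - t(j)$ if $|s(i) - s(j)| < |t(i) - t(j)|$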
- Conditions
- v(i)=t(i), for i∈¬S. We just directly copy the values from the target image if it is outside of region S.
- Solver: The lsqr solver from the scipy package. It is an iterative solver for least-squares problems.
- Gradient Calculation: Finite differences toward all 4 neighbors: v(i+1,j)−v(i,j), v(i−1,j)−v(i,j), v(i,j+1)−v(i,j), v(i,j−1)−v(i,j).
- Efficiency: To speed up the computation, an alternative version with a sparse matrix representation is also created with csc_matrix from scipy. The runtime comparison is shown in the table near the end of this report.
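The only change relative to Poisson blending is how the guidance value d_ij is chosen. A small sketch of that selection rule is given below; `mixed_gradient` is a hypothetical helper name, and the example uses horizontal neighbor differences of two tiny made-up single-channel images.

```python
import numpy as np


def mixed_gradient(ds, dt):
    """Pick d_ij: the source difference unless the target one is strictly larger in magnitude."""
    ds = np.asarray(ds, dtype=float)
    dt = np.asarray(dt, dtype=float)
    return np.where(np.abs(ds) >= np.abs(dt), ds, dt)


# np.diff along axis 1 gives value(x+1) - value(x), i.e. s(i) - s(j)
# with i the right pixel and j its left neighbour.
s = np.array([[0.0, 1.0, 1.0, 4.0]])
t = np.array([[0.0, 3.0, 2.0, 2.5]])
d = mixed_gradient(np.diff(s, axis=1), np.diff(t, axis=1))
# d == [[3., -1., 3.]]
```

Ties go to the source difference, which matches the "≥" in the definition above.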
Here are the test results of this algorithm for some of the images used above:
Fig.7
No. of iterations: R:310,G:306,B:302
Residuals (1-Norm): R:7.83,G:7.85,B:7.73
Fig.8
No. of iterations: R:227,G:230,B:232
Residuals (1-Norm): R:5.78,G:3.87,B:3.27
Fig.9
No. of iterations: R:482,G:483,B:483
Residuals (1-Norm): R:8.26,G:6.72,B:6.44
All 3 comparisons show that the blurred regions in the Poisson blending results are largely resolved. The blended images look more natural because the textures are preserved. However, there is still a visual defect in Fig.9: the horizon line is hidden by the moon. A potential strategy to tackle this problem is to define a threshold value Th based on the intensity of the target region. If t(i)<Th, we calculate the gradient as v(i)−t(j). I tested my hypothesis with the setup below:
- The background image is converted to LAB space so that the L channel can be extracted to obtain the brightness of each pixel.
- Th=5%×max(L)
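The threshold in the setup above can be computed as in this small sketch. The helper name `dark_pixel_mask` is hypothetical, and the lightness array is assumed to have been extracted beforehand (for instance with `cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)[..., 0]`); how the resulting mask feeds into the objective follows the rule stated in the text and is not shown here.

```python
import numpy as np


def dark_pixel_mask(lightness, fraction=0.05):
    """Return (mask, Th), where Th = fraction * max(L) and mask marks pixels with L < Th."""
    lightness = np.asarray(lightness, dtype=float)
    th = fraction * lightness.max()
    return lightness < th, th
```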
The resulting blended image is shown below. We can tell that it is visually improved compared with the original mixed gradients result.
Fig.10
When converting a color image to grayscale, we usually lose important information from color contrast, which makes the resulting image difficult to understand. In this part, we want to generate a better grayscale image by starting from the ordinary grayscale conversion and preserving the gradient information from a selected channel.
Panels: Color image; Grayscale from cv2 package.
First of all, we need to choose which channel we should use to obtain the gradient information. Here, we convert the original RGB image to HSV and run the Sobel filter on each channel. The obtained gradient images are:
Panels: H gradients; S gradients; V gradients.
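A per-channel gradient image of this kind could be produced as sketched below. The report does not say which Sobel implementation was used, so `scipy.ndimage.sobel` is a stand-in here, and `sobel_magnitude` is a hypothetical helper; the H, S and V channels are assumed to come from a prior conversion such as `cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)`.

```python
import numpy as np
from scipy import ndimage


def sobel_magnitude(channel):
    """Gradient magnitude of one channel from horizontal and vertical Sobel responses."""
    c = np.asarray(channel, dtype=float)
    gx = ndimage.sobel(c, axis=1)  # derivative along x (columns)
    gy = ndimage.sobel(c, axis=0)  # derivative along y (rows)
    return np.hypot(gx, gy)
```

Comparing the three magnitude images side by side is one way to judge which channel carries the contrast that the plain grayscale conversion loses.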
Using the cv2 grayscale image as the background, we choose each of the H, S and V channels as the foreground and generate the blended image with the mixed gradient method described above. Among the results, the one using the V channel performs best. We can tell from the results below that "35" is still recognizable after my grayscale conversion.
Interestingly, I also found an acceptable result when choosing the S channel as the background and V channel as the foreground. Although the digits are also recognizable, the textures/gradients in the digit region are not perfectly preserved.
Using the same strategy described above, some more test results are shown below:
We can tell from the results that the resulting grayscale images are decent when the digits are colorful but the background is less saturated. However, for some downloaded images with both colorful digits and background, the digits are not as recognizable.
| Operations | Regular Storage (s) | Sparse Storage (s) |
|---|---|---|
| Toy problem | 9.49 | 0.49 |
| Poisson Blending | 285.50 | 2.04 |
| Mixed Gradients | 287.09 | 2.47 |
| Color2Gray | N/A | 0.89 |
- The basic methods are inspired by CMU 16-726 and Berkeley CS194-26.
- The background images in Fig.3, Fig.4 and Fig.5 are from my personal photo album. All rights reserved.