Gradient Domain Fusion

Carnegie Mellon University, 16-726 Learning-Based Image Synthesis, Spring 2021

Null Reaper (Clive Gomes)

Task

The main goal of this project is to explore gradient-domain processing; specifically, we are interested in using "Poisson Blending" to seamlessly combine multiple images. To achieve this, we shall formulate the Poisson objective function as a least squares problem and solve for the pixel values of the blended image. The original assignment can be found here.

Approach

Before getting into the technical details of this project, let's first look at the inputs needed to run the code.

  1. Foreground Image: This is the image that will be overlaid on top of another. Instead of the complete image, only the object we plan to blend into another image is included; the rest of the image is black.
  2. Background Image: The object from the foreground image will be blended into a specific area of this image.
  3. Mask: The specific area of the background image where the foreground object will be overlaid is represented by the mask (as 1's and 0's).

The Poisson objective function is given below:

Poisson Objective Function

In the equation above, "s" represents the source (foreground) image, "t" denotes the target (background) image, and "v" is the blended image we wish to solve for; "i" denotes a specific pixel in the images, while "j" ranges over the four neighboring pixels of "i" (top, bottom, left and right). One thing that can be inferred from the equation is that the left summation is only applicable when the neighboring pixel (given by "j") lies within the portion of the source image that is blended into the target image (which can be checked by looking at the pixel values of the mask); the right summation is applicable otherwise. Accordingly, we use "if...else" blocks (for each of the 4 neighboring pixels) to decide which summation constraint to use.
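
For reference, the Poisson blending objective is commonly written as shown here. This is a transcription of the standard form, under the assumption that it matches the equation in the figure; the symbols S (the set of pixels inside the mask) and N_i (the four neighbors of pixel i) are notation introduced for this sketch and do not appear in the text.

```latex
\mathbf{v} = \operatorname*{arg\,min}_{\mathbf{v}}
  \sum_{i \in S,\; j \in N_i \cap S} \bigl( (v_i - v_j) - (s_i - s_j) \bigr)^2
  + \sum_{i \in S,\; j \in N_i,\; j \notin S} \bigl( (v_i - t_j) - (s_i - s_j) \bigr)^2
```

In the left summation both v_i and v_j are unknowns; in the right summation the neighbor lies outside the mask, so its value is fixed to the target pixel t_j.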

To solve the objective function, we first rewrite it as a matrix product Av = b. Here, A is a sparse matrix representing which pixels in the image "v" are used in each constraint. Each row of A represents the left-hand side of one constraint; the right-hand side of the constraint is placed in the same row of vector b. (Note, to have each row represent a constraint equation, the 2D image is represented as a 1D vector).
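
The construction of A and b described above might be sketched as follows for a single channel. This is a minimal illustration rather than the project's actual code: the function name `build_system`, the choice to make every pixel of the image an unknown, and the decision to skip neighbors that fall off the image are all assumptions.

```python
import numpy as np
from scipy.sparse import lil_matrix

def build_system(src, tgt, mask):
    """Assemble A and b for one channel (hypothetical helper)."""
    h, w = mask.shape
    idx = np.arange(h * w).reshape(h, w)    # 2-D position -> index in 1-D v
    n_eq = 4 * int(mask.sum())              # at most 4 constraints per masked pixel
    A = lil_matrix((n_eq, h * w))
    b = np.zeros(n_eq)
    e = 0                                   # running constraint (row) counter
    for y, x in zip(*np.nonzero(mask)):
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if not (0 <= ny < h and 0 <= nx < w):
                continue                    # assumption: ignore neighbors off the image
            grad = src[y, x] - src[ny, nx]  # source gradient s_i - s_j
            A[e, idx[y, x]] = 1
            if mask[ny, nx]:
                # neighbor is also unknown: v_i - v_j = s_i - s_j
                A[e, idx[ny, nx]] = -1
                b[e] = grad
            else:
                # neighbor is fixed to the target: v_i = (s_i - s_j) + t_j
                b[e] = grad + tgt[ny, nx]
            e += 1
    return A[:e].tocsr(), b[:e]

# Tiny example: a 3x3 image where only the center pixel is masked.
src = np.arange(9, dtype=float).reshape(3, 3) / 10
tgt = np.full((3, 3), 0.5)
mask = np.zeros((3, 3), dtype=bool)
mask[1, 1] = True
A, b = build_system(src, tgt, mask)
```

In this example all four neighbors of the center pixel lie outside the mask, so every row of A has a single 1 in the column of the center pixel and the target values move to the right-hand side.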

Once we build the sparse matrix A and vector b from the three inputs described earlier, we use the least squares solver function from the SciPy library (scipy.sparse.linalg.lsqr) to get the 1-D vector "v". We then reshape it into a 2-D image with the same dimensions as the target (background) image.
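
As a small illustration of the solve-and-reshape step, the sketch below hand-builds a consistent system for a 2x2 image and passes it to scipy.sparse.linalg.lsqr. The particular constraints and the tighter tolerances are assumptions chosen for the example, not values taken from the project.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import lsqr

h, w = 2, 2  # image size; v has h * w unknowns

# One row per constraint: v0 = 0.2, v1 - v0 = 0.1, v2 - v0 = 0.3,
# v3 - v1 = 0.3, v3 - v2 = 0.1
rows = [0, 1, 1, 2, 2, 3, 3, 4, 4]
cols = [0, 1, 0, 2, 0, 3, 1, 3, 2]
vals = [1, 1, -1, 1, -1, 1, -1, 1, -1]
A = csr_matrix((vals, (rows, cols)), shape=(5, h * w), dtype=float)
b = np.array([0.2, 0.1, 0.3, 0.3, 0.1])

# lsqr returns a tuple; the first element is the least squares solution
v = lsqr(A, b, atol=1e-10, btol=1e-10)[0]

# Reshape the 1-D solution back into a 2-D image
v_img = v.reshape(h, w)
```

Because lsqr is an iterative solver, the result is approximate; the tolerances trade accuracy for run time.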

Finally, we build the final image by taking pixels from the target (background) image for the unmasked region and pixels from the blended image "v" for the masked region. We do this for all 3 color channels (RGB) separately, and return the resulting 3-color blended image.
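
The per-channel compositing step could look roughly like this. The names `composite`, `blend_rgb`, and `solve_channel` are hypothetical; `solve_channel` stands in for whatever single-channel solver is used and is assumed to return an image of the same size as the target.

```python
import numpy as np

def composite(tgt, blended, mask):
    """Keep target pixels outside the mask and blended pixels inside it."""
    m = mask.astype(bool)[..., None]        # broadcast the 2-D mask over channels
    return np.where(m, blended, tgt)

def blend_rgb(src, tgt, mask, solve_channel):
    # Solve each of the 3 color channels independently, then restack them
    channels = [solve_channel(src[..., c], tgt[..., c], mask) for c in range(3)]
    return composite(tgt, np.stack(channels, axis=-1), mask)

# Usage with a stand-in "solver" that just returns the source channel,
# which reduces to a direct copy of the masked pixels.
src = np.full((4, 4, 3), 0.9)
tgt = np.full((4, 4, 3), 0.1)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
out = blend_rgb(src, tgt, mask, lambda s, t, m: s)
```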

A Toy Problem

Before writing the code for the Poisson Blending function described above, we first tested a simpler example. This was done to ensure that we were properly converting the objective function into a sparse matrix A and a vector b.

For this problem, the following single-channel image was used as the input:

Greyscale Image from Toy Story
Figure 1: Toy Problem Input

A set of three simpler objective functions (given below) was used to perform this initial test. Our goal was to find the values of pixels in image "v" that minimize these expressions.

Objective Functions for Toy Problem

The exact same procedure described in the previous section was followed here (the differences being the objective functions and the use of a single-channel, instead of multi-channel, image). The objective functions given in this section define a simple identity operation: looking at the equations, it is clear that the values of pixels in image "v" must equal those of image "s" for the objective functions to reach their minimum, i.e. 0. The result obtained by running the implemented code is given below.

Identical Input and Output of Toy Problem
Figure 2: Toy Problem Input (Left) vs Output (Right)

As can be seen, the output image is identical to the input image. This means that our conversion from the constraint equations to the matrix-multiplication format works as expected.
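
A toy reconstruction of this kind might be sketched as below. The sketch assumes the three objectives are (1) matching the horizontal gradients of "s", (2) matching the vertical gradients of "s", and (3) pinning the top-left pixel of "v" to that of "s"; the function name `toy_reconstruct` and the random test image are illustrative only.

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import lsqr

def toy_reconstruct(s):
    """Recover s from its gradients plus one anchored pixel (hypothetical helper)."""
    h, w = s.shape
    idx = np.arange(h * w).reshape(h, w)         # 2-D position -> index in 1-D v
    n_eq = h * (w - 1) + (h - 1) * w + 1
    A = lil_matrix((n_eq, h * w))
    b = np.zeros(n_eq)
    e = 0
    # Assumed objective 1: horizontal gradients of v match those of s
    for y in range(h):
        for x in range(w - 1):
            A[e, idx[y, x + 1]] = 1
            A[e, idx[y, x]] = -1
            b[e] = s[y, x + 1] - s[y, x]
            e += 1
    # Assumed objective 2: vertical gradients of v match those of s
    for y in range(h - 1):
        for x in range(w):
            A[e, idx[y + 1, x]] = 1
            A[e, idx[y, x]] = -1
            b[e] = s[y + 1, x] - s[y, x]
            e += 1
    # Assumed objective 3: the top-left pixel of v equals that of s
    A[e, idx[0, 0]] = 1
    b[e] = s[0, 0]
    v = lsqr(A.tocsr(), b, atol=1e-10, btol=1e-10)[0]
    return v.reshape(h, w)

s = np.random.default_rng(0).random((5, 6))      # small stand-in for the toy image
v = toy_reconstruct(s)
```

Without the anchoring constraint the gradients would determine "v" only up to a constant offset, which is why a single pixel is pinned.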

Poisson Blending

Now, we get to the main portion of this project: the Poisson Blending task. To start things off, we select the two input images shown below.

Flying Jet Image Snowy Mountain Image
Figure 3: Input Images for Poisson Blending

Our goal is to blend the flying jet image (foreground) into the sky of the snowy mountain image (background) on the right. To do this, we must first create a mask describing the region where we want to blend the images together. We used the "masking_code.py" script (provided with the assignment) to get the following foreground image, background image, and mask, which we will use for our Poisson Blending function.

Foreground Image Background Image Mask
Figure 4: Foreground Image, Background Image, and Mask (from Left to Right)

If we simply copy the pixels in the masked region from the foreground image to the background one, we get the following result:

Imperfectly Blended Image
Figure 5: Direct Copy of Pixels from Foreground to Background Image

Clearly, this is not good enough; hence we need Poisson Blending. The implementation of the Poisson Blending function was exactly as discussed in the "Approach" section (no extra realignment or rotation was performed since the inputs were satisfactory as is). The function performed the blending operation for each of the three color channels, and the resulting blended image was as follows:

Poisson Blended Image
Figure 6: Poisson Blended Output

As can be seen above, the image obtained through Poisson Blending is clearly better than the simple "direct copy" method. Thus, we have successfully implemented the Poisson Blending function.

More Examples of Poisson Blending

Bear in the Pool

Bear Swimming Pool Bear Mask
Figure 7: Foreground Image, Background Image, and Mask (from Left to Right)
Imperfectly Blended Image Poisson Blended Image
Figure 8: Direct Blend (Left) vs Poisson Blend (Right)

Snowy Cat

Image of Cat Wearing a Cap Image of Snowy Mountain
Figure 9: Input Images
Aligned Image of Cat Wearing a Cap Image of Snowy Mountain Cat in Snowy Mountain Image Mask
Figure 10: Foreground Image, Background Image, and Mask (from Left to Right)
Imperfectly Blended Image Poisson Blended Image
Figure 11: Direct Blend (Left) vs Poisson Blend (Right)

Space Shark

Image of Shark Underwater Image of Planet in Space
Figure 12: Input Images
Aligned Image of Shark in Space Image of Planet in Space Shark in Space Mask
Figure 13: Foreground Image, Background Image, and Mask (from Left to Right)
Imperfectly Blended Image Poisson Blended Image
Figure 14: Direct Blend (Left) vs Poisson Blend (Right)

Meloetta in the Park

Image of Meloetta Image of a Park
Figure 15: Input Images
Aligned Image of Meloetta in the Park Image of a Park Meloetta in the Park Mask
Figure 16: Foreground Image, Background Image, and Mask (from Left to Right)
Imperfectly Blended Image Poisson Blended Image
Figure 17: Direct Blend (Left) vs Poisson Blend (Right)

Pokemon Battle on the Beach

Image of Lycanroc Image of Palossand Image of a Beach
Figure 18: Input Images
Aligned Image of Lycanroc on the Beach Mask for Lycanroc Aligned Image of Palossand on the Beach Mask for Palossand Image of a Beach
Figure 19: Lycanroc Foreground Image & Mask (First Row), Palossand Foreground Image & Mask (Second Row), Background Image (Last Row)
Imperfectly Blended Image Poisson Blended Image
Figure 20: Direct Blend (Left) vs Poisson Blend (Right)

Problems Encountered

One of the main issues with our least squares approach is that the solution obtained using SciPy's lsqr function is not restricted to a fixed range. In our implementation, we used float pixel values ranging from 0.0 to 1.0; however, some of the values in the solution image "v" were less than 0.0 or greater than 1.0. Without clipping the values of "v" (setting anything less than 0.0 to 0.0 and anything greater than 1.0 to 1.0), the exported image file looked distorted (only in the masked region).
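
The clipping step can be expressed with NumPy's np.clip, as in this minimal sketch. The sample values and the conversion to 8-bit for export are assumptions for illustration.

```python
import numpy as np

# Example solver output with values slightly outside the valid [0.0, 1.0] range
v = np.array([[-0.05, 0.2],
              [0.99, 1.3]])

# Set anything below 0.0 to 0.0 and anything above 1.0 to 1.0
v_clipped = np.clip(v, 0.0, 1.0)

# Assumed export step: scale to 8-bit only after clipping, so that
# out-of-range values cannot overflow the 0..255 range
img8 = (v_clipped * 255).round().astype(np.uint8)
```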

Another problem with Poisson Blending is that it does not work well when the foreground object and the background image have very different color palettes. This was not an issue for the "jet flying over the snowy mountain" example in the "Poisson Blending" section; however, the "bear in the pool" case did not turn out as well: since the brown bear is being blended into blue water, the bear in the Poisson-blended output looks darker and bluish. The best way to fix this (if we want to use the same Poisson Blending function) is to select inputs that have similar color palettes. Alternatively, we can manually modify the colors in our input images to make the Poisson Blending result look better; for example, the background color for the Pokemon images was added manually by selecting a color similar to the target image (though this strategy isn't always an option).

Mixed Gradients (Bells & Whistles)

Another approach to blending images is to use whichever gradient is larger, that of the source or that of the target image (instead of always using the gradient in the source image as we did in regular Poisson Blending). This is helpful when we want to retain some of the gradient of the target image, such as when blending something with a uniform/transparent background into another image. For instance, if we want to take text written on a plain background and imprint it onto a wall image, a regular Poisson Blend operation would produce the following result:

Regular Poisson Blended Image
Figure 21: Output of Regular Poisson Blend

The result using mixed gradients is shown below (the implementation simply modifies the Poisson Blending function to add a "max()" operation for selecting the larger gradient).

Mixed Gradients Image
Figure 22: Output of Poisson Blend w/ Mixed Gradients

Clearly, this is much better than the result of the regular Poisson Blending operation. This justifies the need to use mixed gradients for cases like the one discussed.
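
The gradient selection for mixed gradients might be sketched as follows. The text only mentions a "max()" operation; comparing the gradients by absolute value (so that strong negative gradients are also kept) is an assumption here, as is the function name `mixed_gradient`.

```python
import numpy as np

def mixed_gradient(s_i, s_j, t_i, t_j):
    """Return the source or target gradient, whichever has larger magnitude."""
    d_s = s_i - s_j                      # gradient in the source image
    d_t = t_i - t_j                      # gradient in the target image
    # Assumption: compare magnitudes, but keep the sign of the chosen gradient
    return np.where(np.abs(d_s) >= np.abs(d_t), d_s, d_t)

# Flat source (plain background) over a textured target: target gradient wins
g_flat = float(mixed_gradient(0.5, 0.5, 0.9, 0.2))

# Strong source edge (text stroke) over a flat target: source gradient wins
g_edge = float(mixed_gradient(0.1, 0.8, 0.5, 0.5))
```

The selected value would replace the source gradient s_i - s_j on the right-hand side of each constraint; the rest of the system is unchanged.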