Assignment #2 - Gradient Domain Fusion

Overview

In this assignment, we experimented with gradient-domain processing, a technique for seamlessly blending an object from a source image into a target image with a different background. Using the provided starter code, we draw a mask around the object we wish to extract, position it where we would like it to go in the target image, and then run the Poisson blending algorithm on the image.

With Poisson blending, the goal is to find pixel intensities in the target image that maximally preserve the gradient of the region clipped from the source image. Without other constraints, this may cause the source region to change color, but in this experiment we are more focused on the success of the blending technique itself. Each pixel yields constraints in a massive least-squares problem, where we minimize ||Av - b||^2. Here, A is a sparse matrix whose number of rows equals the number of constraints in the system and whose number of columns equals the number of pixels in the entire image; each column index maps to a pixel in the flattened version of the image. Naturally, v is the vector we wish to solve for, which in this case holds the unknown pixel intensities, and b is the known vector of gradient values.
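As a small illustration of this setup (not the assignment's actual system), here is a toy least-squares problem built with a scipy.sparse lil_matrix and solved with lsqr; the constraints and variable names are hypothetical:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lsqr

# Tiny illustrative system: 3 constraints over 2 unknowns.
# Each row of A is one constraint; lsqr minimizes ||Av - b||^2.
A = sp.lil_matrix((3, 2))
A[0, 0] = 1.0                  # constraint: v0      = 1
A[1, 1] = 1.0                  # constraint: v1      = 2
A[2, 0], A[2, 1] = 1.0, -1.0   # constraint: v0 - v1 = -1 (a "gradient" constraint)
b = np.array([1.0, 2.0, -1.0])

v = lsqr(A.tocsr(), b)[0]      # convert to CSR before solving
print(v)                       # ≈ [1., 2.]
```

In the real problem, A has one column per image pixel and one row per gradient constraint, but the solve looks the same.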

The formal blending constraints can be represented by this equation:

v = argmin_v  Σ_{i ∈ S, j ∈ N_i ∩ S} ((v_i − v_j) − (s_i − s_j))²  +  Σ_{i ∈ S, j ∈ N_i, j ∉ S} ((v_i − t_j) − (s_i − s_j))²

where i is a pixel in the source region S, j is a 4-neighbor of i, s represents the pixel intensities of the source region, and t the pixel intensities of the target image.

Toy Problem Result

The preliminary task was to reconstruct a small image using solely the gradient plus one pixel intensity of the original image. Here is the original toy problem image, followed by the reconstruction, for completeness.
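A minimal sketch of this reconstruction, assuming a grayscale image and anchoring the top-left pixel (the function name and structure here are my own, not the starter code's):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lsqr

def reconstruct_from_gradients(img):
    """Rebuild a grayscale image from its x/y gradients plus one anchor pixel."""
    h, w = img.shape
    idx = np.arange(h * w).reshape(h, w)   # maps 2D coordinates -> flattened index
    rows, cols, vals, b = [], [], [], []
    eq = 0
    # x-gradient constraints: v[y, x+1] - v[y, x] = s[y, x+1] - s[y, x]
    for y in range(h):
        for x in range(w - 1):
            rows += [eq, eq]; cols += [idx[y, x + 1], idx[y, x]]
            vals += [1.0, -1.0]; b.append(img[y, x + 1] - img[y, x]); eq += 1
    # y-gradient constraints: v[y+1, x] - v[y, x] = s[y+1, x] - s[y, x]
    for y in range(h - 1):
        for x in range(w):
            rows += [eq, eq]; cols += [idx[y + 1, x], idx[y, x]]
            vals += [1.0, -1.0]; b.append(img[y + 1, x] - img[y, x]); eq += 1
    # anchor the top-left pixel so the solution is unique
    rows.append(eq); cols.append(idx[0, 0]); vals.append(1.0)
    b.append(img[0, 0]); eq += 1
    A = sp.csr_matrix((vals, (rows, cols)), shape=(eq, h * w))
    v = lsqr(A, np.array(b))[0]
    return v.reshape(h, w)
```

Because the gradients and the anchor pixel fully determine the image, the least-squares solution should match the original up to numerical error.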

Favorite Blending Result

Below is the source image for my favorite blending result - a field of cows. From this field, I first cropped out one of the cows.

Below is the target image, a random villa in the Dominican Republic. I wanted the final blending result to look as authentic as possible, so I tried to find a reasonably similar grass texture, and put the cow on the front lawn.

Here is the result of aligning one of the cows on the lawn in the target picture. This is with the source pixels copied directly into the target image, with no blending.

And lastly, here is the final blending result. I knew that the result of the algorithm would likely change the color of the cow, but interestingly enough, that color turned out to be brown - the only other color realistic for a cow! As a result, the final image looks surprisingly realistic.

The algorithm works by computing the solution to the least-squares system defined above. Using the provided alignment starter code, we first obtain the mask for the source image. The next step in code is to define the pixel mapping given in the toy problem, which easily converts between the flattened and 2D versions of the image. Then, looping over the pixels in the 2D image, we add one constraint to the least-squares problem for each 4-neighbor of every pixel where the mask is set to True. We then compute the gradient of the source image and set the corresponding value in the b vector. The scipy.sparse lil_matrix makes constructing the large sparse matrix far more efficient. The vector of solved pixel values is then reshaped back to the 2D image dimensions (unflattened). Finally, the blended region is composited into the target: the element-wise product of the boolean mask and the blended pixels is added to the element-wise product of the negated mask and the target background. The computation is performed separately for each of the three color channels.
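The steps above can be sketched roughly as follows for a single color channel. This is a simplified, hypothetical version of the pipeline: it assumes the source and target are already aligned to the same shape, and that the mask does not touch the image border.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lsqr

def poisson_blend_channel(source, target, mask):
    """Blend one channel of `source` into `target` over the boolean `mask`.
    Hypothetical sketch: assumes aligned images and a mask away from the border."""
    h, w = target.shape
    idx = np.arange(h * w).reshape(h, w)          # flattened-pixel index map
    n_eq = 4 * int(mask.sum())                    # 4 constraints per masked pixel
    A = sp.lil_matrix((n_eq, h * w))
    b = np.zeros(n_eq)
    eq = 0
    ys, xs = np.nonzero(mask)
    for y, x in zip(ys, xs):
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):   # 4-neighbors
            ny, nx = y + dy, x + dx
            grad = source[y, x] - source[ny, nx]            # source gradient
            A[eq, idx[y, x]] = 1.0
            if mask[ny, nx]:
                A[eq, idx[ny, nx]] = -1.0                   # both pixels unknown
                b[eq] = grad
            else:
                b[eq] = grad + target[ny, nx]               # neighbor is a known
            eq += 1                                         # target intensity
    v = lsqr(A.tocsr(), b)[0].reshape(h, w)
    return np.where(mask, v, target)   # composite: blended inside, target outside
```

A full-color blend would call this once per channel and stack the results.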

More Results

Here are some more results for Poisson blending, each shown with a "before" image where the source pixels are copied directly into the target, so that it is clear which is the source and which is the target image. Here is one of a sea monster attacking the Amalfi coast. It is amazing to me that the glimmer of light on the monster's tail blends so well with the target image.

And another of a food mart placed randomly in the middle of a desert.

Here is one that didn't work quite as well: an image of someone walking back to their overwater bungalow in Bora-Bora. For this blending example, I deliberately introduced a vastly different background texture between the source and target images, since many of the previous images had the benefit of similar backgrounds. Interestingly, I had first tried to use the above image of the food mart and the desert to deliberately create a poor blending result, but that result actually looked decent due to the similarity in color between the food mart and the desert landscape. In the Bora-Bora example, I tried to see just how far the blending could be pushed, and as seen below, it did not work for these vastly different textures. Instead, the source image simply looks somewhat dimmed and glued onto the target. Otherwise, the blending technique works quite well in producing realistic images, aside from the possible color change in the source region.

Extra Credit

For the extra credit, I attempted the "Mixed Gradients" task, changing the least-squares problem so that each constraint uses whichever of the source or target gradients has the larger magnitude. For this task, I took a different approach: instead of blending photographs, I wanted to see the result of blending paintings. I clipped a sketch of an old car and blended it into the famous Edward Hopper painting "Gas". Here is the side-by-side comparison of directly copying the pixels vs. the mixed-gradient blending method.
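The mixed-gradients rule can be sketched as a small helper that would replace the plain source gradient inside the blending loop; the function name and arguments here are hypothetical:

```python
import numpy as np

def mixed_gradient(source, target, y, x, ny, nx):
    """Mixed-gradients rule: for the pixel pair (y, x) and its neighbor (ny, nx),
    keep whichever gradient (source or target) has the larger magnitude.
    Hypothetical helper, not the assignment's actual code."""
    ds = source[y, x] - source[ny, nx]   # source-image gradient
    dt = target[y, x] - target[ny, nx]   # target-image gradient
    return ds if abs(ds) > abs(dt) else dt
```

Using this value for b in place of the source gradient lets strong target-image structure, such as texture visible through a semi-transparent region, survive the blend.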