Assignment #1 - Colorizing the Prokudin-Gorskii Photo Collection

Submitted by Vivek Aswal on 02/16/2022

Objective:

This homework first aligns the individual single-channel glass-plate images to construct an RGB image. Image pyramids then make it feasible to build hi-res color images. Additional image processing techniques are encouraged to remove the remaining visual artifacts.

Brief Overview:

For low-res images (cathedral.jpg), single-scale NCC (normalized cross-correlation) is a convenient metric for aligning the color channels. NCC is preferred because it normalizes for the different brightness intensities across the color channels. We search exhaustively over a small window of [-15, 15] pixels in both the x and y directions.
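A minimal sketch of the single-scale search described above, written with NumPy. The function names `ncc` and `align_single_scale` are hypothetical, NCC is taken here to be the dot product of the two L2-normalized images, and `np.roll` (which wraps pixels around the border) is an assumed way of applying a shift; the original implementation may differ on all three points.

```python
import numpy as np

def ncc(a, b):
    """Dot product of the two images after L2 normalization (assumed NCC definition)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def align_single_scale(channel, reference, radius=15):
    """Return the (dy, dx) shift of `channel` that maximizes NCC against `reference`."""
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps around the border; an assumption made for simplicity.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(shifted, reference)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

With the default `radius=15` this evaluates 31 x 31 = 961 candidate shifts, which is cheap for a small image but grows quickly with the window size.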

Single Scale Color Channel Alignment

However, an exhaustive search over a larger window is too computationally expensive for hi-res images, which motivates the use of image pyramids. An image pyramid first downsamples the image to the coarsest resolution, where a search over the window gives a coarse alignment estimate. As we move back up to finer resolutions, we fine-tune this estimate at each level, which saves most of the computational cost.
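The coarse-to-fine idea can be sketched as a short recursion. Everything named here is an assumption for illustration: `downsample` uses 2x2 block averaging, the recursion stops once the shorter side is at most `min_size` pixels, and the refinement window `fine_radius` is 2 pixels per level. None of these values come from the original implementation.

```python
import numpy as np

def ncc(a, b):
    """Dot product of the two images after L2 normalization (assumed NCC definition)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def search(channel, reference, center, radius):
    """Exhaustive NCC search over shifts within `radius` of `center`."""
    best_score, best_shift = -np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            score = ncc(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (odd edges are trimmed)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def align_pyramid(channel, reference, min_size=64, coarse_radius=15, fine_radius=2):
    """Estimate the shift at the coarsest level, then refine it at each finer level."""
    if min(reference.shape) <= min_size:
        return search(channel, reference, (0, 0), coarse_radius)
    dy, dx = align_pyramid(downsample(channel), downsample(reference),
                           min_size, coarse_radius, fine_radius)
    # A shift of (dy, dx) at half resolution corresponds to (2*dy, 2*dx) here.
    return search(channel, reference, (2 * dy, 2 * dx), fine_radius)
```

Only the coarsest level pays for the full window; each finer level evaluates just (2 * fine_radius + 1)^2 shifts around the doubled estimate.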

Multi-Scale Color Channel Alignment

Bells and Whistles:

PyTorch Implementation

Replaced the CPU-bound NumPy routines with PyTorch tensors, which support fast, parallelized execution on the GPU. This reduced the running time from ~10 mins to ~1 min.

Automatic Cropping

Used the NCC metric along each row/column to find the rows/cols to remove: a low correlation (dot product) means the individual channels of the color image are not well aligned there. However, the NCC of black/white regions is 1, while that of other colored areas is lower. Therefore, rows/cols with abnormal NCC values (>0.9999 or <0.85) were removed. To locate each border, I took the innermost row/col after which there were at least 50 rows/cols with good, realistic NCC values; this was done for both the R and G channels w.r.t. the B channel. Finally, the color image was cropped to the intersection of the two regions found this way (one for R, one for G), i.e., the tightest region.
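One possible reading of this procedure is sketched below. The helper names (`line_ncc`, `first_good_run`, `crop_bounds`, `intersect`) are hypothetical, and two details are assumptions rather than facts from the write-up: the 50 good rows/cols are required to be consecutive, and an all-zero row/col is scored as 1 so that it is treated as border.

```python
import numpy as np

def line_ncc(a, b, axis):
    """Normalized dot product between matching rows (axis=1) or columns (axis=0)."""
    num = (a * b).sum(axis=axis)
    denom = np.linalg.norm(a, axis=axis) * np.linalg.norm(b, axis=axis)
    # Assumption: all-zero (pure black) lines get a score of 1, i.e. border-like.
    return np.divide(num, denom, out=np.ones_like(num), where=denom > 0)

def first_good_run(good, run=50):
    """Index where the first streak of `run` consecutive True values starts, else 0."""
    count = 0
    for i, ok in enumerate(good):
        count = count + 1 if ok else 0
        if count == run:
            return i - run + 1
    return 0

def crop_bounds(channel, blue, low=0.85, high=0.9999, run=50):
    """Return (top, bottom, left, right) such that img[top:bottom, left:right] is kept."""
    channel, blue = channel.astype(np.float64), blue.astype(np.float64)
    bounds = []
    for axis in (1, 0):  # axis=1 scores rows, axis=0 scores columns
        scores = line_ncc(channel, blue, axis)
        good = (scores >= low) & (scores <= high)
        start = first_good_run(good, run)
        end = len(good) - first_good_run(good[::-1], run)
        bounds += [start, end]
    return tuple(bounds)

def intersect(b1, b2):
    """Tightest region contained in both crops."""
    return (max(b1[0], b2[0]), min(b1[1], b2[1]), max(b1[2], b2[2]), min(b1[3], b2[3]))
```

In this sketch the final crop would be `intersect(crop_bounds(r, b), crop_bounds(g, b))`, applied to all three aligned channels.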

The NCC-based procedure does a pretty decent job overall, but the NCC cutoffs are data sensitive. It works well for a small number of images, but across a larger set it could let more colored edges leak through, particularly edges with a more gradual blending effect. An alternative approach would be to use gradient filters to detect and remove the edges along the borders of the image.

Automatic Contrasting

Implemented automatic contrasting with a simple idea: subtract the minimum pixel value so that the minimum intensity becomes 0, then divide by the resulting maximum pixel value so that the maximum intensity becomes 1. For more drastic, non-linear contrast stretching, used gamma correction, which applies a power-law transformation to the pixel intensities. The image below shows a gamma correction value of 1.3. The images on the left are the original cropped images, and those on the right are the images with automatic contrasting.
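The two steps can be sketched in a few lines of NumPy. The function name `auto_contrast` is hypothetical, and applying gamma as `stretched ** gamma` is an assumed convention; some implementations use the exponent `1 / gamma` instead.

```python
import numpy as np

def auto_contrast(img, gamma=1.0):
    """Linearly stretch intensities to [0, 1], then apply a power-law (gamma) curve."""
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:
        # A constant image has no contrast to stretch.
        return np.zeros_like(img)
    stretched = (img - lo) / (hi - lo)
    # Assumed convention: output = input ** gamma, so gamma > 1 darkens midtones.
    return stretched ** gamma
```

For example, `auto_contrast(img, gamma=1.3)` would correspond to the gamma value quoted above under this convention.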

16726-Learning based Image Synthesis