16-726: Assignment #1 - Colorizing the Prokudin-Gorskii Photo Collection
Rawal Khirodkar
1. Sequential Image Alignment: I used a central crop to manually discard the borders (they are trouble!), keeping roughly the central 70% of each image. I found Normalized Cross-Correlation (NCC) to perform better than the Sum of Squared Differences (SSD) distance. For offset computation, the search is performed within a [-15, 15] pixel window, aligning the red or green channel image to the blue channel image. The distances are computed sequentially; this could be parallelized for further speedup. Here is the result on the low-resolution cathedral.jpg:

Cathedral
R: [12 3], G: [5 2]
This sequential implementation is far too slow on the high-resolution '.tif' images, which motivates the next section.
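
The search described above can be sketched roughly as follows. This is a minimal NumPy illustration, not the code used for the results: the function names, the use of np.roll for shifting, the zero-mean form of NCC, and the [dy dx] (row, column) offset convention are all assumptions.

```python
import numpy as np

def central_crop(img, frac=0.7):
    """Keep the central `frac` of the image along each axis (drops the borders)."""
    h, w = img.shape
    dh, dw = int(h * (1 - frac) / 2), int(w * (1 - frac) / 2)
    return img[dh:h - dh, dw:w - dw]

def ssd(a, b):
    """Sum of squared differences (lower is better)."""
    return float(((a - b) ** 2).sum())

def ncc(a, b):
    """Zero-mean normalized cross-correlation (higher is better)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def align(source, target, radius=15):
    """Try every (dy, dx) in [-radius, radius]^2 and return the one with the best NCC."""
    tgt = central_crop(target)
    best_score, best_offset = -np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Circular shift; the central crop hides the wrapped-around pixels
            shifted = np.roll(source, (dy, dx), axis=(0, 1))
            score = ncc(central_crop(shifted), tgt)
            if score > best_score:
                best_score, best_offset = score, (dy, dx)
    return best_offset
```

With this convention, align(red, blue) would return the shift to apply to the red channel so that it lines up with blue. The cost grows with the square of the search radius times the image area, which is why the full-resolution scans are slow.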

2. Image Pyramid for Image Alignment: I used a 4-level image pyramid to significantly reduce the search space. The pyramid is constructed by repeatedly downsampling both the target and source images by a factor of 2. At each level of the pyramid, the sequential implementation described above computes the best offset within a search window. The alignment is performed in a [-15, 15] window at the coarsest (lowest-resolution) level and within a [-5, 5] window at all other levels. The results are shown below.
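
A coarse-to-fine search of this kind might look like the sketch below. It is an illustration under assumptions, not the submitted code: the 2x2 block averaging used for downsampling, counting the full-resolution image as one of the 4 levels, and all function names are hypothetical choices.

```python
import numpy as np

def ncc(a, b):
    """Zero-mean normalized cross-correlation (higher is better)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def central_crop(img, frac=0.7):
    h, w = img.shape
    dh, dw = int(h * (1 - frac) / 2), int(w * (1 - frac) / 2)
    return img[dh:h - dh, dw:w - dw]

def search(source, target, center, radius):
    """Exhaustive NCC search over offsets within `radius` of `center`."""
    tgt = central_crop(target)
    best_score, best_offset = -np.inf, tuple(center)
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(source, (dy, dx), axis=(0, 1))
            score = ncc(central_crop(shifted), tgt)
            if score > best_score:
                best_score, best_offset = score, (dy, dx)
    return best_offset

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (assumed downsampling scheme)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def pyramid_align(source, target, levels=4, coarse_radius=15, fine_radius=5):
    """Estimate the offset at the coarsest level, then refine it level by level."""
    if levels == 1:
        return search(source, target, (0, 0), coarse_radius)
    dy, dx = pyramid_align(downsample(source), downsample(target),
                           levels - 1, coarse_radius, fine_radius)
    # An offset found at half resolution doubles at the next finer level
    return search(source, target, (2 * dy, 2 * dx), fine_radius)
```

Each finer level only has to correct the rounding error inherited from the level above, so a small window is enough there, while the wide window is paid for only on the smallest image.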

Failure cases: This method (without any bells & whistles) fails to align two images (Emir and Village) but successfully aligns all the others. In these cases the image taken through the red filter is much brighter than those taken through the green and blue filters. The green and blue channels align fairly well because they have similar pixel intensities, but the misaligned red channel creates artifacts in the results. This is fixed in the Bells & Whistles (Extra Credit) section 3 using edge detection and image normalization.

Emir
R: [-172 -46], G: [48 24]

Harvesters
R: [124 14], G: [59 17]

Icon
R: [90 23], G: [41 18]

Lady
R: [97 12], G: [44 8]

Self Portrait
R: [175 37], G: [78 29]

Three Generations
R: [108 12], G: [46 14]

Train
R: [86 32], G: [42 7]

Turkmen
R: [117 29], G: [56 22]

Village
R: [0 -80], G: [0 -10]

Additional Results: Here are some additional results from the Library of Congress archives.

School in the village of Pidma
R: [63 10], G: [26 12]

Island of Capri
R: [101 -11], G: [43 -13]

Wall Painting
R: [-6 26], G: [-20 12]

Blast Furnace
R: [144 35], G: [64 22]
3. Bells and Whistles (Extra Credit):

a) PyTorch implementation: The code was ported to PyTorch.

b) Automatic cropping: I used vertical and horizontal Sobel edge detectors along with a probabilistic Hough transform to automatically detect and crop the borders, which improves alignment. This fixed one of the previous failure cases.

Village (Before)
R: [0 -80], G: [0 -10]

Village (After)
R: [137 21], G: [65 11]
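
To give a feel for border detection from edge responses, here is a simplified NumPy sketch. Note that it does not use a Hough transform: as a stand-in, it averages the Sobel response along each row and column and looks for strong lines near the image boundary. The function names, the 10% search margin, and the relative threshold are assumptions made for illustration.

```python
import numpy as np

def sobel_profiles(img):
    """Mean absolute Sobel response per row (horizontal lines) and per column (vertical lines)."""
    p = np.pad(img.astype(float), 1, mode="edge")
    # 3x3 Sobel kernels written out as sums of shifted slices
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    return np.abs(gy).mean(axis=1), np.abs(gx).mean(axis=0)

def border_cut(profile, max_frac=0.1, rel_thresh=0.5):
    """Return (lo, hi) crop indices just inside the innermost strong line
    found within the outer `max_frac` of the axis on each side."""
    n = len(profile)
    m = max(1, int(n * max_frac))
    thresh = rel_thresh * profile.max()
    lo_hits = np.flatnonzero(profile[:m] >= thresh)
    hi_hits = np.flatnonzero(profile[n - m:] >= thresh)
    lo = lo_hits.max() + 1 if lo_hits.size else 0
    hi = n - m + hi_hits.min() if hi_hits.size else n
    return int(lo), int(hi)

def auto_crop(img):
    """Crop away borders detected from the row and column edge profiles."""
    rows, cols = sobel_profiles(img)
    top, bottom = border_cut(rows)
    left, right = border_cut(cols)
    return img[top:bottom, left:right]
```

A Hough-based detector is more robust to tilted or broken border lines; the profile approach above only works when the borders are close to axis-aligned.
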
c) Automatic contrast: Image intensities are rescaled so that the minimum intensity maps to 0 and the maximum to 1. This preserved finer details in the images but made little difference overall.

Turkmen (Before)
R: [117 29], G: [56 22]

Turkmen (After)
R: [117 29], G: [57 22]
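
The rescaling amounts to a min-max normalization. A minimal sketch, assuming it is applied to each channel separately (the function name is hypothetical):

```python
import numpy as np

def rescale_intensity(channel):
    """Linearly map the channel so its minimum becomes 0 and its maximum becomes 1."""
    channel = channel.astype(np.float64)
    lo, hi = channel.min(), channel.max()
    if hi == lo:
        # A constant image has no contrast to stretch
        return np.zeros_like(channel)
    return (channel - lo) / (hi - lo)
```

Because the map is linear, it changes the dynamic range but not the relative ordering of pixels, which is consistent with the offsets barely moving.
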
d) Better features: I used edges detected by the Canny edge detector as features for alignment, instead of raw pixel intensities. This improved all the previous failure cases.

Emir

Emir Edges

Emir (Before)
R: [-172 -46], G: [48 24]

Emir (After)
R: [107 40], G: [49 24]
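
The idea of aligning on edges rather than intensities can be illustrated with the sketch below. It is not a Canny detector: as a crude stand-in, it thresholds the gradient magnitude to get a binary edge map, then scores candidate offsets by edge overlap. The function names and the relative threshold are assumptions.

```python
import numpy as np

def edge_map(img, rel_thresh=0.2):
    """Binary edge map from thresholded gradient magnitude (a crude stand-in for Canny)."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    if mag.max() == 0:
        return np.zeros_like(mag)
    return (mag >= rel_thresh * mag.max()).astype(float)

def best_offset(source_edges, target_edges, radius=15):
    """Return the (dy, dx) shift of the source that maximizes edge overlap with the target."""
    best_score, best = -np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(source_edges, (dy, dx), axis=(0, 1))
            score = float((shifted * target_edges).sum())
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best
```

Edge maps depend on where intensity changes rather than on how bright a region is, so two channels with very different brightness (as with Emir) can still produce matching features.
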
e) Histogram Equalization: I implemented histogram equalization to generate more natural-looking images. Previous outputs had a strong red, green, or blue tint due to the brightness of one of the channels. Histogram equalization fixes this.

Monastery (Before)

Monastery (After)
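
Histogram equalization can be sketched as mapping each intensity through the cumulative distribution of the channel. The version below is a minimal illustration that assumes intensities in [0, 1], 256 bins, and per-channel application; these details and the function name are assumptions.

```python
import numpy as np

def equalize_hist(channel, bins=256):
    """Map intensities in [0, 1] through the normalized CDF of their histogram."""
    hist, edges = np.histogram(channel, bins=bins, range=(0.0, 1.0))
    cdf = hist.cumsum() / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    # Interpolate the CDF at each pixel value, then restore the image shape
    out = np.interp(channel.ravel(), centers, cdf)
    return out.reshape(channel.shape)
```

After this mapping each channel has an approximately uniform intensity distribution, so no single channel dominates the composite, which is one way to remove a global color tint.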