Prokudin-Gorskii Image Alignment

Sudeep Dasari

Andrew ID: sdasari

Overview and Approach

Our primary objective was to align color channels from Sergey Prokudin-Gorskii's prototype color camera system! To form a color image, Prokudin-Gorskii shot three separate grayscale images through red, green, and blue color filters. Our mission is to align these channels, using simple similarity metrics like Normalized Cross Correlation (NCC), in order to form a single final color image that can be cleanly displayed.

Of course, the raw digitized sensor readings are very high resolution, which makes computing this metric across all possible shifts intractable. Instead, a coarse-to-fine estimation scheme is used. The image is iteratively downsampled and the entire set of (original and rescaled) images is assembled into a pyramid. An optimal translation (one that maximizes our NCC metric) is found by densely searching shifts at the coarsest scale. That translation is then used as the starting point when solving for the optimal shift at the next finer scale, with progressively fewer candidate translations tried at the larger image scales. This divide-and-conquer approach provides a big runtime win over naive exhaustive search.
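The coarse-to-fine idea can be sketched in a few lines of NumPy. This is a minimal illustration rather than the project's actual code: the helper names (`ncc`, `downsample`, `search`, `pyramid_align`), the `refine_range` parameter, the default values, and the use of wrap-around `np.roll` shifts are all assumptions made to keep the example self-contained. `levels` and `search_range` borrow names mentioned later in this write-up, but their exact meaning in the real implementation may differ.

```python
import numpy as np

def ncc(a, b):
    """Mean-centered normalized cross correlation of two same-sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (an odd last row/column is dropped)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(ref, mov, center, radius):
    """Return the (h, w) shift within `radius` of `center` that maximizes NCC."""
    best_score, best_shift = -np.inf, center
    for dh in range(center[0] - radius, center[0] + radius + 1):
        for dw in range(center[1] - radius, center[1] + radius + 1):
            # np.roll wraps pixels around the edges (a simplification).
            score = ncc(ref, np.roll(mov, (dh, dw), axis=(0, 1)))
            if score > best_score:
                best_score, best_shift = score, (dh, dw)
    return best_shift

def pyramid_align(ref, mov, levels=3, search_range=8, refine_range=2):
    """Estimate the (h, w) shift that aligns `mov` to `ref`, coarse to fine."""
    # Build the pyramid: index 0 is full resolution, the last entry is coarsest.
    pyramid = [(ref, mov)]
    for _ in range(levels):
        r, m = pyramid[-1]
        pyramid.append((downsample(r), downsample(m)))

    # Dense search at the coarsest scale only.
    shift = search(*pyramid[-1], center=(0, 0), radius=search_range)

    # Walk back up: double the estimate, then refine within a small window.
    for r, m in reversed(pyramid[:-1]):
        shift = search(r, m, center=(2 * shift[0], 2 * shift[1]), radius=refine_range)
    return shift
```

With `levels=0` the loop over finer scales is empty, so the function falls back to a single dense search at full resolution. Each extra level halves the image, so a dense window of radius `search_range` at the coarsest scale covers roughly `search_range * 2**levels` pixels of the original image at a small fraction of the cost.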

Algorithm

The final alignment algorithm is easy to summarize with pseudocode. The procedure is run twice: once to align the red channel to blue and once to align the green channel to blue. Note that the channels are cropped before alignment (10% is chopped off each side of the image) to prevent noise from the border from corrupting the metric. This cropping could be automated to avoid discarding important image content. As an additional optimization, I mean-center the channels before computing NCC.
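The two preprocessing details mentioned here, the 10% border crop and the mean-centering before NCC, might look like the following. The function names `crop_border` and `ncc` and the `frac` parameter are hypothetical, chosen only for this sketch.

```python
import numpy as np

def crop_border(channel, frac=0.10):
    """Drop `frac` of the image from every side so border noise is ignored."""
    h, w = channel.shape
    dh, dw = int(h * frac), int(w * frac)
    return channel[dh:h - dh, dw:w - dw]

def ncc(a, b):
    """Normalized cross correlation of two same-sized arrays after mean-centering."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # Guard against constant (zero-variance) inputs.
    return float((a * b).sum() / denom) if denom > 0 else 0.0
```

Mean-centering makes the score insensitive to a constant brightness offset between channels, and the normalization removes the effect of a global gain, so `ncc(x, 2 * x + 5)` scores the same as `ncc(x, x)`.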

Once aligned, the red, green, and blue channels are assembled into a single RGB image and saved as a JPG. The code is vectorized and parallelized to achieve a runtime of around 1 minute.
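As a rough illustration of the assembly step, the sketch below applies the estimated [h, w] shifts to the red and green channels and stacks everything in RGB order. The names `assemble_rgb` and `to_uint8` are hypothetical, the wrap-around behavior of `np.roll` is an assumption about how shifts are applied, and the input is assumed to be floating point in [0, 1]. Writing the JPG itself would need an image I/O library and is left out.

```python
import numpy as np

def assemble_rgb(red, green, blue, red_shift, green_shift):
    """Shift red and green onto blue, then stack into an (H, W, 3) RGB array."""
    red_aligned = np.roll(red, red_shift, axis=(0, 1))
    green_aligned = np.roll(green, green_shift, axis=(0, 1))
    return np.dstack([red_aligned, green_aligned, blue])

def to_uint8(img):
    """Convert a float image in [0, 1] to 8-bit values for an image writer."""
    return (np.clip(img, 0.0, 1.0) * 255).round().astype(np.uint8)
```

For example, `to_uint8(assemble_rgb(r, g, b, (12, 3), (5, 2)))` would produce an 8-bit array that any standard image writer can save.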


Pythonic pseudocode for the final alignment algorithm alongside default parameters.

Sanity Check: Brute Force Alignment

I first sanity check the algorithm by doing a brute force alignment (i.e. densely searching through shifts) for the small cathedral image provided with the starter code. The goal here is not to advocate for this method, but rather to check that brute force search can work in certain conditions (e.g. small images).

This can be easily replicated in my code by setting levels=0 and search_range=50. Alongside the final aligned image, I also report the shift as an [h, w] array.
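For reference, a brute-force search of this kind can be written as a double loop over candidate shifts. This is a sketch under assumptions, not the project's implementation: the function name `brute_force_align` is hypothetical, `search_range=50` is interpreted as trying every shift from -50 to +50 along each axis, and shifts wrap around via `np.roll`.

```python
import numpy as np

def brute_force_align(ref, mov, search_range=50):
    """Try every [h, w] shift in the window and keep the one with the highest NCC."""
    ref_c = ref - ref.mean()
    ref_norm = np.linalg.norm(ref_c)
    best_score, best_shift = -np.inf, [0, 0]
    for dh in range(-search_range, search_range + 1):
        for dw in range(-search_range, search_range + 1):
            cand = np.roll(mov, (dh, dw), axis=(0, 1))
            cand_c = cand - cand.mean()
            # Small epsilon avoids division by zero on constant images.
            score = (ref_c * cand_c).sum() / (ref_norm * np.linalg.norm(cand_c) + 1e-12)
            if score > best_score:
                best_score, best_shift = score, [dh, dw]
    return best_shift
```

Under this interpretation the default window contains 101 x 101 = 10,201 candidate shifts, each needing a full-image NCC, which is fine for a small JPG but far too slow for the full-resolution scans.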


Generated on cathedral.jpg. Red shift is [12, 3] and green is [5, 2]


Alignment on Provided Images

I now align all the provided tif images. I visualize the final generated result and report the shift array using the same format as before.

Notice that most images have artifacts on the borders. This is expected, since the raw data had misaligned borders and I don't bother cropping them for this visualization. Other than that expected border defect, the alignments are all high quality. I did notice some slight misalignment on heads in the emir and harvesters images, though it is not obvious until you zoom in. I believe the emir case is a genuine misalignment (probably because the pixel values differ so much across the R/G/B channels for that image), but I believe the artifacts in harvesters occurred because the girls moved their heads while the pictures were being taken. Otherwise, I don't see errors other than some speckle noise.


emir.tif. Red shift: [103, 57], Green shift: [49, 24]

harvesters.tif. Red shift: [124, 14], Green shift: [60, 17]

icon.tif. Red shift: [89, 23], Green shift: [41, 17]

lady.tif. Red shift: [117, 12], Green shift: [55, 9]

self_portrait.tif. Red shift: [176, 37], Green shift: [79, 29]

village.tif. Red shift: [138, 22], Green shift: [65, 12]

train.tif. Red shift: [87, 32], Green shift: [42, 6]

turkmen.tif. Red shift: [116, 28], Green shift: [56, 21]

three_generations.tif. Red shift: [111, 11], Green shift: [53, 14]

Alignment on Other Images from LoC

Finally, I downloaded some extra image panes from the Library of Congress website, aligned them with my algorithm, and display the results below. Browsing through the full collection, I was struck by the incredible breadth and quality of the images Prokudin-Gorskii took given the limited resources he had. I'm also pleased with my algorithm's performance on unseen images :)

Kamnecherpatelʹnai︠a︡ mashina v kanali︠e︡

Di︠e︡vushka s zemli︠a︡nikoĭ.

Na rubki︠e︡ parokhoda "Sheksna" M.P.S.