In this project, I worked with a collection of historical grayscale images taken by Sergei Mikhailovich Prokudin-Gorskii. Each image consists of three separate channels (Red, Green, and Blue) that were captured individually and must be aligned to reconstruct a full color image. Since these channels may have undergone transformations such as translation or slight distortions, the challenge is to overlay them accurately while minimizing artifacts like color fringing. To achieve this, I implemented an automated alignment pipeline that aligns the Green and Red channels to the Blue channel using image similarity metrics. For small images, I performed an exhaustive search within a limited displacement window, while for larger images, I used a multi-resolution image pyramid to refine the alignment efficiently at different scales. Additionally, I incorporated adaptive cropping to remove unwanted borders that could interfere with alignment, yielding a cleaner final result. The end goal was to produce a well-registered, visually accurate color image while handling variations in exposure, contrast, and misalignment. Let's begin.
For low-resolution images, I implemented a single-scale exhaustive search to align the green and red channels to the blue channel. This approach assumes that the primary misalignment is a pure translation, without rotations or other distortions.
1️⃣ Exhaustive Search over a Fixed Window:
To align the images, I iterated over a window of [-15, 15] pixels in both the horizontal and vertical directions, evaluating (2 × 15 + 1)² = 961 candidate shifts for each channel.
2️⃣ Using NCC for Matching: The best alignment was chosen using the Normalized Cross-Correlation (NCC) metric, which on visual inspection I found to be more robust than the Sum of Squared Differences (SSD), especially when scaling to high-resolution images.
3️⃣ Adaptive Cropping:
Some images, such as village.tif, had misaligned borders that disrupted the NCC calculation. To address this, I cropped out 10% of the smallest image dimension from all sides before computing alignment. This ensured that only the central portion of the image was used for computing NCC, which made the alignment noticeably more accurate.
4️⃣ Aligning Green and Red to Blue: Once preprocessed, I aligned Green (G) and Red (R) to Blue (B) using the exhaustive search method.
This method worked well for small images, but for larger images, it was too slow. To address this, I implemented image pyramids to speed up alignment without sacrificing accuracy.
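The single-scale procedure above can be sketched as follows. This is a minimal illustration, not the exact implementation: the function names are my own, and circular `np.roll` shifts stand in for whatever border handling the real pipeline uses.

```python
import numpy as np

def crop_borders(img, frac=0.10):
    """Drop frac of the smallest image dimension from every side."""
    m = int(frac * min(img.shape[:2]))
    return img[m:img.shape[0] - m, m:img.shape[1] - m]

def ncc(a, b):
    """Normalized cross-correlation between two equally sized images."""
    a, b = a - a.mean(), b - b.mean()
    return float(np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def align_exhaustive(channel, reference, window=15):
    """Return the (dy, dx) shift in [-window, window]^2 that maximizes
    NCC between the shifted channel and the reference (blue) channel."""
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            score = ncc(np.roll(channel, (dy, dx), axis=(0, 1)), reference)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

Green and red are each aligned against blue by searching on the cropped channels (e.g. `align_exhaustive(crop_borders(g), crop_borders(b))`) and applying the resulting shift to the full-resolution channel.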
Image: Cathedral
Green Offset: (5, 2)
Red Offset: (12, 3)
Image: Village (without cropping)
Green Offset: (12, 65)
Red Offset: (26, 75)
Image: Village (with cropping)
Green Offset: (12, 64)
Red Offset: (22, 137)
For high-resolution images, we perform alignment across multiple scales: instead of aligning the full-resolution images directly, we first align lower-resolution versions and progressively refine the results at higher resolutions.
1️⃣ Multi-Scale Alignment Approach:
Each pyramid level consists of the image scaled down by a factor of 2. We start with the smallest resolution (coarsest level) and progressively work towards the original resolution (finest level). The displacement computed at each level is scaled up as we move to higher resolutions.
2️⃣ Optimizing the Search Window:
• At coarser levels (the smallest images), we use a larger search window of [-15, 15] pixels to find an approximate alignment.
• As we refine the result at finer levels, the search window is reduced to [-5, 5] pixels, which makes the search substantially faster.
3️⃣ NCC for Robust Alignment:
As in our single-scale approach, we use NCC instead of SSD, since it is more robust to variations in brightness.
4️⃣ Adaptive Cropping:
As in the single-scale approach, I cropped out 10% of the smallest image dimension from all sides before computing alignment, since misaligned borders (as in village.tif) disrupted the NCC calculation. Please refer to the aforementioned village example.
5️⃣ Aligning Green and Red to Blue: Once preprocessed, I aligned Green (G) and Red (R) to Blue (B) using the image pyramid approach.
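The coarse-to-fine recursion can be sketched as below. It is a self-contained illustration under my own naming, using simple subsampling for the factor-of-2 downscale (a blur-then-subsample would also work) and circular `np.roll` shifts; the base-case size of 64 pixels is an assumed threshold, not a value from the report.

```python
import numpy as np

def _ncc(a, b):
    """Normalized cross-correlation between two equally sized images."""
    a, b = a - a.mean(), b - b.mean()
    return float(np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def _search(channel, reference, window):
    """Exhaustive NCC search over circular shifts in [-window, window]^2."""
    shifts = ((dy, dx) for dy in range(-window, window + 1)
                       for dx in range(-window, window + 1))
    return max(shifts, key=lambda s: _ncc(np.roll(channel, s, axis=(0, 1)), reference))

def align_pyramid(channel, reference, window=15):
    """Coarse-to-fine alignment via a factor-of-2 image pyramid."""
    if min(channel.shape) <= 64:          # coarsest level: wide search
        return _search(channel, reference, window)
    # Recurse on half-resolution images, then double the estimated shift
    dy, dx = (2 * v for v in align_pyramid(channel[::2, ::2], reference[::2, ::2], window))
    # Refine with the reduced [-5, 5] window at this finer level
    rdy, rdx = _search(np.roll(channel, (dy, dx), axis=(0, 1)), reference, 5)
    return (dy + rdy, dx + rdx)
```

At each level only a small neighborhood of the doubled coarse estimate is searched, which is what keeps the full-resolution pass cheap.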
Image: Emir
Green Offset: (24, 49)
Red Offset: (55, 103)
Image: Icon
Green Offset: (17, 41)
Red Offset: (23, 89)
Image: Train
Green Offset: (5, 42)
Red Offset: (32, 87)
Image: Lady
Green Offset: (9, 51)
Red Offset: (11, 112)
Image: Harvesters
Green Offset: (16, 59)
Red Offset: (13, 124)
Image: Three Generations
Green Offset: (14, 53)
Red Offset: (11, 112)
Image: Turkmen
Green Offset: (21, 56)
Red Offset: (28, 116)
Image: Village
Green Offset: (12, 64)
Red Offset: (22, 137)
Image: V oranzherei︠e︡
Green Offset: (28, 59)
Red Offset: (34, 126)
Image: V imi︠e︡nīi. Danīi︠a︡
Green Offset: (-12, 67)
Red Offset: (-28, 118)
Image: Gruppa. (I︠A︡ s dvumi︠a︡, Murman)
Green Offset: (8, 33)
Red Offset: (15, 113)
Image: V Alupki︠e︡. Krym
Green Offset: (-11, 33)
Red Offset: (-27, 140)
Instead of traditional intensity rescaling, which maps the darkest pixel to zero and the brightest pixel to one, I applied a sine-based contrast adjustment. This method enhances contrast non-linearly by transforming pixel intensities with a sine function. The transformation amplifies midtones while keeping extreme values within the valid range, producing richer, more visually appealing colors. I found this approach more effective at countering the pale appearance of the images without introducing harsh artifacts.
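One sine curve with these properties is sketched below; the exact mapping the project used is not specified, so treat this as an illustrative assumption. It pins 0 to 0 and 1 to 1, has slope π/2 at mid-gray (boosting midtone contrast), and flattens at the extremes so highlights and shadows are not clipped harshly.

```python
import numpy as np

def sine_contrast(img):
    """Non-linear contrast boost for intensities t in [0, 1]:
    t -> 0.5 * (1 + sin(pi * (t - 0.5))).
    Maps 0 -> 0, 0.5 -> 0.5, 1 -> 1; steepest around the midtones,
    flat near the extremes, so output stays within [0, 1]."""
    return 0.5 * (1.0 + np.sin(np.pi * (img - 0.5)))
```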
Image Before: Lady
Green Offset: (9, 51)
Red Offset: (11, 112)
Image After: Lady
Green Offset: (9, 51)
Red Offset: (11, 112)
Image Before: Icon
Green Offset: (17, 41)
Red Offset: (23, 89)
Image After: Icon
Green Offset: (17, 41)
Red Offset: (23, 89)
Image Before: Three Generations
Green Offset: (14, 53)
Red Offset: (11, 112)
Image After: Three Generations
Green Offset: (14, 53)
Red Offset: (11, 112)
Image Before: Harvesters
Green Offset: (16, 59)
Red Offset: (13, 124)
Image After: Harvesters
Green Offset: (16, 59)
Red Offset: (13, 124)
Image Before: Emir
Green Offset: (24, 49)
Red Offset: (55, 103)
Image After: Emir
Green Offset: (24, 49)
Red Offset: (55, 103)
I implemented a better color mapping technique to address the assumption that the red, green, and blue photographic plates used by Prokudin-Gorskii correspond directly to the modern RGB color space.
Since the spectral sensitivities of these plates likely differed, a direct mapping often led to inaccurate color reproduction. To improve this, I applied a transformation matrix that adjusted the intensity relationships between the channels, creating a more realistic color representation. It helped balance the colors and reduced the need for additional white balancing.
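Mechanically, such a correction is a 3×3 channel-mixing matrix applied per pixel. The sketch below shows the mechanism only; the example matrix is hypothetical (the report's actual coefficients are not given), chosen with rows summing to 1 so that neutral grays are preserved.

```python
import numpy as np

def apply_color_matrix(img, M):
    """Apply a 3x3 channel-mixing matrix M to an H x W x 3 image in [0, 1],
    clipping the result back into the valid range."""
    out = img.reshape(-1, 3) @ M.T
    return np.clip(out, 0.0, 1.0).reshape(img.shape)

# Hypothetical example: mild cross-channel mixing with rows summing to 1,
# so gray pixels (r = g = b) are left unchanged.
M = np.array([[0.90, 0.08, 0.02],
              [0.05, 0.90, 0.05],
              [0.02, 0.08, 0.90]])
```

Tuning the row of M that produces the output red channel is exactly the knob involved in the overcorrection described next.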
Initially, the transformation matrix introduced an unintended red tint to the images, which made them appear overly warm. This issue arose due to the slight overcorrection in the red channel, making it dominant in the final composition.
Shown in the train figure below:
Image Before: Train
Green Offset: (5, 42)
Red Offset: (32, 87)
Image After: Train
Green Offset: (5, 42)
Red Offset: (32, 87)
Image Before: Lady
Green Offset: (9, 51)
Red Offset: (11, 112)
Image After: Lady
Green Offset: (9, 51)
Red Offset: (11, 112)
Image Before: Icon
Green Offset: (17, 41)
Red Offset: (23, 89)
Image After: Icon
Green Offset: (17, 41)
Red Offset: (23, 89)
Image Before: Three Generations
Green Offset: (14, 53)
Red Offset: (11, 112)
Image After: Three Generations
Green Offset: (14, 53)
Red Offset: (11, 112)
Image Before: Harvesters
Green Offset: (16, 59)
Red Offset: (13, 124)
Image After: Harvesters
Green Offset: (16, 59)
Red Offset: (13, 124)
Image Before: Emir
Green Offset: (24, 49)
Red Offset: (55, 103)
Image After: Emir
Green Offset: (24, 49)
Red Offset: (55, 103)