Colorizing the Prokudin-Gorskii Photo Collection

In this assignment, the goal is to align the three images from the B, G and R channels and combine them into a single colorized image. Here is a brief overview of my method.

Overview

  1. Implement a convolution method and use Sobel filters to extract edge features.
  2. Crop a margin from each image border to simplify alignment.
  3. Align the source image to the target image by searching over a fixed window of translations.
  4. Use the sum of squared differences (SSD) to rank the candidate alignments.
  5. Use the best alignments of Green to Blue and Red to Blue, then stack the channels to obtain the color image.

I go into more detail in the subsequent code blocks.
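
Steps 2 and 5 can be sketched as follows. This is a minimal illustration rather than the code used for the results: the helper names `crop_margin`, `shift` and `stack_rgb` and the 10% margin are assumptions, and translations are applied with a wrap-around `np.roll`.

```python
import numpy as np

def crop_margin(img, frac=0.1):
    """Drop a fraction of the height and width from every side (frac is an assumed value)."""
    h, w = img.shape
    dh, dw = int(h * frac), int(w * frac)
    return img[dh:h - dh, dw:w - dw]

def shift(img, dy, dx):
    """Translate by (dy, dx) pixels; pixels pushed past one border wrap to the other."""
    return np.roll(img, (dy, dx), axis=(0, 1))

def stack_rgb(b, g, r, g_shift, r_shift):
    """Apply the chosen G-to-B and R-to-B translations and stack the channels in RGB order."""
    return np.dstack([shift(r, *r_shift), shift(g, *g_shift), b])
```

Blue acts as the fixed target here, so only the green and red channels are moved before stacking.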

Bells and Whistles

I mainly implemented two extras.

  1. Partial use of PyTorch to run the convolution faster.
  2. Implementation of edge features to improve alignment results.

Results

All the results are displayed below. Here is a demonstration of successful and failed alignments.

Successful Alignments

[Images: four successful alignments]

Failed Alignments

[Images: two failed alignments]

Likely reason for the failures: the background texture is complex and contains many edges, so an alignment based on edge features may be thrown off by the edges in the background.

Below is a from-scratch implementation of the convolution operation. For simplicity, the implementation does not flip the kernel, so strictly speaking it computes a cross-correlation. This method is used for edge detection.
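
A minimal sketch of such a convolution is shown here. It is an illustration, not the original cell: the name `conv2d`, the zero padding, the same-size output and the restriction to odd-sized kernels are all assumptions.

```python
import numpy as np

def conv2d(img, kernel):
    """Same-size 2D filtering with zero padding. The kernel is not flipped,
    so this is a cross-correlation. Assumes an odd-sized kernel."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), mode="constant")
    out = np.zeros(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            # Weighted sum of the window centred on pixel (i, j).
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T
```

The double Python loop visits every pixel, which is why a version like this becomes slow on large images.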

This is the align function for a single alignment. Because the image is already normalized to the range [0, 1], no normalization is involved in this method. The method first applies Sobel filters in both directions to detect edges. This gives an edge feature image of the same size, in which each pixel holds the intensity of the edge (direction information is not considered here). It then searches over translations within a window of size 15 and compares similarity. The method uses the sum of squared differences (SSD) as the similarity measure and picks the translation with the smallest SSD.
Note that the code here uses PyTorch convolution functions. The commented-out section uses my own implementation of convolution, which gives numerically the same results as the PyTorch version. However, my implementation takes a very long time to run on large .tif images.
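
The search described above can be sketched as follows. To keep the example self-contained it computes the Sobel responses with NumPy slicing instead of PyTorch, and it applies translations with `np.roll`. The signature of `align_once`, the helper `sobel_magnitude`, the replicated-border padding and the choice to ignore a border of `window` pixels when scoring are assumptions, not details taken from the original code.

```python
import numpy as np

def sobel_magnitude(img):
    """Edge intensity from horizontal and vertical Sobel responses (direction discarded)."""
    p = np.pad(img, 1, mode="edge")
    # Horizontal gradient: right column minus left column, weighted 1-2-1 vertically.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
        - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    # Vertical gradient: bottom row minus top row, weighted 1-2-1 horizontally.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
        - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.sqrt(gx ** 2 + gy ** 2)

def align_once(source, target, window=15):
    """Return the (dy, dx) shift of `source` that minimises the SSD between edge images.
    Both images must be larger than 2 * window in each dimension."""
    src_e, tgt_e = sobel_magnitude(source), sobel_magnitude(target)
    m = window  # ignore a border this wide so wrapped-around pixels are never scored
    best_ssd, best_shift = np.inf, (0, 0)
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = np.roll(src_e, (dy, dx), axis=(0, 1))
            ssd = np.sum((shifted[m:-m, m:-m] - tgt_e[m:-m, m:-m]) ** 2)
            if ssd < best_ssd:
                best_ssd, best_shift = ssd, (dy, dx)
    return best_shift
```

The returned shift is the translation to apply to the source channel so that it lines up with the target channel.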

Below is the pyramid alignment method. It is a simple recursive wrapper around the align_once function: it searches the range [-15, 15] at the coarsest level and the range [-1, 1] at each subsequent level.
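
A rough sketch of this coarse-to-fine recursion is given here. Several details are assumptions made for illustration: the names `ssd_search`, `downsample` and `pyramid_align`, the 2x2 block averaging used to halve the resolution, and the doubling of the coarse estimate before each [-1, 1] refinement. For brevity the sketch scores raw intensities; Sobel edge features could be computed on the inputs before the search.

```python
import numpy as np

def ssd_search(source, target, center, radius):
    """Best (dy, dx) within `radius` of `center` by SSD, skipping a border that may wrap."""
    m = radius + max(abs(center[0]), abs(center[1])) + 1
    if 2 * m >= min(source.shape):
        raise ValueError("image too small for this search range")
    best_ssd, best_shift = np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            shifted = np.roll(source, (dy, dx), axis=(0, 1))
            ssd = np.sum((shifted[m:-m, m:-m] - target[m:-m, m:-m]) ** 2)
            if ssd < best_ssd:
                best_ssd, best_shift = ssd, (dy, dx)
    return best_shift

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (an assumed, simple scheme)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def pyramid_align(source, target, levels, coarse_radius=15, fine_radius=1):
    """Recursive coarse-to-fine alignment; `levels` is the number of downsampling steps."""
    if levels == 0:
        # Coarsest level: full search around zero displacement.
        return ssd_search(source, target, (0, 0), coarse_radius)
    dy, dx = pyramid_align(downsample(source), downsample(target),
                           levels - 1, coarse_radius, fine_radius)
    # One coarse pixel spans two pixels here, so double the estimate, then refine.
    return ssd_search(source, target, (2 * dy, 2 * dx), fine_radius)
```

With the default radii, the total displacement that can be recovered at full resolution grows roughly as 15 times 2 to the power of `levels`, while each finer level only evaluates nine candidate shifts.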