16-726 SP22 Assignment #1 - Colorizing the Prokudin-Gorskii Photo Collection Report

Run


python main.py
python main.py --auto_crop

Put the input images in the /data folder; results are written to the /output folder. Create both folders before running.
PyTorch is required.

Goal

This project aims to take the digitized Prokudin-Gorskii glass plate images and automatically produce a color image with as few visual artifacts as possible.

Works

My implementation consists of the following main steps.

Preprocess

Split the scanned plate into three images, one per channel (blue, green, and red).
For exhaustive search, roll one channel against the base channel and keep the offset with the maximum NCC. Then stack the aligned channels into one color image.
For the image pyramid, build a pyramid with a reasonable number of levels and run an exhaustive search within a small window at each level.
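
The channel split in the first step can be sketched as follows. This is a minimal illustration, not the code in main.py: it assumes the plate is a single grayscale array with the blue, green, and red exposures stacked top to bottom, and the name split_channels is hypothetical.

```python
import numpy as np

def split_channels(plate):
    """Split a vertically stacked plate into (blue, green, red) thirds.

    Assumption: the exposures are stacked top to bottom as B, G, R.
    """
    height = plate.shape[0] // 3  # drop leftover rows so all parts match
    blue = plate[:height]
    green = plate[height:2 * height]
    red = plate[2 * height:3 * height]
    return blue, green, red
```

Integer division keeps the three channels the same size even when the plate height is not a multiple of three.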

Approach 1: Exhaustive Search

Implemented in the findBestNCC function. (It works, but it was not used to generate the final results.)

Fix one channel as the base (blue in my implementation). Use np.roll to shift each other channel along the X and Y axes within a [-15, 15] pixel range.
Since the borders may contain pixels that hurt the match, compute the NCC over the central region only.
Record the offset (x, y) with the maximum NCC. (I implemented two metrics at first. The report uses NCC because it was better and faster on most pictures, so the SSD version is not included in the code.)
Stack the rolled channels into the final image.
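
The steps above can be sketched roughly as below. This is an illustrative version rather than the actual findBestNCC code: the helper names (ncc, center, exhaustive_align) and the 10% crop margin are assumptions made for the example. Only np.roll and the [-15, 15] range come from the description.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def center(img, margin=0.1):
    """Keep the central region; the margin fraction is an assumed value."""
    h, w = img.shape
    dy, dx = int(h * margin), int(w * margin)
    return img[dy:h - dy, dx:w - dx]

def exhaustive_align(base, moving, radius=15):
    """Return the (x, y) roll of `moving` that maximizes NCC with `base`."""
    best_score, best_offset = -np.inf, (0, 0)
    base_c = center(base)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = ncc(base_c, center(shifted))
            if score > best_score:
                best_score, best_offset = score, (dx, dy)
    return best_offset
```

With blue as the base, exhaustive_align(blue, green) would give the (x, y) roll to apply to the green channel before stacking.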

Approach 2: Image Pyramid

Build the image pyramid with anti-aliasing enabled. Stop once the image is small enough (<80 px in my implementation).
Start from the top (coarsest level) of the pyramid. Find the best offset (x, y) within a small window, then double it when moving down a level, because each level is scaled by a factor of 2.
When the search reaches the bottom of the pyramid, the full-resolution offset is known. This needs only about pyramid_levels * small_window_size^2 NCC evaluations instead of large_window_size^2.
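
A coarse-to-fine search along these lines might look like the sketch below. It is not the PyTorch implementation from this assignment: 2x2 block averaging stands in for the anti-aliased rescale, the center crop is omitted for brevity, and all function names and the radius of 4 are assumptions. Only the 80 px stopping size and the doubling of the offset come from the description.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def search_window(base, moving, guess, radius):
    """Best (x, y) roll within +/- radius of `guess`, scored by NCC."""
    gx, gy = guess
    best_score, best = -np.inf, guess
    for dy in range(gy - radius, gy + radius + 1):
        for dx in range(gx - radius, gx + radius + 1):
            score = ncc(base, np.roll(moving, (dy, dx), axis=(0, 1)))
            if score > best_score:
                best_score, best = score, (dx, dy)
    return best

def halve(img):
    """Downscale by 2 using 2x2 block averaging (a simple low-pass stand-in)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def pyramid_align(base, moving, min_size=80, radius=4):
    """Coarse-to-fine alignment; returns the full-resolution (x, y) roll."""
    levels = [(base, moving)]
    # Keep halving until the smaller side drops below min_size.
    while min(levels[-1][0].shape) >= min_size:
        b, m = levels[-1]
        levels.append((halve(b), halve(m)))
    offset = (0, 0)
    for i, (b, m) in enumerate(reversed(levels)):
        if i > 0:
            # One level down doubles the resolution, so double the offset.
            offset = (2 * offset[0], 2 * offset[1])
        offset = search_window(b, m, offset, radius)
    return offset
```

For a 256 px image this builds three levels, so a radius of 4 evaluates 3 * 9^2 = 243 shifts, most of them on downscaled images, while still reaching offsets of up to 4*4 + 4*2 + 4 = 28 px at full resolution.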

Bells & Whistles

Alignment with gradients (2 pts). A Sobel filter produces a gradient image, which is then fed to the normal alignment process. This helps on difficult pictures where raw pixel values alone do not work, and it works quite well on the emir picture, where NCC on raw pixels fails.
PyTorch implementation (2 pts). The numpy arrays are converted to torch.Tensor, and rescaling and score computation are done in PyTorch. Using torch built-in functions makes the code faster, and converting back to numpy is convenient.
Auto-cropping (2 pts). Gradients serve as the edge detector. Sum the gradient values of every row/column and pick the row/column with the maximum sum within a search range. A few images are cropped poorly on some sides because white-border cropping is not implemented and the border exceeds the search range. For most images it works quite well.
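
The gradient and auto-cropping ideas can be illustrated with the NumPy sketch below. It is not the code used for the results: the function names are hypothetical, the Sobel filter is written out by hand instead of using a library call, and the 10% search band is an assumed value for the "range" mentioned above.

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude from 3x3 Sobel kernels, with edge padding."""
    p = np.pad(img.astype(np.float64), 1, mode="edge")
    # Horizontal derivative: right column minus left column, weights 1-2-1.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    # Vertical derivative: bottom row minus top row, weights 1-2-1.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return np.hypot(gx, gy)

def find_crop(img, band=0.1):
    """Per side, pick the row/column with the largest gradient sum.

    Only the outer `band` fraction of each side is searched (assumed value).
    Returns (top, bottom, left, right) so that img[top:bottom, left:right]
    is the cropped image.
    """
    grad = sobel_magnitude(img)
    h, w = img.shape
    rows, cols = grad.sum(axis=1), grad.sum(axis=0)
    bh, bw = max(1, int(h * band)), max(1, int(w * band))
    top = int(np.argmax(rows[:bh]))
    bottom = h - bh + int(np.argmax(rows[h - bh:]))
    left = int(np.argmax(cols[:bw]))
    right = w - bw + int(np.argmax(cols[w - bw:]))
    return top, bottom, left, right
```

For gradient-based alignment, sobel_magnitude would be applied to each channel before the usual NCC search. A border wider than the search band would be missed by find_crop, which matches the failure case described above.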

Results

Aligned	Aligned (Cropped)	Offsets (x, y)
		G2,5 R3,12
		G23,49 R40,107
		G17,60 R13,123
		G17,42 R23,90
		G9,56 R13,120
		G29,78 R37,176
		G12,54 R8,111
		G1,41 R29,85
		G22,57 R28,117
		G10,64 R21,137
