Assignment #1 - Colorizing the Prokudin-Gorskii Photo Collection
Name : Divam Gupta
Andrew ID : divamg
Introduction
The aim of this homework is to take the digitized Prokudin-Gorskii glass plate images and colorize them. The color channels are not well aligned, so we use a pyramid search technique to find the optimal alignment.
Methodology
Naive Search
The easiest way to find the alignment between the channels is the brute-force approach, where we search over all possible alignments within a window. However, this approach is not feasible for very large images, because they need a correspondingly large search window.
From all possible alignments, we choose the one that matches best. An easy way to measure the match is the sum of squared differences (SSD), where a lower value means a better match.
SSD(im1, im2) = sum((im1 - im2)^2)
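The brute-force search described above can be sketched as follows. This is a minimal illustration, not the code used for the report: the function names, the default window size, and the use of `np.roll` (which wraps pixels around the border instead of cropping) are all assumptions.

```python
import numpy as np

def ssd(im1, im2):
    """Sum of squared differences between two equally sized images."""
    diff = im1.astype(np.float64) - im2.astype(np.float64)
    return float(np.sum(diff ** 2))

def naive_search(reference, target, window=15):
    """Try every (dx, dy) shift in [-window, window]; keep the lowest SSD."""
    best_offset, best_score = (0, 0), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            # Assumption: wrap-around shifting keeps the sketch short.
            shifted = np.roll(target, shift=(dy, dx), axis=(0, 1))
            score = ssd(reference, shifted)
            if score < best_score:
                best_offset, best_score = (dx, dy), score
    return best_offset
```

The cost grows with the square of the window size, which is why this search alone is impractical for the large .tif scans.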
Preprocessing filters
Computing the difference between raw RGB pixels is not a good idea, because the channels can have very different intensities. A better idea is to preprocess the images with filters whose output is largely invariant to the color channel.
Applying a Laplacian filter to the images and then searching usually yields much better results than searching on raw pixels.
The Laplacian is computed by applying the following filter:
Here is an example of the Laplacian filter applied:
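The kernel from the original figure is not reproduced in this text, so the sketch below assumes the common 3x3 four-neighbour Laplacian kernel [[0, 1, 0], [1, -4, 1], [0, 1, 0]] with replicated borders. It is only meant to show why the filter helps: because the kernel sums to zero, a constant brightness offset between channels disappears from the output.

```python
import numpy as np

def laplacian(img):
    """Apply the 4-neighbour Laplacian kernel (assumed, see lead-in)."""
    img = img.astype(np.float64)
    padded = np.pad(img, 1, mode="edge")  # replicate border pixels
    return (
        padded[:-2, 1:-1] + padded[2:, 1:-1]     # pixel above + pixel below
        + padded[1:-1, :-2] + padded[1:-1, 2:]   # pixel left + pixel right
        - 4.0 * img                              # centre pixel
    )

# A brighter copy of the same image gives the same response:
# laplacian(img + 10) equals laplacian(img).
```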
Image Pyramid
To save computation time, it is better to find the alignment at a lower resolution and then refine it at higher resolutions. The number of pyramid levels is computed as
n_levels = log(image.shape[0], 2) - 8
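A small sketch of this formula follows. The text does not say how the logarithm is rounded, so truncation to an integer is an assumption here, as are the helper name and the clamp to at least one level for small images.

```python
import math

def num_pyramid_levels(height, min_levels=1):
    """floor(log2(height)) - 8, clamped to at least min_levels (assumption)."""
    return max(min_levels, int(math.log2(height)) - 8)

# Hypothetical image heights, for illustration only:
# num_pyramid_levels(1024) gives 2, num_pyramid_levels(9000) gives 5.
```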
Algorithm
Here is the pseudocode of the alignment algorithm. We align both the red and green channels to the blue channel.
1) Resize the images to multiple pyramid levels
previous_alignment = (0, 0)
For each level l, from coarsest to finest:
    img1 <- reference image at level l
    img2 <- target image at level l
    img2 = shift(img2, previous_alignment * 2)
    pre1 = laplacian(img1)
    pre2 = laplacian(img2)
    best_alignment = search(pre1, pre2)
    previous_alignment = previous_alignment * 2 + best_alignment
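The coarse-to-fine loop above can be sketched in NumPy as follows. Several details are assumptions made to keep the example short and self-contained: images are halved by 2x2 block averaging, borders are treated as periodic (`np.roll` and wrap padding), and the helper names and the per-level search window are hypothetical. The original implementation may differ on all of these.

```python
import numpy as np

def laplacian(img):
    """4-neighbour Laplacian with periodic borders (assumed kernel)."""
    p = np.pad(img.astype(np.float64), 1, mode="wrap")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
            - 4.0 * p[1:-1, 1:-1])

def downscale(img):
    """Halve the resolution by averaging 2x2 blocks (crops odd edges)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w].astype(np.float64)
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(ref, tgt, window):
    """Exhaustive SSD search; returns the best shift as array([dx, dy])."""
    best, best_score = np.zeros(2, dtype=int), np.inf
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            score = np.sum((ref - np.roll(tgt, (dy, dx), axis=(0, 1))) ** 2)
            if score < best_score:
                best, best_score = np.array([dx, dy]), score
    return best

def pyramid_align(ref, tgt, n_levels, window=2):
    """Return the (dx, dy) shift that aligns tgt to ref, coarse to fine."""
    refs, tgts = [ref], [tgt]
    for _ in range(n_levels - 1):
        refs.append(downscale(refs[-1]))
        tgts.append(downscale(tgts[-1]))
    offset = np.zeros(2, dtype=int)
    for r, t in zip(reversed(refs), reversed(tgts)):  # coarsest level first
        offset = offset * 2  # rescale the estimate from the previous level
        t = np.roll(t, (offset[1], offset[0]), axis=(0, 1))
        offset = offset + search(laplacian(r), laplacian(t), window)
    return offset
```

Because each level only searches a small window around the rescaled estimate, the total work stays small even when the final offset at full resolution is large.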
Here is an example of multi-level matching on one of the samples:
Level = 1
Level = 2
Level = 3
Results
Here are the number of pyramid levels and the alignment offsets for the given images.
The alignment offsets are in the format [x_green, y_green, x_red, y_red].
data/cathedral.jpg :
No of pyramid levels 2
offsets 2 5 3 12
-------
data/emir.tif :
No of pyramid levels 5
offsets 24 49 533 -501
-------
data/three_generations.tif :
No of pyramid levels 5
offsets 13 52 9 110
-------
data/train.tif :
No of pyramid levels 5
offsets 8 42 33 86
-------
data/icon.tif :
No of pyramid levels 5
offsets 18 41 23 90
-------
data/village.tif :
No of pyramid levels 5
offsets 12 65 22 137
-------
data/self_portrait.tif :
No of pyramid levels 5
offsets 29 77 37 175
-------
data/harvesters.tif :
No of pyramid levels 5
offsets 18 60 15 124
-------
data/lady.tif :
No of pyramid levels 5
offsets 9 54 9 116
-------
data/turkmen.tif :
No of pyramid levels 5
offsets 22 57 29 117
-------
We can see that the alignment fails only on the Emir image (note its red offsets of 533 and -501), because its channels differ greatly from one another.
Results on other images :
data/master-pnp-prok-00400-00451a.tif :
No of pyramid levels 5
offsets -17 43 -35 102
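As a sketch of how offsets in the [x_green, y_green, x_red, y_red] format could be applied to build the final color image, the hypothetical helper below shifts the green and red channels onto the blue channel and stacks them. The sign convention (positive x moves right, positive y moves down) and the wrap-around shifting are assumptions; the report does not state them.

```python
import numpy as np

def compose_rgb(blue, green, red, offsets):
    """Shift green and red onto blue, then stack the channels as RGB.

    offsets uses the report's order: [x_green, y_green, x_red, y_red].
    Assumption: positive x shifts right, positive y shifts down.
    """
    xg, yg, xr, yr = offsets
    green = np.roll(green, (yg, xg), axis=(0, 1))
    red = np.roll(red, (yr, xr), axis=(0, 1))
    return np.dstack([red, green, blue])
```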
Bells and whistles
Trying Canny edge detection
I tried OpenCV's Canny edge detection on the Emir sample and observed that the problem goes away. The overall algorithm was kept the same; only the Laplacian pre-processing was replaced by Canny edge detection.