16726 Project 1

Zhipeng Bao (zbao)

Overview

This project aims to use image processing techniques to automatically align the digitized Prokudin-Gorskii glass plate images. I first apply the recommended SSD-based alignment to single-scale images. Then I adapt it to large images by using a multiscale pyramid. For the Bells and Whistles, I apply three new algorithms: (1) I use edge features for the alignment rather than the raw image; (2) I apply auto-cropping to the aligned images; (3) I propose a new alignment algorithm based on histograms.

All the code is in main_hw1.py.


How to run the code

1. Select the image by changing the image name index (0~9).

2. Find the region of the code for each question and uncomment that part.

3. Run the code with python3 main_hw1.py.


Q1: Single-scale Image

I use SSD (sum of squared differences) as the alignment metric. To avoid the impact of the borders, I compute it only on the central 90% of the image. See function align(a,b) for details.
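The idea can be sketched as follows. This is a minimal illustration, not the code in main_hw1.py: the function names, the (dy, dx) offset order, the circular shift via np.roll, and the search window of 15 pixels are all assumptions made for the example.

```python
import numpy as np

def central_crop(img, keep=0.9):
    """Return the central `keep` fraction of a 2-D image along both axes."""
    h, w = img.shape
    dh, dw = int(h * (1 - keep) / 2), int(w * (1 - keep) / 2)
    return img[dh:h - dh, dw:w - dw]

def align_ssd_sketch(channel, reference, max_shift=15):
    """Exhaustively search for the (dy, dx) shift of `channel` that
    minimizes the SSD against `reference` on the central region.
    `max_shift` is an assumed window size, not a value from the report."""
    ref = central_crop(reference)
    best_score, best_shift = np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # np.roll wraps pixels around; the central crop hides most of that.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = np.sum((central_crop(shifted) - ref) ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

Calling align_ssd_sketch(g, b) and align_ssd_sketch(r, b) would give the shifts to apply to the green and red channels before stacking them onto blue.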

Results

Original Image.
Aligned Image. Offset: G[1,-1] R[7,-1].

Q2: Multi-Scale TIF images

For the large TIF images, I use the recommended pyramid version. I first halve the image resolution repeatedly until the image is smaller than 256 x 256 pixels, then find the offset (x,y) at this coarsest resolution. Next, I iteratively upsample the image and, at each finer level, search the window [2x-3~2x+3, 2y-3~2y+3] around the doubled offset for the best offset, until the scale index reaches 1. In my experiments, the coarsest images are rescaled to 1/16 of the original images.

See function align_tif(a,b,scale) and align_with_scale(a,b,x_min, x_max, y_min, y_max, scale) for details.
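A coarse-to-fine search of this kind might look like the sketch below. It is only an illustration under stated assumptions: the helper names, the 2x2 block averaging used for downsampling, the circular shift, and the coarse window of 15 pixels are mine, while the 256-pixel stopping size and the +/-3 refinement window follow the description above.

```python
import numpy as np

def ssd_search(a, b, y_range, x_range):
    """Return the (dy, dx) within the given ranges that minimizes the SSD
    between the shifted `a` and `b`, ignoring a 5% margin on each side."""
    h, w = b.shape
    my, mx = h // 20, w // 20
    ref = b[my:h - my, mx:w - mx]
    best, best_shift = np.inf, (0, 0)
    for dy in y_range:
        for dx in x_range:
            shifted = np.roll(a, (dy, dx), axis=(0, 1))
            score = np.sum((shifted[my:h - my, mx:w - mx] - ref) ** 2)
            if score < best:
                best, best_shift = score, (dy, dx)
    return best_shift

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (an assumed stand-in
    for whatever rescaling routine the real code uses)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def align_pyramid_sketch(a, b, min_size=256, coarse_shift=15, refine=3):
    """Coarse-to-fine alignment of `a` onto `b` (same shape assumed)."""
    pyramid = [(a, b)]
    while max(pyramid[-1][0].shape) > min_size:
        pa, pb = pyramid[-1]
        pyramid.append((downsample(pa), downsample(pb)))
    # Exhaustive search at the coarsest level.
    ca, cb = pyramid[-1]
    window = range(-coarse_shift, coarse_shift + 1)
    dy, dx = ssd_search(ca, cb, window, window)
    # Double the offset and refine within +/- `refine` at each finer level.
    for fa, fb in reversed(pyramid[:-1]):
        dy, dx = 2 * dy, 2 * dx
        dy, dx = ssd_search(fa, fb,
                            range(dy - refine, dy + refine + 1),
                            range(dx - refine, dx + refine + 1))
    return dy, dx
```

Because each level only searches a 7 x 7 window around the doubled offset, the cost at full resolution stays small even when the final offset is more than a hundred pixels.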

Results on the provided images:

emir.tif
Original Image.
Aligned Image. Offset: G[49,24] R[0,-200]. Failure case: the person wears a blue coat, so the intensities in the blue channel and the red channel are roughly reversed in that region, which misleads SSD on raw pixels. To solve this problem, we can align on extracted features rather than on the raw images. I solve this issue in Q3 by applying gradient features for the alignment. See that part for the good result.
harvesters.tif
Original Image.
Aligned Image. Offset: G[60,16] R[124,13].
icon.tif
Original Image.
Aligned Image. Offset: G[40,17] R[89,23].
lady.tif
Original Image.
Aligned Image. Offset: G[53,8] R[117,10].
self_portrait.tif
Original Image.
Aligned Image. Offset: G[78,28] R[176,36].
three_generations.tif
Original Image.
Aligned Image. Offset: G[54,11] R[112,9].
train.tif
Original Image.
Aligned Image. Offset: G[43,5] R[87,31].
turkmen.tif
Original Image.
Aligned Image. Offset: G[56,19] R[115,26].
village.tif
Original Image.
Aligned Image. Offset: G[64,11] R[137,21].

Results on extra images:

I downloaded 3 other TIF images from the given collection. The results are shown below.

00800-00893a.tif
Original Image.
Aligned Image. Offset: G[11,17] R[35,18].
00900-00907a.tif
Original Image.
Aligned Image. Offset: G[25,0] R[58,-4].
01000-01037a.tif
Original Image.
Aligned Image. Offset: G[17,9] R[47,9].

Q3: Bells and Whistles

I add three extensions on top of the previous algorithm.

3.1 Alignment with extracted features.

Alignment on raw images is not robust. Instead, we can first extract features from the raw images and then align the processed images. Here I use edge filters to extract the features and realign on the filtered images. I provide two kinds of filters: option 1 for Sobel filters and option 2 for Roberts filters. I list results for both filters on two images in this part.

See function align_filters(a,b,scale,option) for details.
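As a rough illustration of the feature extraction step, the sketch below computes a Sobel gradient magnitude with plain NumPy. The function name and the edge-replicated padding are assumptions for this example; the real align_filters may use a library filter instead.

```python
import numpy as np

def sobel_magnitude_sketch(img):
    """Gradient magnitude from the 3x3 Sobel kernels.
    Edge-replicated padding keeps the output the same size as the input."""
    p = np.pad(img.astype(float), 1, mode="edge")
    # Horizontal derivative: column j+1 minus column j-1, weighted 1-2-1 over rows.
    gx = ((p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:])
          - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2]))
    # Vertical derivative: row i+1 minus row i-1, weighted 1-2-1 over columns.
    gy = ((p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:])
          - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:]))
    return np.hypot(gx, gy)
```

The magnitude is unchanged when the image intensities are inverted, which is why edge maps help in cases like emir.tif where one channel is bright exactly where another is dark. The edge maps of two channels can then be passed to the same offset search used on raw pixels.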

Results

emir.tif
Extracted edges by sobel filter (combined rgb channels).
Aligned Image with sobel filter. Offset: G[49,23] R[107,40].
Extracted edges by roberts filter (r channels).
Aligned Image with roberts filter. Offset: G[49,24] R[107,40].
icon.tif
Extracted edges by sobel filter (combined rgb channels).
Aligned Image with sobel filter. Offset: G[42,17] R[90,23].
Extracted edges by roberts filter (r channels).
Aligned Image with roberts filter. Offset: G[41,17] R[90,23].

3.2 Alignment with histograms.

Inspired by the histogram technique, I devised a new alignment algorithm. The idea is that the overall pixel distribution along each row and column of the image can represent its features. If we first calculate the mean pixel value of each row and column and then align on these mean values, we can get good results and save time as well. So I wrote a function that first calculates the mean pixel value of each row and column of the two channels, then uses correlation to find the offset between the two channels.

Using this algorithm, we can get the aligned results in 2 seconds, which is much faster than the original algorithm. Again, I show two results here.

See function align_histogram(a,b) for details.
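A minimal sketch of the row/column-mean idea is given below. The function names, the mean subtraction before correlating, and the use of np.correlate are assumptions for this example and may differ from align_histogram.

```python
import numpy as np

def profile_shift(p, q):
    """Shift (in samples) to apply to 1-D profile `p` so it best matches `q`,
    taken as the lag with the highest cross-correlation."""
    p = p - p.mean()
    q = q - q.mean()
    # corr[k] = sum_n q[n + k] * p[n]; lags run from -(len(p)-1) to len(q)-1.
    corr = np.correlate(q, p, mode="full")
    lags = np.arange(-(len(p) - 1), len(q))
    return int(lags[np.argmax(corr)])

def align_by_profiles_sketch(a, b):
    """Estimate the (dy, dx) shift of channel `a` onto channel `b` from
    per-row and per-column mean intensities instead of full 2-D SSD."""
    dy = profile_shift(a.mean(axis=1), b.mean(axis=1))  # row means
    dx = profile_shift(a.mean(axis=0), b.mean(axis=0))  # column means
    return dy, dx
```

The speedup comes from reducing each channel to two 1-D profiles, so the search works on vectors of length H and W rather than on H x W images.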

Results

icon.tif
Correlation of G and B channel.
Correlation of R and B channel.
Aligned Image with histogram. Offset: G[42,16] R[91,22].
self_portrait.tif
Correlation of G and B channel.
Correlation of R and B channel.
Aligned Image with histogram. Offset: G[77,26] R[174,34].

3.3 Auto-cropping.

By observing the original images, I found that every digitized image has a black boundary plus a white boundary in each channel. So for auto-cropping, I designed a two-step method: first remove the white boundary, then remove the black one. I again do this by calculating mean pixel values: if the mean pixel value is smaller than 0.15, I treat it as part of the black boundary; if it is larger than 0.85, I treat it as part of the white boundary. I calculate the boundary on the aligned image, separately for the R, G, and B channels, and take the tightest boundary over the three channels as the final boundary.

I show two results here.

See function autocrop(image, thres1, thres2) for details.
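The two-step procedure could be sketched roughly as follows. This assumes the means are taken per row and per column, that pixel values lie in [0, 1], and that the image is an H x W x 3 array; the helper names are hypothetical and the real autocrop may be organized differently.

```python
import numpy as np

def strip_border(means, is_border):
    """Trim leading and trailing entries flagged as border; return (start, stop)."""
    start, stop = 0, len(means)
    while start < stop and is_border(means[start]):
        start += 1
    while stop > start and is_border(means[stop - 1]):
        stop -= 1
    return start, stop

def trim(ch, is_border):
    """Bounds (top, bottom, left, right) after trimming border rows and columns."""
    top, bottom = strip_border(ch.mean(axis=1), is_border)
    left, right = strip_border(ch.mean(axis=0), is_border)
    return top, bottom, left, right

def channel_bounds(ch, black=0.15, white=0.85):
    """Remove the white border first, then the black border inside it."""
    t, b, l, r = trim(ch, lambda m: m > white)
    t2, b2, l2, r2 = trim(ch[t:b, l:r], lambda m: m < black)
    return t + t2, t + b2, l + l2, l + r2

def autocrop_sketch(image, black=0.15, white=0.85):
    """Crop to the tightest bounds found across all channels."""
    bounds = np.array([channel_bounds(image[..., c], black, white)
                       for c in range(image.shape[-1])])
    top, left = bounds[:, 0].max(), bounds[:, 2].max()
    bottom, right = bounds[:, 1].min(), bounds[:, 3].min()
    return image[top:bottom, left:right]
```

Taking the maximum of the top/left bounds and the minimum of the bottom/right bounds keeps only the region that is border-free in every channel.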

Results

icon.tif
self_portrait.tif