The main objective of this project was to take the black-and-white glass plate images captured by Russian photographer Sergei Mikhailovich Prokudin-Gorskii and align and merge them to create color images. In the main part, I illustrate how I align the images and show some results. In Bells and Whistles, I describe the techniques I employed to enhance image quality.
For images of relatively small size, we can align the channels with a brute-force strategy: check every possible translation vector [dy, dx], where dy and dx are integers between -20 and 20 (a range of -15 to 15 is not large enough for some images, e.g., self_portrait.tif). Two simple metrics can determine whether a translation vector is optimal: the sum of squared differences (SSD) and normalized cross-correlation (NCC).
SSD:$$ \sum_i\sum_j\left((I^{(1)}_{i,j}-I_{mean}^{(1)})- (I^{(2)}_{i,j}-I_{mean}^{(2)})\right)^2 $$ NCC:$$\frac{\sum_i\sum_j(I^{(1)}_{i,j}-I_{mean}^{(1)})\cdot (I^{(2)}_{i,j}-I_{mean}^{(2)})} {\sqrt{\sum_i\sum_j(I^{(1)}_{i,j}-I_{mean}^{(1)})^2}\cdot \sqrt{\sum_i\sum_j(I^{(2)}_{i,j}-I_{mean}^{(2)})^2}} $$
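Below is a minimal sketch of this brute-force search, assuming the channels are equal-sized float arrays; the function and helper names (`align_brute_force`, `crop_interior`) are my own, and the 10% interior crop anticipates the border-cropping step described next.

```python
import numpy as np

def ssd(a, b):
    # Zero-mean sum of squared differences (lower is better).
    a = a - a.mean()
    b = b - b.mean()
    return np.sum((a - b) ** 2)

def ncc(a, b):
    # Normalized cross-correlation (higher is better).
    a = a - a.mean()
    b = b - b.mean()
    return np.sum(a * b) / (np.sqrt(np.sum(a ** 2)) * np.sqrt(np.sum(b ** 2)))

def crop_interior(img, frac=0.1):
    # Keep only the central region so noisy borders do not bias the metric.
    h, w = img.shape
    dh, dw = int(h * frac), int(w * frac)
    return img[dh:h - dh, dw:w - dw]

def align_brute_force(channel, reference, radius=20):
    # Exhaustively test every integer shift [dy, dx] in [-radius, radius].
    best_score, best_shift = -np.inf, (0, 0)
    ref = crop_interior(reference)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(crop_interior(shifted), ref)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```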
In addition, before comparing the images to be aligned, I manually cropped a fixed fraction (10%) of each border, i.e., only the central part of the image is taken into consideration. This noticeably improves performance, as it rules out most of the influence of the image borders.
For large images, I recursively downscale the image to half its width and height, stack the resized images into a pyramid, and search for the optimal translation vector from the top (smallest image) to the bottom (largest). After obtaining the optimal translation vector [dy, dx] at the n-th level, at the (n+1)-th level we only need to search dy over [2*dy-3, 2*dy+3], and similarly for dx. This significantly reduces the computational cost. Here I set the height of the pyramid to 4.
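A sketch of this coarse-to-fine search, reusing the `ncc`, `crop_interior`, and `align_brute_force` helpers from the previous snippet; the use of `skimage.transform.rescale` and the exact base-case handling are my own choices.

```python
import numpy as np
from skimage.transform import rescale

def align_pyramid(channel, reference, levels=4, radius=20):
    # Base case: at the top (smallest) level, run the full brute-force search.
    if levels == 1:
        return align_brute_force(channel, reference, radius)
    # Recurse on half-resolution images to get a coarse estimate.
    coarse = align_pyramid(rescale(channel, 0.5), rescale(reference, 0.5),
                           levels - 1, radius)
    dy0, dx0 = 2 * coarse[0], 2 * coarse[1]   # scale shift back up by 2
    # Refine: only search a small window around the doubled coarse shift.
    best_score, best_shift = -np.inf, (dy0, dx0)
    ref = crop_interior(reference)
    for dy in range(dy0 - 3, dy0 + 4):
        for dx in range(dx0 - 3, dx0 + 4):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(crop_interior(shifted), ref)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```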
Below are some successful colorized cases.
The translation vectors of each channel are given as [dy, dx], where dy is the displacement along the vertical axis and dx along the horizontal axis.
Below are the results for the other images provided in the data folder, using NCC as the metric.
In general, either SSD or NCC performs well on most images, and they give similar results. However, for images like Emir.tif, pixels in different channels have very different intensity distributions (different brightness), so SSD and NCC do not work well in those cases. Rather than using more advanced image-alignment methods such as keypoint matching based on a feature descriptor, a straightforward alternative is to measure the gradient difference between the two images, where $I_x$ and $I_y$ denote the horizontal and vertical image gradients: $$\sum_i\sum_j\left[(I^{(1)}_x[i,j]-I^{(2)}_x[i,j])^2+ (I^{(1)}_y[i,j]-I^{(2)}_y[i,j])^2\right]$$
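A sketch of this metric, using `np.gradient` to approximate $I_x$ and $I_y$; note that, unlike NCC, this score should be minimized in the search loop.

```python
import numpy as np

def gradient_difference(a, b):
    # Compare horizontal/vertical gradients rather than raw intensities,
    # which is robust to per-channel brightness differences (e.g. Emir.tif).
    ay, ax = np.gradient(a)   # np.gradient returns [d/d(axis0), d/d(axis1)]
    by, bx = np.gradient(b)
    return np.sum((ax - bx) ** 2 + (ay - by) ** 2)  # lower is better
```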
After applying the gradient difference as the alignment metric, the alignment quality improves significantly.
In the main part, I manually defined a fixed ratio of the image to consider in order to rule out the influence of the image borders. Instead of manual cropping, noticing that the borders of most images have pixel intensities quite different from the main content, we can devise an automatic cropping strategy. To this end, I designed an auto-cropping pipeline as follows (a sketch appears after the list):
1. Set all the white pixels on the image boundary to black (I[I > 250] = 0). This makes the image border look more consistent (mostly black).
2. Apply Otsu's threshold to the image. The intuition behind Otsu's method is to split the pixels into two classes by maximizing the inter-class variance. Given that the noise on the border is very different from the image content, this thresholding helps identify which regions to crop.
3. After thresholding, apply binary erosion to the image; this filters out tiny blobs and noisy points.
4. Find the leftmost column that contains a noticeable number of pixels of the second (content) class and mark it as the left boundary of the cropped image (i.e., I_new = I[:, leftmost:]). Apply the same procedure to the remaining three boundaries (right, top, bottom). I also limit the auto-cropping function to crop no more than a 0.4 fraction of the original image, which prevents failure in some extreme cases.
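A sketch of this pipeline, assuming a single-channel float image in [0, 1]; the erosion footprint, the "noticeable amount" threshold, and the per-side cap are hypothetical choices of mine.

```python
import numpy as np
from skimage.filters import threshold_otsu
from skimage.morphology import binary_erosion

def auto_crop(img, max_frac=0.4, min_count_frac=0.2):
    # img: 2D float array in [0, 1].
    work = img.copy()
    work[work > 250 / 255] = 0.0                    # step 1: white -> black
    mask = work > threshold_otsu(work)              # step 2: Otsu split
    mask = binary_erosion(mask, np.ones((5, 5)))    # step 3: remove tiny blobs
    h, w = mask.shape
    # Rows/columns with a "noticeable amount" of content-class pixels.
    col_ok = mask.sum(axis=0) > min_count_frac * h
    row_ok = mask.sum(axis=1) > min_count_frac * w
    # Step 4: first/last content column and row, each side capped at max_frac
    # of the image (a hypothetical reading of the 0.4 limit in the text).
    left   = min(int(np.argmax(col_ok)),       int(max_frac * w))
    right  = w - min(int(np.argmax(col_ok[::-1])), int(max_frac * w))
    top    = min(int(np.argmax(row_ok)),       int(max_frac * h))
    bottom = h - min(int(np.argmax(row_ok[::-1])), int(max_frac * h))
    return img[top:bottom, left:right]
```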
Higher contrast generally means that pixel intensities are distributed more evenly. Hence, if we want to make an image more contrasty, we can stretch out its pixel intensity distribution. A straightforward way to do this is histogram equalization: given an image, histogram equalization improves its contrast by stretching out the original histogram so that pixels are spread across all intensity levels (from 0 to 255).
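A minimal per-channel implementation of histogram equalization via the normalized CDF, assuming uint8 input (`cv2.equalizeHist` or `skimage.exposure.equalize_hist` would do the same job).

```python
import numpy as np

def equalize_histogram(channel):
    # channel: 2D uint8 array. Remap intensities through the normalized CDF
    # so the output histogram is spread across the full 0..255 range.
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())  # normalize to [0, 1]
    lut = np.round(255 * cdf).astype(np.uint8)         # lookup table
    return lut[channel]

# Apply independently to each color channel of an HxWx3 uint8 image:
# equalized = np.dstack([equalize_histogram(img[..., c]) for c in range(3)])
```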
In this project, I mainly tested two ways to do automatic white balancing.
Simple white balancing: The first method is quite simple. First we convert the RGB image to grayscale, then we pick the brightest pixel [ix, iy] and compute its difference from 255 on the R, G, and B channels, [dR, dG, dB]. We then add [dR, dG, dB] to the whole image and normalize the result so that it ranges from 0 to 255.
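A sketch of this simple strategy, assuming an HxWx3 float image in [0, 255]; the luminance weights are the standard BT.601 grayscale coefficients.

```python
import numpy as np

def simple_white_balance(img):
    # img: HxWx3 float array in [0, 255]. Shift so the brightest pixel
    # (by grayscale luminance) becomes white, then renormalize.
    gray = 0.299 * img[..., 0] + 0.587 * img[..., 1] + 0.114 * img[..., 2]
    iy, ix = np.unravel_index(np.argmax(gray), gray.shape)
    shift = 255.0 - img[iy, ix, :]               # [dR, dG, dB]
    out = img + shift                            # add to the whole image
    out = (out - out.min()) / (out.max() - out.min()) * 255.0
    return out
```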
Robust white balancing: The second method is more involved; the technique was proposed by Huo et al. [3]. The algorithm estimates the illuminant from pixels' YUV coordinates, i.e., it finds pixels that are close to gray under a threshold T: \((|U|+|V|)/Y < T\). It then iteratively adapts the gains on the R and B channels based on the estimated illuminant; once the U and V values of the gray points are close enough to neutral gray, the search stops.
(Note that noise on the boundary severely harms the performance of white balancing, so the images must be cropped first.)
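A simplified sketch of the idea in [3]/[4], assuming an HxWx3 float image in [0, 255]; the YUV matrix is BT.601, and the threshold T, step size, and stopping tolerance are hypothetical values (the original algorithm drives the gains with a proper feedback controller).

```python
import numpy as np

# RGB -> YUV conversion matrix (BT.601); the paper works in YUV space.
RGB2YUV = np.array([[ 0.299,  0.587,  0.114],
                    [-0.147, -0.289,  0.436],
                    [ 0.615, -0.515, -0.100]])

def robust_white_balance(img, T=0.3, step=0.02, tol=0.5, max_iter=100):
    # img: HxWx3 float array in [0, 255]. Iteratively adjust the R and B
    # gains until the near-gray pixels have nearly zero mean U and V.
    out = img.astype(float).copy()
    gain_r, gain_b = 1.0, 1.0
    for _ in range(max_iter):
        yuv = out @ RGB2YUV.T
        y, u, v = yuv[..., 0], yuv[..., 1], yuv[..., 2]
        gray = (np.abs(u) + np.abs(v)) / np.maximum(y, 1e-6) < T
        if not gray.any():
            break
        mu, mv = u[gray].mean(), v[gray].mean()
        if max(abs(mu), abs(mv)) < tol:
            break                          # gray points are neutral enough
        if abs(mv) >= abs(mu):
            gain_r -= step * np.sign(mv)   # V deviation is driven mainly by R
        else:
            gain_b -= step * np.sign(mu)   # U deviation is driven mainly by B
        out = img.astype(float) * np.array([gain_r, 1.0, gain_b])
    return np.clip(out, 0, 255)
```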
In general, we can observe that after white balancing the images become colder and more gray. The simple white balancing strategy works for some cases (e.g., self_portrait), but generally performs worse than the robust white balancing algorithm.
[1] https://en.wikipedia.org/wiki/Otsu%27s_method
[2] https://docs.opencv.org/3.4/d4/d1b/tutorial_histogram_equalization.html
[3] J. Huo, Y. Chang, J. Wang, and X. Wei, "Robust Automatic White Balance Algorithm using Gray Color Points in Images," 2006, pp. 541-546.
[4] http://web.stanford.edu/~sujason/ColorBalancing/robustawb.html