16-726 Assignment 1
Jason Zhang (jasonyzhang@cmu.edu)
Alignment
I began by exhaustively searching over possible displacements and scoring each
possible displacement using normalized cross-correlation (NCC).
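A minimal sketch of the exhaustive search, assuming single-channel NumPy arrays of equal size. The function names, the `max_shift` window, and the use of `np.roll` (which wraps pixels around the border) are illustrative choices, not necessarily what my code does.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equally sized arrays."""
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def align_exhaustive(moving, reference, max_shift=15):
    """Score every (dy, dx) in a square window and return the best one."""
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # np.roll wraps around the image border (a simplifying assumption)
            shifted = np.roll(moving, (dy, dx), axis=(0, 1))
            score = ncc(shifted, reference)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

The cost grows with the square of `max_shift` times the number of pixels, which is why this gets slow on the large scans.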
For larger images, this quickly became computationally infeasible, so I implemented an
image pyramid to check over a smaller window at a variety of scales. This
significantly improved runtime.
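The coarse-to-fine idea can be sketched as below. This is an assumption-laden illustration: the 2x2 block averaging, the `min_size`, `radius`, and `coarse_radius` parameters, and the wrap-around shifts are all hypothetical choices made to keep the example short.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def search(moving, reference, center, radius):
    """Exhaustive NCC search in a small window around `center`."""
    best_score, best = -np.inf, center
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            score = ncc(np.roll(moving, (dy, dx), axis=(0, 1)), reference)
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def align_pyramid(moving, reference, min_size=32, radius=2, coarse_radius=4):
    """Estimate the shift at the coarsest level, then refine at each finer level."""
    moving = moving.astype(np.float64)
    reference = reference.astype(np.float64)
    if min(moving.shape) <= min_size:
        return search(moving, reference, (0, 0), coarse_radius)
    coarse = align_pyramid(downsample(moving), downsample(reference),
                           min_size, radius, coarse_radius)
    # A shift of 1 pixel at the coarse level is 2 pixels at this level
    center = (2 * coarse[0], 2 * coarse[1])
    return search(moving, reference, center, radius)
```

Each level only checks a (2 * radius + 1)^2 window, so the total work is dominated by a few dozen NCC evaluations at full resolution instead of thousands.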
For better matching, I tried the following edge detection methods:
Canny Edge Detector: Smooths the image with a Gaussian filter and
applies a difference filter. To reduce noise, the algorithm uses two
thresholds: it keeps pixels that are above the higher threshold, and keeps
pixels that are above the lower threshold only if they are also connected to
pixels above the higher threshold. I didn't implement this myself; I just used
cv2.Canny.
The Sobel filter and Canny Edge Detector both performed reasonably well.
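To make the two-threshold step concrete, here is a rough sketch of hysteresis thresholding on a gradient-magnitude array. It is only that one step (the real detector also thins edges before thresholding), the function name is hypothetical, and 8-connectivity is an assumption; in practice I relied on cv2.Canny.

```python
from collections import deque
import numpy as np

def hysteresis_threshold(magnitude, low, high):
    """Keep strong pixels, plus weak pixels connected to a strong pixel."""
    strong = magnitude > high
    weak = magnitude > low
    keep = strong.copy()
    queue = deque(zip(*np.nonzero(strong)))
    h, w = magnitude.shape
    while queue:
        y, x = queue.popleft()
        # Grow outward from kept pixels through 8-connected weak pixels
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and weak[ny, nx] and not keep[ny, nx]:
                    keep[ny, nx] = True
                    queue.append((ny, nx))
    return keep
```

Weak responses that are isolated from any strong edge are dropped, which is what suppresses the noise.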
Below is a visualization of the different levels of the image pyramid. From
left to right: resized image, Canny edge detector applied to R, G, and B
channels respectively. The edges in the R and G channels were then compared to
the edges with the B channel using NCC.
Cropping
To crop the image to remove the borders, I first applied a Sobel filter to the
gray-scale image to get horizontal and vertical edge detectors.
The borders were picked up relatively strongly by the edge detectors.
I averaged the vertical edge detector along the horizontal dimension and the
horizontal edge detector along the vertical dimension. The indices with large
means thus correspond to the borders of the picture.
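A minimal sketch of this border search, assuming a 2D grayscale NumPy array. I read "vertical edge detector" as the gradient along the vertical axis (which responds to horizontal border lines); that reading, the `threshold` and `margin_frac` parameters, and the function names are assumptions for illustration. The smoothing step is omitted.

```python
import numpy as np

def sobel(gray):
    """Return Sobel gradients (gx along columns, gy along rows)."""
    g = np.pad(gray.astype(np.float64), 1, mode="edge")
    gx = (g[:-2, 2:] + 2 * g[1:-1, 2:] + g[2:, 2:]) \
        - (g[:-2, :-2] + 2 * g[1:-1, :-2] + g[2:, :-2])
    gy = (g[2:, :-2] + 2 * g[2:, 1:-1] + g[2:, 2:]) \
        - (g[:-2, :-2] + 2 * g[:-2, 1:-1] + g[:-2, 2:])
    return gx, gy

def crop_bounds(profile, threshold=0.5, margin_frac=0.1):
    """Find (start, stop) just inside the strongest responses near each end."""
    n = len(profile)
    m = max(1, int(n * margin_frac))
    peak = profile.max()
    p = profile / peak if peak > 0 else profile
    lead = np.nonzero(p[:m] > threshold)[0]
    trail = np.nonzero(p[n - m:] > threshold)[0]
    start = lead[-1] + 1 if lead.size else 0
    stop = n - m + trail[0] if trail.size else n
    return int(start), int(stop)

def auto_crop(gray):
    """Crop away borders that show up as strong, axis-aligned edges."""
    gx, gy = sobel(gray)
    row_profile = np.abs(gy).mean(axis=1)  # one value per row
    col_profile = np.abs(gx).mean(axis=0)  # one value per column
    top, bottom = crop_bounds(row_profile)
    left, right = crop_bounds(col_profile)
    return gray[top:bottom, left:right]
```

Only the outer `margin_frac` of each side is searched, so strong edges in the middle of the picture are not mistaken for borders.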
Here are some visualizations. The blue lines on the margins are the mean values
from the edge detectors (after smoothing and normalizing). The red lines are
the thresholds to crop.
Failure Modes
This method assumes the borders are axis-aligned and produce a sharp gradient.
For the most part, the borders were axis-aligned, but the gray-scale versions
of the images did not always have sharp edges. For example, the cathedral's
yellow bar has a very similar luminance to the blue sky.
Perhaps this method would have worked better if applied on each channel
individually rather than on the grayscale image.
Recoloring
I began by playing around with the temperature of the color palette by simply
rescaling the channels. I think there was some improvement since indoor pictures
should have warmer lighting and outdoor pictures should have cooler lighting.
Overall, the effect was negligible and in many cases actually looked worse.
Then, I tried a bunch of techniques with varying success:
White World (2nd column): I rescaled the color channels such that the
brightest pixel is always (255, 255, 255). There was no really noticeable difference.
Gray World (3rd column): I rescaled the R and B channels such that the average
intensity was the same across all three channels. This looked bad.
Histogram equalization (4th column): I converted images to LAB space then used
the cdf to spread the intensities of the image. Then I converted the image back
to color. This significantly increased the contrast in the image, making colors
look more saturated. However, images with both dark regions and light regions
look very unrealistic since the global contrast is too high.
CLAHE (5th column): To reduce the effects of using global contrast,
adaptive histogram equalization splits the image into tiles and applies
equalization to smaller areas of the image. This produces a result with more
realistic looking lighting throughout. I just used cv2.createCLAHE for this
part.
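The first three techniques above can be sketched in plain NumPy. These are illustrative versions under stated assumptions: images are H x W x 3 uint8 arrays with green at index 1, White World scales each channel by its own maximum, and `equalize` works on a single uint8 channel (for example the L channel after a LAB conversion, which is not shown here). The function names are hypothetical.

```python
import numpy as np

def white_world(img):
    """Scale each channel so its brightest value maps to 255."""
    img = img.astype(np.float64)
    peak = img.reshape(-1, 3).max(axis=0)
    peak[peak == 0] = 1.0  # avoid dividing by zero on an empty channel
    return np.clip(np.rint(img * (255.0 / peak)), 0, 255).astype(np.uint8)

def gray_world(img):
    """Scale the two non-green channels so all channel means match green's."""
    img = img.astype(np.float64)
    means = img.reshape(-1, 3).mean(axis=0)
    gains = means[1] / np.where(means > 0, means, 1.0)
    gains[1] = 1.0  # green is left untouched
    return np.clip(np.rint(img * gains), 0, 255).astype(np.uint8)

def equalize(channel):
    """Global histogram equalization of one uint8 channel via its CDF."""
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    span = cdf[-1] - cdf_min
    if span == 0:  # constant image, nothing to spread
        return channel.copy()
    lut = np.clip(np.rint((cdf - cdf_min) / span * 255), 0, 255).astype(np.uint8)
    return lut[channel]
```

Because `equalize` uses one lookup table for the whole image, dark and bright regions are stretched by the same mapping, which is the global-contrast problem that the tiled approach in CLAHE is meant to reduce.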
Offset
image | disp_r | disp_g |
cathedral | (12, -3) | (-5, -2) |
emir | (-107, -40) | (-49, -24) |
harvesters | (-124, -14) | (-60, -18) |
icon | (-90, -23) | (-39, -16) |
lady | (-120, -13) | (-56, -10) |
monastery | (-3, -2) | (3, -2) |
nativity | (-8, 0) | (-3, -1) |
self_portrait | (-175, -37) | (-77, -29) |
settlers | (-14, 1) | (-7, 0) |
three_generations | (-111, -8) | (-58, -17) |
train | (-85, -29) | (-43, -5) |
turkmen | (-118, -29) | (-57, -22) |
village | (-137, -21) | (-65, -11) |
chalice | (-4, -2) | (-1, -1) |
poles | (-6, -3) | (-2, -2) |