Project 4: [Auto] Stitching Photo Mosaics

This project focuses on image warping and mosaicing: creating a mosaic of two or more photographs through registration, projective warping, resampling, and compositing. The steps are described below.

Part 4A: Image Warping and Mosaicing

Part 1: Shooting and Digitizing Pictures

I captured multiple images of Roaster's Coffee Pack, the outside of my house, and my desk, ensuring significant overlap between the images to facilitate mosaicing. Below are some of the original images used for this project:

Part 2: Recovering Homographies

Next, I selected corresponding points between the two images. Using these points, I computed the homography matrix to warp one image onto the other. Below are the images with the corresponding points highlighted:

Part 3 & 4: Warping and Rectifying the Images

Using the homography matrix, I warped both images to align them. Before blending, I performed rectification by transforming one of the images to ensure alignment. The images below show the warping and rectification process:
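
The warping step can be sketched as an inverse warp: every output pixel is mapped back through the inverse homography and sampled from the source image. This is a minimal illustration, not the project's actual code. The function name `warp_image`, the nearest-neighbor sampling, and the caller-supplied canvas size are all assumptions made for brevity.

```python
import numpy as np

def warp_image(img, H, out_shape):
    """Inverse-warp img into a canvas of shape out_shape = (height, width).

    H maps source (x, y, 1) coordinates to output coordinates.
    Returns the warped image and a boolean mask of pixels that received data.
    """
    out_h, out_w = out_shape
    # Homogeneous coordinates (x, y, 1) of every output pixel.
    ys, xs = np.mgrid[0:out_h, 0:out_w]
    dst = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    # Map output pixels back into the source image and dehomogenize.
    src = np.linalg.inv(H) @ dst
    sx = np.rint(src[0] / src[2]).astype(int)  # nearest-neighbor sampling (assumption)
    sy = np.rint(src[1] / src[2]).astype(int)
    # Keep only samples that land inside the source image.
    valid = (sx >= 0) & (sx < img.shape[1]) & (sy >= 0) & (sy < img.shape[0])
    out = np.zeros((out_h * out_w,) + img.shape[2:], dtype=img.dtype)
    out[valid] = img[sy[valid], sx[valid]]
    return out.reshape((out_h, out_w) + img.shape[2:]), valid.reshape(out_h, out_w)
```

Inverse warping avoids the holes that forward warping leaves in the output. Bilinear interpolation would give smoother results than the nearest-neighbor lookup used here.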

More examples:

Part 5: Blending into a Mosaic

Finally, I blended the images together using feathering and weighted averaging to produce the mosaic. The result is displayed below:
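
Feathering with weighted averaging can be sketched as follows, assuming the images have already been warped onto a common canvas. The helper names `feather_weights` and `blend` are hypothetical, and the weight map (distance to the nearest image border) is one common choice rather than necessarily the one used in this project.

```python
import numpy as np

def feather_weights(h, w):
    """Weight map that is largest at the image centre and smallest at the border."""
    ys = np.minimum(np.arange(h) + 1, np.arange(h)[::-1] + 1)
    xs = np.minimum(np.arange(w) + 1, np.arange(w)[::-1] + 1)
    wts = np.minimum.outer(ys, xs).astype(float)
    return wts / wts.max()

def blend(images, weights, eps=1e-8):
    """Per-pixel weighted average of aligned images.

    Each weight map must be 0 wherever its image has no data.
    """
    num = np.zeros_like(images[0], dtype=float)
    den = np.zeros(images[0].shape[:2], dtype=float)
    for img, wt in zip(images, weights):
        # Broadcast the 2-D weight map over colour channels if present.
        num += (wt[..., None] if img.ndim == 3 else wt) * img
        den += wt
    den = np.maximum(den, eps)  # avoid dividing by zero where no image contributes
    return num / (den[..., None] if num.ndim == 3 else den)
```

In a full pipeline, each weight map would be warped with the same homography as its image so that the weights stay aligned with the pixels they describe.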

Part 4B: Feature Detection, Description, and Matching

Steps 1-3: ANMS, Feature Descriptor Extraction, and Feature Matching

  1. Step 1: Adaptive Non-Maximal Suppression (ANMS)

    To refine the Harris corners detected in each image, I implemented Adaptive Non-Maximal Suppression (ANMS) based on Section 3 of the paper “Multi-Image Matching using Multi-Scale Oriented Patches” by Brown et al. ANMS selects high-confidence features while ensuring they are spatially distributed, which is critical for effective matching.
    The ANMS process involves:

    • Sorting corners by their Harris response strength.
    • Assigning a suppression radius to each corner: the distance to the nearest corner whose response, scaled by a robustness threshold (`robust_thresh=0.9`), still exceeds its own.
    • Selecting the top 1000 corners with the largest radii, ensuring spatial spread across the image.
    Below is a figure showing the ANMS-selected corners overlaid on the image.

    [Figure: Harris Corners]
    [Figure: Selected Corners after ANMS]
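
The ANMS steps listed above can be sketched in a few lines of NumPy. This is an illustrative version under stated assumptions: the function name `anms` and its signature are hypothetical, Harris responses are assumed to be positive, and the brute-force pairwise distance matrix uses O(N^2) memory, which is only practical for a few thousand corners.

```python
import numpy as np

def anms(coords, strengths, n_keep=1000, robust_thresh=0.9):
    """Adaptive non-maximal suppression.

    coords: (N, 2) corner positions; strengths: (N,) Harris responses.
    Returns the indices of the kept corners and every corner's radius.
    """
    coords = np.asarray(coords, dtype=float)
    strengths = np.asarray(strengths, dtype=float)
    # Pairwise squared distances between all corners.
    diff = coords[:, None, :] - coords[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)
    # Corner j suppresses corner i when f_i < robust_thresh * f_j.
    suppressed_by = strengths[:, None] < robust_thresh * strengths[None, :]
    d2 = np.where(suppressed_by, d2, np.inf)
    # Suppression radius: distance to the nearest suppressing corner
    # (infinite for the globally strongest corner).
    radii = np.sqrt(d2.min(axis=1))
    keep = np.argsort(-radii)[:n_keep]
    return keep, radii
```

Sorting by radius rather than by raw strength is what spreads the selected corners across the image: a moderately strong corner in an otherwise empty region gets a large radius and is kept.
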
  2. Step 2: Feature Descriptor Extraction

    For each corner selected by ANMS, I extracted a feature descriptor by downsampling a 40x40 window centered on the corner to an 8x8 patch (a factor of 5 in each dimension). Sampling from the larger window produces a blurred, stable descriptor that is less sensitive to small errors in corner location.
    The descriptors are bias/gain-normalized to have zero mean and unit variance, which makes them more resistant to brightness and contrast changes between images. For simplicity I did not implement rotation invariance; the axis-aligned 8x8 patches were still effective for matching.

    [Figure: Selected Corners after ANMS]
    [Figure: 40x40 Window]
    [Figure: 8x8 Descriptor]
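
A minimal sketch of the descriptor extraction described above. The name `extract_descriptor` is hypothetical, and averaging each 5x5 block is used here as a simple stand-in for blurring followed by subsampling; the project may have used a Gaussian blur or a library resize instead.

```python
import numpy as np

def extract_descriptor(gray, y, x, window=40, size=8):
    """Return a normalized size*size descriptor for the corner at (y, x).

    gray is a 2-D grayscale image. Returns None if the window leaves the image.
    """
    half = window // 2
    if (y - half < 0 or x - half < 0
            or y + half > gray.shape[0] or x + half > gray.shape[1]):
        return None
    patch = gray[y - half:y + half, x - half:x + half].astype(float)
    step = window // size  # 5 for a 40x40 window and an 8x8 descriptor
    # Average each step x step block: low-pass filter and downsample in one pass.
    small = patch.reshape(size, step, size, step).mean(axis=(1, 3))
    # Bias/gain normalization: zero mean, unit variance.
    small = small - small.mean()
    std = small.std()
    if std > 0:
        small = small / std
    return small.ravel()
```

Because of the normalization, a descriptor computed from `2 * gray + 5` is identical to the one computed from `gray`, which is the brightness and contrast invariance the text refers to.
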
  3. Step 3: Feature Matching using Lowe’s Ratio Test

    Using the feature descriptors from Step 2, I matched features between images by applying Lowe's ratio test. For each descriptor in the first image, I identified the two closest descriptors in the second image. A match is retained if the ratio of the distance to the closest neighbor over the distance to the second-closest neighbor is below 0.8. This threshold filters out ambiguous matches: for an incorrect match, the second-closest descriptor is usually almost as close as the closest one, so the ratio is near 1.
    This step significantly reduces mismatches, as demonstrated in the figure below showing matched features between images.

    [Figure: Matched Features after Lowe’s Ratio Test]
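
The ratio test can be sketched as below. The function name `match_features` is hypothetical, the distance is assumed to be plain Euclidean distance between flattened descriptors, and the second image is assumed to contain at least two descriptors.

```python
import numpy as np

def match_features(desc1, desc2, ratio=0.8):
    """Match rows of desc1 to rows of desc2 using Lowe's ratio test.

    desc1: (N1, D), desc2: (N2, D) with N2 >= 2.
    Returns an (M, 2) array of index pairs (i in desc1, j in desc2).
    """
    # All pairwise Euclidean distances, shape (N1, N2).
    d = np.linalg.norm(desc1[:, None, :] - desc2[None, :, :], axis=2)
    order = np.argsort(d, axis=1)
    idx = np.arange(len(desc1))
    nn1, nn2 = order[:, 0], order[:, 1]  # closest and second-closest neighbors
    # Keep a match only when the best neighbor is clearly better than the runner-up.
    good = d[idx, nn1] < ratio * d[idx, nn2]
    return np.column_stack([idx[good], nn1[good]])
```

A descriptor whose two nearest neighbors are equally close has a ratio of 1 and is rejected, while a descriptor with one very close neighbor and a distant runner-up is kept.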

Step 4: Robust Homography Estimation with RANSAC

To achieve accurate alignment between images based on matched features, I implemented a 4-point RANSAC (Random Sample Consensus) algorithm to compute a robust homography estimate.

RANSAC iteratively selects four matched point pairs at random and computes a candidate homography matrix from these four points alone. Because each candidate is fit to a minimal sample, any sample that happens to contain no outliers produces a homography that the outliers cannot distort.

Using these four correspondences, I compute the homography matrix by setting up a system of linear equations based on the Direct Linear Transformation (DLT) algorithm. Given points (x, y) in the source image and corresponding points (x', y') in the destination image, each correspondence contributes two linear equations in the entries of the transformation matrix, so four correspondences are enough to determine its eight degrees of freedom.
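
As a sketch of the DLT setup: writing the homography entries as h11 through h33, each correspondence gives

    x' (h31 x + h32 y + h33) = h11 x + h12 y + h13
    y' (h31 x + h32 y + h33) = h21 x + h22 y + h23

which are linear in the nine entries. The version below stacks these rows and takes the null vector of the system via SVD. The function name `compute_homography` is hypothetical, and solving by SVD is an assumption; fixing h33 = 1 and solving by least squares is an equally common alternative.

```python
import numpy as np

def compute_homography(src, dst):
    """Estimate H such that dst ~ H @ src from N >= 4 point pairs.

    src, dst: sequences of (x, y) points in the source and destination images.
    """
    rows = []
    for (x, y), (xp, yp) in zip(src, dst):
        # Two linear equations per correspondence, rearranged to A @ h = 0.
        rows.append([-x, -y, -1, 0, 0, 0, x * xp, y * xp, xp])
        rows.append([0, 0, 0, -x, -y, -1, x * yp, y * yp, yp])
    A = np.array(rows, dtype=float)
    # The right singular vector with the smallest singular value
    # minimizes |A @ h| subject to |h| = 1.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the scale so that h33 = 1 (assumes h33 != 0)
```

With exactly four correspondences in general position the system has an exact solution; with more, the same code returns a least-squares fit.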

Once the homography is computed, it is scored by its number of inliers: the matches for which the transformed source point falls within a specified distance threshold (e.g., 200 pixels) of its matched destination point. By iterating this process 1000 times, RANSAC identifies the homography matrix with the highest inlier count.

This final homography with the most inliers effectively aligns the images by filtering out outlier matches, which is essential for seamless mosaic creation. The figure below demonstrates inlier matches identified after RANSAC, confirming the robust alignment achieved.

[Figure: Inliers Identified by RANSAC (69 correspondences)]
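
The RANSAC loop described above can be sketched as follows. This is an illustration under assumptions, not the project's code: the names `fit_homography`, `project`, and `ransac_homography` are hypothetical, the four-point fit uses an SVD-based DLT, and the fixed `seed` is only there to make runs repeatable.

```python
import numpy as np

def fit_homography(src, dst):
    """DLT fit of H (dst ~ H @ src) from N >= 4 point pairs."""
    rows = []
    for (x, y), (xp, yp) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, x * xp, y * xp, xp])
        rows.append([0, 0, 0, -x, -y, -1, x * yp, y * yp, yp])
    _, _, vt = np.linalg.svd(np.array(rows, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]

def project(H, pts):
    """Apply H to (N, 2) points and dehomogenize."""
    p = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    return p[:, :2] / p[:, 2:3]

def ransac_homography(src, dst, thresh, n_iters=1000, seed=0):
    """Return the candidate H with the most inliers and its inlier mask."""
    rng = np.random.default_rng(seed)
    best_H, best_inliers = None, np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        pick = rng.choice(len(src), size=4, replace=False)
        # Degenerate samples can divide by zero; the resulting NaN errors
        # fail the threshold test and so count as zero inliers.
        with np.errstate(all="ignore"):
            H = fit_homography(src[pick], dst[pick])
            err = np.linalg.norm(project(H, src) - dst, axis=1)
        inliers = err < thresh
        if inliers.sum() > best_inliers.sum():
            best_H, best_inliers = H, inliers
    return best_H, best_inliers
```

A common refinement, not described in this write-up, is to refit the homography to all inliers of the best candidate, which averages out noise in the four sampled points.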

Here are more examples:

Reflection: What I Learned

The most exciting part was implementing the Adaptive Non-Maximal Suppression and seeing how it improves feature spread, making matches more reliable. Also, Lowe’s ratio test for feature matching was a cool tool that helped isolate accurate feature correspondences.