Abstract: Planar homography estimation is foundational to many computer vision problems, such as Simultaneous Localization and Mapping (SLAM) and Augmented Reality (AR). However, conditions of high variance confound even state-of-the-art algorithms. In this report, we analyze the performance of two recently published methods that use Convolutional Neural Networks (CNNs) and are meant to replace traditional feature-matching-based approaches to homography estimation. Our evaluation of the CNN-based methods focuses on performance under significant noise, illumination shift, and occlusion. We also measure the benefit of training CNNs on inputs corrupted with varying degrees of noise, and we compare the effect of using color rather than grayscale images as CNN inputs. Finally, we compare the results against baseline feature-matching-based homography estimation methods using SIFT, SURF, and ORB. We find that CNNs can be trained to be more robust to noise, but at a small cost to accuracy in the noiseless case. Additionally, CNNs perform significantly better than their feature-matching-based counterparts under conditions of extreme variance. With regard to color inputs, we conclude that, without changes to the CNN architecture that take advantage of the additional information in the color planes, the difference in performance between color and grayscale inputs is negligible. For CNNs trained on noise-corrupted inputs, we show that training a CNN at a specific noise magnitude produces a "Goldilocks Zone": a range of noise levels in which that CNN performs best.