Abstract:This paper addresses the challenges in learning-based monocular positioning by proposing VKFPos, a novel approach that integrates Absolute Pose Regression (APR) and Relative Pose Regression (RPR) via an Extended Kalman Filter (EKF) within a variational Bayesian inference framework. Our method shows that the essential posterior probability of the monocular positioning problem can be decomposed into APR and RPR components. This decomposition is embedded in the deep learning model by predicting covariances in both APR and RPR branches, allowing them to account for associated uncertainties. These covariances enhance the loss functions and facilitate EKF integration. Experimental evaluations on both indoor and outdoor datasets show that the single-shot APR branch achieves accuracy on par with state-of-the-art methods. Furthermore, for temporal positioning, where consecutive images allow for RPR and EKF integration, VKFPos outperforms temporal APR and model-based integration methods, achieving superior accuracy.
Abstract:Robust estimation is essential in computer vision, robotics, and navigation, aiming to minimize the impact of outlier measurements for improved accuracy. We present a fast algorithm for Geman-McClure robust estimation, FracGM, leveraging fractional programming techniques. This solver reformulates the original non-convex fractional problem to a convex dual problem and a linear equation system, iteratively solving them in an alternating optimization pattern. Compared to graduated non-convexity approaches, this strategy exhibits a faster convergence rate and better outlier rejection capability. In addition, the global optimality of the proposed solver can be guaranteed under given conditions. We demonstrate the proposed FracGM solver with Wahba's rotation problem and 3-D point-cloud registration along with relaxation pre-processing and projection post-processing. Compared to state-of-the-art algorithms, when the outlier rates increase from 20\% to 80\%, FracGM shows 53\% and 88\% lower rotation and translation increases. In real-world scenarios, FracGM achieves better results in 13 out of 18 outcomes, while having a 19.43\% improvement in the computation time.