Abstract:This paper presents a novel camera relocalization method, STDLoc, which leverages Feature Gaussian as scene representation. STDLoc is a full relocalization pipeline that can achieve accurate relocalization without relying on any pose prior. Unlike previous coarse-to-fine localization methods that require image retrieval first and then feature matching, we propose a novel sparse-to-dense localization paradigm. Based on this scene representation, we introduce a novel matching-oriented Gaussian sampling strategy and a scene-specific detector to achieve efficient and robust initial pose estimation. Furthermore, based on the initial localization results, we align the query feature map to the Gaussian feature field by dense feature matching to enable accurate localization. The experiments on indoor and outdoor datasets show that STDLoc outperforms current state-of-the-art localization methods in terms of localization accuracy and recall.
Abstract:Feature matching is an essential step in visual localization, where the accuracy of camera pose is mainly determined by the established 2D-3D correspondence. Due to the noise, solving the camera pose accurately requires a sufficient number of well-distributed 2D-3D correspondences. Existing 2D-3D feature matching is typically achieved by finding the nearest neighbors in the feature space, and then removing the outliers by some hand-crafted heuristics. However, this may lead to a large number of potentially true matches being missed or the established correct matches being filtered out. In this work, we introduce a novel 2D-3D matching method, Geometry-Aided Matching (GAM), which uses both appearance information and geometric context to improve 2D-3D feature matching. GAM can greatly strengthen the recall of 2D-3D matches while maintaining high precision. We insert GAM into a hierarchical visual localization pipeline and show that GAM can effectively improve the robustness and accuracy of localization. Extensive experiments show that GAM can find more correct matches than hand-crafted heuristics and learning baselines. Our proposed localization method achieves state-of-the-art results on multiple visual localization datasets. Experiments on Cambridge Landmarks dataset show that our method outperforms the existing state-of-the-art methods and is six times faster than the top-performed method.