Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Sérgio Agostinho

A Guide to Structureless Visual Localization

Apr 24, 2025

Vojtech Panek, Qunjie Zhou, Yaqing Ding, Sérgio Agostinho, Zuzana Kukelova, Torsten Sattler, Laura Leal-Taixé

Abstract:Visual localization algorithms, i.e., methods that estimate the camera pose of a query image in a known scene, are core components of many applications, including self-driving cars and augmented / mixed reality systems. State-of-the-art visual localization algorithms are structure-based, i.e., they store a 3D model of the scene and use 2D-3D correspondences between the query image and 3D points in the model for camera pose estimation. While such approaches are highly accurate, they are also rather inflexible when it comes to adjusting the underlying 3D model after changes in the scene. Structureless localization approaches represent the scene as a database of images with known poses and thus offer a much more flexible representation that can be easily updated by adding or removing images. Although there is a large amount of literature on structure-based approaches, there is significantly less work on structureless methods. Hence, this paper is dedicated to providing the, to the best of our knowledge, first comprehensive discussion and comparison of structureless methods. Extensive experiments show that approaches that use a higher degree of classical geometric reasoning generally achieve higher pose accuracy. In particular, approaches based on classical absolute or semi-generalized relative pose estimation outperform very recent methods based on pose regression by a wide margin. Compared with state-of-the-art structure-based approaches, the flexibility of structureless methods comes at the cost of (slightly) lower pose accuracy, indicating an interesting direction for future work.

Via

Access Paper or Ask Questions

Light3R-SfM: Towards Feed-forward Structure-from-Motion

Jan 24, 2025

Sven Elflein, Qunjie Zhou, Sérgio Agostinho, Laura Leal-Taixé

Figure 1 for Light3R-SfM: Towards Feed-forward Structure-from-Motion

Figure 2 for Light3R-SfM: Towards Feed-forward Structure-from-Motion

Figure 3 for Light3R-SfM: Towards Feed-forward Structure-from-Motion

Figure 4 for Light3R-SfM: Towards Feed-forward Structure-from-Motion

Abstract:We present Light3R-SfM, a feed-forward, end-to-end learnable framework for efficient large-scale Structure-from-Motion (SfM) from unconstrained image collections. Unlike existing SfM solutions that rely on costly matching and global optimization to achieve accurate 3D reconstructions, Light3R-SfM addresses this limitation through a novel latent global alignment module. This module replaces traditional global optimization with a learnable attention mechanism, effectively capturing multi-view constraints across images for robust and precise camera pose estimation. Light3R-SfM constructs a sparse scene graph via retrieval-score-guided shortest path tree to dramatically reduce memory usage and computational overhead compared to the naive approach. Extensive experiments demonstrate that Light3R-SfM achieves competitive accuracy while significantly reducing runtime, making it ideal for 3D reconstruction tasks in real-world applications with a runtime constraint. This work pioneers a data-driven, feed-forward SfM approach, paving the way toward scalable, accurate, and efficient 3D reconstruction in the wild.

Via

Access Paper or Ask Questions

(Just) A Spoonful of Refinements Helps the Registration Error Go Down

Aug 06, 2021

Sérgio Agostinho, Aljoša Ošep, Alessio Del Bue, Laura Leal-Taixé

Figure 1 for (Just) A Spoonful of Refinements Helps the Registration Error Go Down

Figure 2 for (Just) A Spoonful of Refinements Helps the Registration Error Go Down

Figure 3 for (Just) A Spoonful of Refinements Helps the Registration Error Go Down

Figure 4 for (Just) A Spoonful of Refinements Helps the Registration Error Go Down

Abstract:We tackle data-driven 3D point cloud registration. Given point correspondences, the standard Kabsch algorithm provides an optimal rotation estimate. This allows to train registration models in an end-to-end manner by differentiating the SVD operation. However, given the initial rotation estimate supplied by Kabsch, we show we can improve point correspondence learning during model training by extending the original optimization problem. In particular, we linearize the governing constraints of the rotation matrix and solve the resulting linear system of equations. We then iteratively produce new solutions by updating the initial estimate. Our experiments show that, by plugging our differentiable layer to existing learning-based registration methods, we improve the correspondence matching quality. This yields up to a 7% decrease in rotation error for correspondence-based data-driven registration methods.

* ICCV 2021 (Oral)

Via

Access Paper or Ask Questions

CvxPnPL: A Unified Convex Solution to the Absolute Pose Estimation Problem from Point and Line Correspondences

Aug 09, 2019

Sérgio Agostinho, João Gomes, Alessio Del Bue

Figure 1 for CvxPnPL: A Unified Convex Solution to the Absolute Pose Estimation Problem from Point and Line Correspondences

Figure 2 for CvxPnPL: A Unified Convex Solution to the Absolute Pose Estimation Problem from Point and Line Correspondences

Figure 3 for CvxPnPL: A Unified Convex Solution to the Absolute Pose Estimation Problem from Point and Line Correspondences

Figure 4 for CvxPnPL: A Unified Convex Solution to the Absolute Pose Estimation Problem from Point and Line Correspondences

Abstract:We present a new convex method to estimate 3D pose from mixed combinations of 2D-3D point and line correspondences, the Perspective-n-Points-and-Lines problem (PnPL). We merge the contributions of each point and line into a unified Quadratic Constrained Quadratic Problem (QCQP) and then relax it into a Semi Definite Program (SDP) through Shor's relaxation. This makes it possible to gracefully handle mixed configurations of points and lines. Furthermore, the proposed relaxation allows us to recover a finite number of solutions under ambiguous configurations. In such cases, the 3D pose candidates are found by further enforcing geometric constraints on the solution space and then retrieving such poses from the intersections of multiple quadrics. Experiments provide results in line with the best performing state of the art methods while providing the flexibility of solving for an arbitrary number of points and lines.

* Main paper and supplemental material included. References added and minor change to fig 1

Via

Access Paper or Ask Questions