Hongyi Fan

Multiview Image-Based Localization

Mar 30, 2025

Cameron Fiore, Hongyi Fan, Benjamin Kimia

Abstract: The image retrieval (IR) approach to image localization has distinct advantages over the 3D and the deep learning (DNN) approaches: it is scene-agnostic, simpler to implement and use, has no privacy issues, and is computationally efficient. The main drawback of this approach is relatively poor localization in both position and orientation of the query camera when compared to the competing approaches. This paper presents a hybrid approach that stores only image features in the database, like some IR methods, but relies on a latent 3D reconstruction, like 3D methods, without retaining a 3D scene reconstruction. The approach is based on two ideas: (i) a novel proposal where query camera center estimation relies only on relative translation estimates but not relative rotation estimates through a decoupling of the two, and (ii) a shift from computing optimal pose from estimated relative pose to computing optimal pose from multiview correspondences, thus cutting out the "middle-man". Our approach shows improved performance on the 7-Scenes and Cambridge Landmarks datasets while also improving on timing and memory footprint as compared to state-of-the-art.
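
As a rough illustration of idea (i), the sketch below estimates a query camera center using only database camera centers and the directions from each of them toward the query, with no relative rotation involved. It is a standard least-squares intersection of 3D lines, not the authors' algorithm. The names `estimate_center` and `solve3` are hypothetical, and the directions are assumed to be already expressed in the world frame.

```python
def estimate_center(centers, directions):
    """Least-squares intersection of the 3D lines c_i + s * d_i.

    Assumption: each direction d_i points from database camera i toward
    the query camera and is given in world coordinates.
    """
    # Accumulate A = sum_i (I - u_i u_i^T) and b = sum_i (I - u_i u_i^T) c_i,
    # where u_i is the unit vector along d_i.
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for c, d in zip(centers, directions):
        norm = sum(x * x for x in d) ** 0.5
        u = [x / norm for x in d]
        for r in range(3):
            for k in range(3):
                p = (1.0 if r == k else 0.0) - u[r] * u[k]
                A[r][k] += p
                b[r] += p * c[k]
    return solve3(A, b)


def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate configuration: lines are nearly parallel")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / M[r][r]
    return x
```

Each line contributes only a direction, so the estimate needs at least two non-parallel directions; with noisy directions the result is the point closest to all lines in the squared-distance sense.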

Integrating 3D Slicer with a Dynamic Simulator for Situational Aware Robotic Interventions

Jan 22, 2024

Manish Sahu, Hisashi Ishida, Laura Connolly, Hongyi Fan, Anton Deguet, Peter Kazanzides, Francis X. Creighton, Russell H. Taylor, Adnan Munawar

Abstract: Image-guided robotic interventions represent a transformative frontier in surgery, blending advanced imaging and robotics for improved precision and outcomes. This paper addresses the critical need for integrating open-source platforms to enhance situational awareness in image-guided robotic research. We present an open-source toolset that seamlessly combines a physics-based constraint formulation framework, AMBF, with a state-of-the-art imaging platform application, 3D Slicer. Our toolset facilitates the creation of highly customizable interactive digital twins that incorporate processing and visualization of medical imaging, robot kinematics, and scene dynamics for real-time robot control. Through a feasibility study, we showcase real-time synchronization of a physical robotic interventional environment in both 3D Slicer and AMBF, highlighting low-latency updates and improved visualization.

* *These authors contributed equally

Condition numbers in multiview geometry, instability in relative pose estimation, and RANSAC

Oct 04, 2023

Hongyi Fan, Joe Kileel, Benjamin Kimia

Abstract: In this paper we introduce a general framework for analyzing the numerical conditioning of minimal problems in multiple view geometry, using tools from computational algebra and Riemannian geometry. Special motivation comes from the fact that relative pose estimation, based on standard 5-point or 7-point Random Sample Consensus (RANSAC) algorithms, can fail even when no outliers are present and there is enough data to support a hypothesis. We argue that these cases arise due to the intrinsic instability of the 5- and 7-point minimal problems. We apply our framework to characterize the instabilities, both in terms of the world scenes that lead to infinite condition number, and directly in terms of ill-conditioned image data. The approach produces computational tests for assessing the condition number before solving the minimal problem. Lastly, synthetic and real data experiments suggest that RANSAC serves not only to remove outliers, but also to select for well-conditioned image data, as predicted by our theory.
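
The idea of testing conditioning before solving a minimal problem can be shown on a much simpler problem than the 5- and 7-point ones. The sketch below is a toy analogy only: it fits a 2D line with RANSAC and skips 2-point samples whose slope is overly sensitive to noise. The sensitivity proxy `1 / |x2 - x1|`, the function names, and all thresholds are assumptions made for illustration and are not the condition numbers derived in the paper.

```python
import random


def slope_sensitivity(p, q):
    """Toy condition proxy: |d(slope)/d(y)| for the line through p and q."""
    dx = abs(q[0] - p[0])
    return float("inf") if dx == 0 else 1.0 / dx


def ransac_line(points, threshold=0.1, max_sensitivity=2.0, iters=200, seed=0):
    """RANSAC line fit that rejects ill-conditioned minimal samples up front."""
    rng = random.Random(seed)
    best, best_inliers = None, -1
    for _ in range(iters):
        p, q = rng.sample(points, 2)
        if slope_sensitivity(p, q) > max_sensitivity:
            continue  # the conditioning test runs before the minimal solver
        m = (q[1] - p[1]) / (q[0] - p[0])  # minimal solver: slope
        c = p[1] - m * p[0]                # minimal solver: intercept
        inliers = sum(abs(y - (m * x + c)) < threshold for x, y in points)
        if inliers > best_inliers:
            best, best_inliers = (m, c), inliers
    return best, best_inliers
```

Here the ill-posed locus is the set of samples with equal x-coordinates, where the sensitivity is infinite; samples close to that locus are discarded without being solved.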

On the Instability of Relative Pose Estimation and RANSAC's Role

Dec 29, 2021

Hongyi Fan, Joe Kileel, Benjamin Kimia

Abstract: In this paper we study the numerical instabilities of the 5- and 7-point problems for essential and fundamental matrix estimation in multiview geometry. In both cases we characterize the ill-posed world scenes where the condition number for epipolar estimation is infinite. We also characterize the ill-posed instances in terms of the given image data. To arrive at these results, we present a general framework for analyzing the conditioning of minimal problems in multiview geometry, based on Riemannian manifolds. Experiments with synthetic and real-world data then reveal a striking conclusion: Random Sample Consensus (RANSAC) in Structure-from-Motion (SfM) serves not only to filter out outliers but also to select for well-conditioned image data, sufficiently separated from the ill-posed locus that our theory predicts. Our findings suggest that, in future work, one could try to accelerate and increase the success of RANSAC by testing only well-conditioned image data.

* 27 pages, 11 figures, 2 tables

Benchmarking Pedestrian Odometry: The Brown Pedestrian Odometry Dataset (BPOD)

Dec 24, 2021

David Charatan, Hongyi Fan, Benjamin Kimia

Abstract: We present the Brown Pedestrian Odometry Dataset (BPOD) for benchmarking visual odometry algorithms in head-mounted pedestrian settings. This dataset was captured using synchronized global and rolling shutter stereo cameras in 12 diverse indoor and outdoor locations on Brown University's campus. Compared to existing datasets, BPOD contains more image blur and self-rotation, which are common in pedestrian odometry but rare elsewhere. Ground-truth trajectories are generated from stick-on markers placed along the pedestrian's path, and the pedestrian's position is documented using a third-person video. We evaluate the performance of representative direct, feature-based, and learning-based VO methods on BPOD. Our results show that significant development is needed to successfully capture pedestrian trajectories. The link to the dataset is here: https://doi.org/10.26300/c1n7-7p93

GPU-Based Homotopy Continuation for Minimal Problems in Computer Vision

Dec 13, 2021

Chiang-Heng Chien, Hongyi Fan, Ahmad Abdelfattah, Elias Tsigaridas, Stanimire Tomov, Benjamin Kimia

Abstract: Systems of polynomial equations arise frequently in computer vision, especially in multiview geometry problems. Traditional methods for solving these systems typically aim to eliminate variables to reach a univariate polynomial, e.g., a tenth-order polynomial for 5-point pose estimation, using clever manipulations, or more generally using Gröbner bases, resultants, and elimination templates, leading to successful algorithms for multiview geometry and other problems. However, these methods do not work when the problem is complex, and when they do work, they face efficiency and stability issues. Homotopy Continuation (HC) can solve more complex problems without the stability issues, and with guarantees of a global solution, but it is known to be slow. In this paper we show that HC can be parallelized on a GPU, with significant speedups of up to 26 times on polynomial benchmarks. We also show that GPU-HC can be generically applied to a range of computer vision problems, including 4-view triangulation and trifocal pose estimation with unknown focal length, which cannot be solved with elimination templates but can be solved efficiently with HC. GPU-HC opens the door to easy formulation and solution of a range of computer vision problems.
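
To make the HC idea concrete, here is a minimal CPU sketch for a single univariate polynomial, assuming a straight-line homotopy H(x, t) = (1 - t) * gamma * (x^d - 1) + t * f(x) with a fixed complex constant `gamma`. It is not the GPU-HC implementation: there is no predictor step, no adaptive step size, and no multivariate support, and the names `poly_eval` and `track_roots` are hypothetical. What it does show is that every start root is tracked independently, which is the property that makes parallelization natural.

```python
import cmath


def poly_eval(coeffs, x):
    """Return (value, derivative) of a polynomial at x via Horner's rule.

    Coefficients are ordered from the highest degree to the constant term.
    """
    value, deriv = 0j, 0j
    for c in coeffs:
        deriv = deriv * x + value
        value = value * x + c
    return value, deriv


def track_roots(target, steps=100, newton_iters=3, gamma=complex(0.6, 0.8)):
    """Track the roots of x^d - 1 to the roots of `target` as t goes 0 -> 1."""
    degree = len(target) - 1
    start = [1.0] + [0.0] * (degree - 1) + [-1.0]  # start system g(x) = x^d - 1
    start_roots = [cmath.exp(2j * cmath.pi * k / degree) for k in range(degree)]
    tracked = []
    for x in start_roots:  # paths are independent of each other
        for step in range(1, steps + 1):
            t = step / steps
            # Newton corrector on H(., t), started from the previous point.
            for _ in range(newton_iters):
                g, dg = poly_eval(start, x)
                f, df = poly_eval(target, x)
                h = (1 - t) * gamma * g + t * f
                dh = (1 - t) * gamma * dg + t * df
                x = x - h / dh
        tracked.append(x)
    return tracked
```

For example, `track_roots([1, -3, 2])` follows two paths toward the roots of x^2 - 3x + 2. A fixed step count is a simplification; a practical tracker would adapt the step size and guard against near-singular points where `dh` approaches zero.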

Trifocal Relative Pose from Lines at Points and its Efficient Solution

Apr 16, 2019

Ricardo Fabbri, Timothy Duff, Hongyi Fan, Margaret Regan, David da Costa de Pinho, Elias Tsigaridas, Charles Wampler, Jonathan Hauenstein, Benjamin Kimia, Anton Leykin (+1 more)

Abstract: We present a new minimal problem for relative pose estimation, mixing point features with lines incident at points observed in three views, and its efficient homotopy continuation solver. We demonstrate the generality of the approach by analyzing and solving an additional problem with mixed point and line correspondences in three views. The minimal problems include correspondences of (i) three points and one line and (ii) three points and two lines through two of the points, which is reported and analyzed here for the first time. These are difficult to solve, as they have 216 and, as shown here, 312 solutions, but cover important practical situations when line and point features appear together, e.g., in urban scenes or when observing curves. We demonstrate that even such difficult problems can be solved robustly using a suitable homotopy continuation technique and we provide an implementation optimized for minimal problems that can be integrated into engineering applications. Our simulated and real experiments demonstrate our solvers in the camera geometry computation task in structure from motion. We show that the new solvers allow for reconstructing challenging scenes where the standard two-view initialization of structure from motion fails.

* This material is based upon work supported by the National Science Foundation under Grant No. DMS-1439786 while most authors were in residence at Brown University's Institute for Computational and Experimental Research in Mathematics -- ICERM, in Providence, RI
