Ignacio Alzugaray

Vision for Robotics Lab, ETH Zurich, Switzerland

LiDAR Loop Closure Detection using Semantic Graphs with Graph Attention Networks

Jan 31, 2025

Liudi Yang, Ruben Mascaro, Ignacio Alzugaray, Sai Manoj Prakhya, Marco Karrer, Ziyuan Liu, Margarita Chli

Abstract: In this paper, we propose a novel loop closure detection algorithm that uses graph attention neural networks to encode semantic graphs for place recognition and then uses semantic registration to estimate the 6 DoF relative pose constraint. Our place recognition algorithm has two key modules, namely, a semantic graph encoder module and a graph comparison module. The semantic graph encoder employs graph attention networks to efficiently encode spatial, semantic and geometric information from the semantic graph of the input point cloud. We then use a self-attention mechanism in both the node-embedding and graph-embedding steps to create distinctive graph vectors. The graph vectors of the current scan and a keyframe scan are then compared in the graph comparison module to identify a possible loop closure. Specifically, employing the difference of the two graph vectors yielded a significant improvement in performance, as shown in ablation studies. Lastly, we implemented a semantic registration algorithm that takes in loop closure candidate scans and estimates the relative 6 DoF pose constraint for the LiDAR SLAM system. Extensive evaluation on public datasets shows that our model is more accurate and robust, achieving a 13% improvement in maximum F1 score on the SemanticKITTI dataset compared to the baseline semantic graph algorithm. For the benefit of the community, we open-source the complete implementation of our proposed algorithm and our custom implementation of semantic registration at https://github.com/crepuscularlight/SemanticLoopClosure

* Journal of Intelligent & Robotic Systems, 2025
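
The abstract above reports results as a maximum F1 score. As a rough illustration of that metric only, the sketch below sweeps a decision threshold over similarity scores and keeps the best F1. The function name `max_f1`, the score convention (higher means "more likely a loop closure") and the toy data are assumptions made here for illustration; this is not the authors' evaluation code.

```python
def max_f1(scores, labels):
    """Return (best_f1, best_threshold) over all distinct score thresholds.

    scores: similarity score per candidate pair (assumed: higher = more similar)
    labels: 1 if the pair is a true loop closure, 0 otherwise
    """
    best_f1, best_thr = 0.0, None
    for thr in sorted(set(scores)):
        # A pair is predicted as a loop closure when its score reaches the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s < thr and y)
        if tp == 0:
            continue  # precision and recall are both zero, so F1 is zero
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_f1, best_thr = f1, thr
    return best_f1, best_thr


# Toy data, invented for illustration only.
f1, thr = max_f1([0.9, 0.8, 0.4, 0.3], [1, 1, 0, 1])
```

In this toy case the lowest threshold wins because accepting one false positive recovers the third true loop closure.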

Hyperion -- A fast, versatile symbolic Gaussian Belief Propagation framework for Continuous-Time SLAM

Jul 09, 2024

David Hug, Ignacio Alzugaray, Margarita Chli

Abstract: Continuous-Time Simultaneous Localization And Mapping (CTSLAM) has become a promising approach for fusing asynchronous and multi-modal sensor suites. Unlike discrete-time SLAM, which estimates poses discretely, CTSLAM uses continuous-time motion parametrizations, facilitating the integration of a variety of sensors such as rolling-shutter cameras, event cameras and Inertial Measurement Units (IMUs). However, CTSLAM approaches remain computationally demanding and are conventionally posed as centralized Non-Linear Least Squares (NLLS) optimizations. Targeting these limitations, we not only present the fastest SymForce-based [Martiros et al., RSS 2022] B- and Z-Spline implementations achieving speedups between 2.43x and 110.31x over Sommer et al. [CVPR 2020] but also implement a novel continuous-time Gaussian Belief Propagation (GBP) framework, coined Hyperion, which targets decentralized probabilistic inference across agents. We demonstrate the efficacy of our method in motion tracking and localization settings, complemented by empirical ablation studies.

* To be published in ECCV 2024
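
To make the idea of a continuous-time motion parametrization concrete, here is a minimal sketch of a uniform cubic B-spline evaluated at an arbitrary time. It handles plain Euclidean positions only, which is a simplifying assumption: it does not cover poses on Lie groups or Z-Splines, and it is not the Hyperion or SymForce implementation. The function names and the uniform spacing `dt` are hypothetical.

```python
def cubic_bspline(control_points, u):
    """Evaluate one uniform cubic B-spline segment at u in [0, 1).

    control_points: exactly four points, each a sequence of floats.
    """
    if len(control_points) != 4:
        raise ValueError("a cubic segment needs exactly four control points")
    # Standard uniform cubic B-spline basis functions; they sum to one.
    weights = (
        (1 - u) ** 3 / 6,
        (3 * u**3 - 6 * u**2 + 4) / 6,
        (-3 * u**3 + 3 * u**2 + 3 * u + 1) / 6,
        u**3 / 6,
    )
    dim = len(control_points[0])
    return [sum(w * p[d] for w, p in zip(weights, control_points)) for d in range(dim)]


def evaluate_trajectory(control_points, dt, t):
    """Position at time t for control points spaced dt apart (segment 0 starts at t = 0)."""
    segment = int(t // dt)
    if t < 0 or segment + 4 > len(control_points):
        raise ValueError("t lies outside the time range covered by the spline")
    u = t / dt - segment  # normalised time within the segment
    return cubic_bspline(control_points[segment:segment + 4], u)


# Control points on a straight line give a straight-line trajectory.
points = [[0.0], [1.0], [2.0], [3.0], [4.0]]
position = evaluate_trajectory(points, dt=1.0, t=1.25)
```

Because the trajectory can be queried at any `t`, measurements from sensors with different rates or timestamps can all be related to the same small set of control points.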

PixRO: Pixel-Distributed Rotational Odometry with Gaussian Belief Propagation

Jun 14, 2024

Ignacio Alzugaray, Riku Murai, Andrew Davison

Abstract: Visual sensors are not only becoming better at capturing high-quality images but have also steadily increased their on-chip processing capabilities. Yet the majority of visual odometry (VO) pipelines rely on the transmission and processing of full images in a centralized unit (e.g. CPU or GPU), which often contain much redundant and low-quality information for the task. In this paper, we address the task of frame-to-frame rotational estimation but, instead of reasoning about relative motion between frames using the full images, we distribute the estimation at pixel level. In this paradigm, each pixel produces an estimate of the global motion by relying only on local information and local message-passing with neighbouring pixels. The resulting per-pixel estimates can then be communicated to downstream tasks, yielding higher-level, informative cues instead of the original raw pixel readings. We evaluate the proposed approach on real public datasets, where we offer detailed insights about this novel technique and open-source our implementation for the future benefit of the community.

Distributed Simultaneous Localisation and Auto-Calibration using Gaussian Belief Propagation

Jan 26, 2024

Riku Murai, Ignacio Alzugaray, Paul H. J. Kelly, Andrew J. Davison

Abstract: We present a novel scalable, fully distributed, and online method for simultaneous localisation and extrinsic calibration for multi-robot setups. Individual a priori unknown robot poses are probabilistically inferred as robots sense each other while simultaneously calibrating their sensor and marker extrinsics using Gaussian Belief Propagation. In the presented experiments, we show how our method not only yields accurate robot localisation and auto-calibration but is also able to perform under challenging circumstances such as highly noisy measurements, significant communication failures or limited communication range.

* IEEE Robotics and Automation Letters (RA-L), vol. 9, no. 3, pp. 2136-2143, March 2024
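
For readers unfamiliar with Gaussian Belief Propagation, the following toy sketch runs synchronous GBP on scalar variables linked by relative measurements. It is only a minimal illustration under strong assumptions (one-dimensional states, linear factors of the form x_j - x_i = z, at least one prior anchoring the graph) and is not the method or code from the paper. The names `collect` and `gbp` are hypothetical.

```python
def collect(var, prior_info, msgs, skip=None):
    """Sum the prior and all factor-to-variable messages arriving at `var`,
    optionally leaving out the message sent by factor index `skip`."""
    eta, lam = prior_info.get(var, (0.0, 0.0))
    for (f, v), (e, l) in msgs.items():
        if v == var and f != skip:
            eta += e
            lam += l
    return eta, lam


def gbp(priors, factors, iterations=10):
    """Scalar GBP in information form (eta = mean / variance, lam = 1 / variance).

    priors:  {var: (mean, variance)}
    factors: list of (i, j, z, variance), each encoding x_j - x_i = z + noise
    Returns {var: (mean, variance)}; assumes every variable ends up constrained.
    """
    prior_info = {v: (m / s2, 1.0 / s2) for v, (m, s2) in priors.items()}
    msgs = {(f, v): (0.0, 0.0) for f, (i, j, _, _) in enumerate(factors) for v in (i, j)}
    for _ in range(iterations):
        new_msgs = {}
        for f, (i, j, z, s2) in enumerate(factors):
            a = 1.0 / s2
            eta_i, lam_i = collect(i, prior_info, msgs, skip=f)
            eta_j, lam_j = collect(j, prior_info, msgs, skip=f)
            # Marginalise x_i out of the factor to obtain the message for x_j.
            new_msgs[(f, j)] = (a * z + a * (eta_i - a * z) / (a + lam_i),
                                a - a * a / (a + lam_i))
            # Marginalise x_j out of the factor to obtain the message for x_i.
            new_msgs[(f, i)] = (-a * z + a * (eta_j + a * z) / (a + lam_j),
                                a - a * a / (a + lam_j))
        msgs = new_msgs
    beliefs = {}
    for v in sorted(set(priors) | {v for _, v in msgs}):
        eta, lam = collect(v, prior_info, msgs)
        beliefs[v] = (eta / lam, 1.0 / lam)
    return beliefs


# Three robots on a line: robot 0 has a prior, the others are only seen relatively.
beliefs = gbp({0: (0.0, 1.0)}, [(0, 1, 1.0, 1.0), (1, 2, 1.0, 1.0)])
```

Each update uses only quantities local to one factor and its two variables, which is what makes this style of inference amenable to distribution across robots. On a chain like this one the beliefs match the exact marginals, with uncertainty growing along the chain away from the prior.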

Fit-NGP: Fitting Object Models to Neural Graphics Primitives

Jan 04, 2024

Marwan Taher, Ignacio Alzugaray, Andrew J. Davison

Abstract: Accurate 3D object pose estimation is key to enabling many robotic applications that involve challenging object interactions. In this work, we show that the density field created by a state-of-the-art efficient radiance field reconstruction method is suitable for highly accurate and robust pose estimation for objects with known 3D models, even when they are very small and have challenging reflective surfaces. We present a fully automatic object pose estimation system based on a robot arm with a single wrist-mounted camera, which can scan a scene from scratch, and detect and estimate the 6-Degrees of Freedom (DoF) poses of multiple objects within a couple of minutes of operation. The poses of small objects such as bolts and nuts are estimated with accuracy on the order of 1 mm.

Dream2Real: Zero-Shot 3D Object Rearrangement with Vision-Language Models

Dec 07, 2023

Ivan Kapelyukh, Yifei Ren, Ignacio Alzugaray, Edward Johns

Abstract: We introduce Dream2Real, a robotics framework which integrates vision-language models (VLMs) trained on 2D data into a 3D object rearrangement pipeline. This is achieved by the robot autonomously constructing a 3D representation of the scene, where objects can be rearranged virtually and an image of the resulting arrangement rendered. These renders are evaluated by a VLM, so that the arrangement which best satisfies the user instruction is selected and recreated in the real world with pick-and-place. This enables language-conditioned rearrangement to be performed zero-shot, without needing to collect a training dataset of example arrangements. Results on a series of real-world tasks show that this framework is robust to distractors, controllable by language, capable of understanding complex multi-object relations, and readily applicable to both tabletop and 6-DoF rearrangement tasks.

* Project webpage with videos: https://www.robot-learning.uk/dream2real

Continuous-Time Gaussian Process Motion-Compensation for Event-vision Pattern Tracking with Distance Fields

Mar 05, 2023

Cedric Le Gentil, Ignacio Alzugaray, Teresa Vidal-Calleja

Abstract: This work addresses the issue of motion compensation and pattern tracking in event camera data. An event camera generates asynchronous streams of events triggered independently by each of the pixels upon changes in the observed intensity. Providing great advantages in low-light and rapid-motion scenarios, such unconventional data present significant research challenges as traditional vision algorithms are not directly applicable to this sensing modality. The proposed method decomposes the tracking problem into a local SE(2) motion-compensation step followed by a homography registration of small motion-compensated event batches. The first component relies on Gaussian Process (GP) theory to model the continuous occupancy field of the events in the image plane and embed the camera trajectory in the covariance kernel function. In doing so, estimating the trajectory is done similarly to GP hyperparameter learning by maximising the log marginal likelihood of the data. The continuous occupancy fields are turned into distance fields and used as templates for homography-based registration. By benchmarking the proposed method against other state-of-the-art techniques, we show that our open-source implementation performs high-accuracy motion compensation and produces high-quality tracks in real-world scenarios.

* Accepted for presentation at the 2023 IEEE International Conference on Robotics and Automation

IDOL: A Framework for IMU-DVS Odometry using Lines

Aug 13, 2020

Cedric Le Gentil, Florian Tschopp, Ignacio Alzugaray, Teresa Vidal-Calleja, Roland Siegwart, Juan Nieto

Abstract: In this paper, we introduce IDOL, an optimization-based framework for IMU-DVS Odometry using Lines. Event cameras, also called Dynamic Vision Sensors (DVSs), generate highly asynchronous streams of events triggered upon illumination changes for each individual pixel. This novel paradigm presents advantages in low illumination conditions and high-speed motions. Nonetheless, this unconventional sensing modality brings new challenges for scene reconstruction and motion estimation. The proposed method leverages a continuous-time representation of the inertial readings to associate each event with temporally accurate inertial data. The method's front-end extracts event clusters that belong to line segments in the environment, whereas the back-end estimates the system's trajectory alongside the lines' 3D position by minimizing point-to-line distances between individual events and the lines' projection in the image space. A novel attraction/repulsion mechanism is presented to accurately estimate the lines' extremities, avoiding their explicit detection in the event data. The proposed method is benchmarked against a state-of-the-art frame-based visual-inertial odometry framework using public datasets. The results show that IDOL performs at the same order of magnitude on most datasets and even shows better orientation estimates. These findings can have a great impact on new algorithms for DVS.

* Cedric Le Gentil and Florian Tschopp contributed equally to this work
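
The back-end described in the abstract above minimizes point-to-line distances between events and projected lines. The sketch below shows only that geometric residual for a 2D point and an infinite 2D line through two image points. The function names and the use of two endpoints to define the line are assumptions for illustration; the actual framework projects 3D lines into the image and optimizes them jointly with the trajectory, which is not shown here.

```python
import math


def point_to_line_distance(point, line_start, line_end):
    """Perpendicular distance from a 2D point to the infinite line through two 2D points."""
    (px, py), (x1, y1), (x2, y2) = point, line_start, line_end
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("line_start and line_end must be distinct")
    # Magnitude of the 2D cross product divided by the line direction's length.
    return abs(dx * (py - y1) - dy * (px - x1)) / length


def sum_squared_residuals(events, line_start, line_end):
    """Least-squares cost of a set of event pixel locations against one projected line."""
    return sum(point_to_line_distance(e, line_start, line_end) ** 2 for e in events)


# Two hypothetical event locations around a horizontal line.
cost = sum_squared_residuals([(0.0, 1.0), (0.0, -2.0)], (-1.0, 0.0), (1.0, 0.0))
```

Because the residual uses the infinite line rather than a segment, it says nothing about where the line ends, which is consistent with the abstract's need for a separate mechanism to estimate the extremities.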
