Topic:Loop Closure Detection
What is Loop Closure Detection? Loop closure detection is the process of identifying previously visited locations in robot navigation or SLAM systems.
Papers and Code
Apr 18, 2025
Abstract:We present an autonomous aerial surveillance platform, Veg, designed as a fault-tolerant quadcopter system that integrates visual SLAM for GPS-independent navigation, advanced control architecture for dynamic stability, and embedded vision modules for real-time object and face recognition. The platform features a cascaded control design with an LQR inner-loop and PD outer-loop trajectory control. It leverages ORB-SLAM3 for 6-DoF localization and loop closure, and supports waypoint-based navigation through Dijkstra path planning over SLAM-derived maps. A real-time Failure Detection and Identification (FDI) system detects rotor faults and executes emergency landing through re-routing. The embedded vision system, based on a lightweight CNN and PCA, enables onboard object detection and face recognition with high precision. The drone operates fully onboard using a Raspberry Pi 4 and Arduino Nano, validated through simulations and real-world testing. This work consolidates real-time localization, fault recovery, and embedded AI on a single platform suitable for constrained environments.
* 18 pages, 21 figures, 4 tables. Onboard processing using Raspberry Pi
4 and Arduino Nano. Includes ORB-SLAM3-based navigation, LQR control, rotor
fault recovery, object detection, and PCA face recognition. Real-world and
simulation tests included. Designed for GPS-denied autonomous UAV
surveillance
Via

Apr 11, 2025
Abstract:LiDAR loop closure detection (LCD) is crucial for consistent Simultaneous Localization and Mapping (SLAM) but faces challenges in robustness and accuracy. Existing methods, including semantic graph approaches, often suffer from coarse geometric representations and lack temporal robustness against noise, dynamics, and viewpoint changes. We introduce PNE-SGAN, a Probabilistic NDT-Enhanced Semantic Graph Attention Network, to overcome these limitations. PNE-SGAN enhances semantic graphs by using Normal Distributions Transform (NDT) covariance matrices as rich, discriminative geometric node features, processed via a Graph Attention Network (GAT). Crucially, it integrates graph similarity scores into a probabilistic temporal filtering framework (modeled as an HMM/Bayes filter), incorporating uncertain odometry for motion modeling and utilizing forward-backward smoothing to effectively handle ambiguities. Evaluations on challenging KITTI sequences (00 and 08) demonstrate state-of-the-art performance, achieving Average Precision of 96.2\% and 95.1\%, respectively. PNE-SGAN significantly outperforms existing methods, particularly in difficult bidirectional loop scenarios where others falter. By synergizing detailed NDT geometry with principled probabilistic temporal reasoning, PNE-SGAN offers a highly accurate and robust solution for LiDAR LCD, enhancing SLAM reliability in complex, large-scale environments.
Via

Apr 02, 2025
Abstract:For utilizing autonomous vehicle in urban areas a reliable localization is needed. Especially when HD maps are used, a precise and repeatable method has to be chosen. Therefore accurate map generation but also re-localization against these maps is necessary. Due to best 3D reconstruction of the surrounding, LiDAR has become a reliable modality for localization. The latest LiDAR odometry estimation are based on iterative closest point (ICP) approaches, namely KISS-ICP and SAGE-ICP. We extend the capabilities of KISS-ICP by incorporating semantic information into the point alignment process using a generalizable approach with minimal parameter tuning. This enhancement allows us to surpass KISS-ICP in terms of absolute trajectory error (ATE), the primary metric for map accuracy. Additionally, we improve the Cartographer mapping framework to handle semantic information. Cartographer facilitates loop closure detection over larger areas, mitigating odometry drift and further enhancing ATE accuracy. By integrating semantic information into the mapping process, we enable the filtering of specific classes, such as parked vehicles, from the resulting map. This filtering improves relocalization quality by addressing temporal changes, such as vehicles being moved.
Via

Mar 29, 2025
Abstract:Event cameras asynchronously output low-latency event streams, promising for state estimation in high-speed motion and challenging lighting conditions. As opposed to frame-based cameras, the motion-dependent nature of event cameras presents persistent challenges in achieving robust event feature detection and matching. In recent years, learning-based approaches have demonstrated superior robustness over traditional handcrafted methods in feature detection and matching, particularly under aggressive motion and HDR scenarios. In this paper, we propose SuperEIO, a novel framework that leverages the learning-based event-only detection and IMU measurements to achieve event-inertial odometry. Our event-only feature detection employs a convolutional neural network under continuous event streams. Moreover, our system adopts the graph neural network to achieve event descriptor matching for loop closure. The proposed system utilizes TensorRT to accelerate the inference speed of deep networks, which ensures low-latency processing and robust real-time operation on resource-limited platforms. Besides, we evaluate our method extensively on multiple public datasets, demonstrating its superior accuracy and robustness compared to other state-of-the-art event-based methods. We have also open-sourced our pipeline to facilitate research in the field: https://github.com/arclab-hku/SuperEIO.
Via

Mar 04, 2025
Abstract:Simultaneous Localization and Mapping (SLAM) allows mobile robots to navigate without external positioning systems or pre-existing maps. Radar is emerging as a valuable sensing tool, especially in vision-obstructed environments, as it is less affected by particles than lidars or cameras. Modern 4D imaging radars provide three-dimensional geometric information and relative velocity measurements, but they bring challenges, such as a small field of view and sparse, noisy point clouds. Detecting loop closures in SLAM is critical for reducing trajectory drift and maintaining map accuracy. However, the directional nature of 4D radar data makes identifying loop closures, especially from reverse viewpoints, difficult due to limited scan overlap. This article explores using 4D radar for loop closure in SLAM, focusing on similar and opposing viewpoints. We generate submaps for a denser environment representation and use introspective measures to reject false detections in feature-degenerate environments. Our experiments show accurate loop closure detection in geometrically diverse settings for both similar and opposing viewpoints, improving trajectory estimation with up to 82 % improvement in ATE and rejecting false positives in self-similar environments.
* This paper has been accepted for publication in the IEEE
International Conference on Robotics and Automation(ICRA), 2025
Via

Mar 06, 2025
Abstract:Place recognition is essential to maintain global consistency in large-scale localization systems. While research in urban environments has progressed significantly using LiDARs or cameras, applications in natural forest-like environments remain largely under-explored. Furthermore, forests present particular challenges due to high self-similarity and substantial variations in vegetation growth over time. In this work, we propose a robust LiDAR-based place recognition method for natural forests, ForestLPR. We hypothesize that a set of cross-sectional images of the forest's geometry at different heights contains the information needed to recognize revisiting a place. The cross-sectional images are represented by \ac{bev} density images of horizontal slices of the point cloud at different heights. Our approach utilizes a visual transformer as the shared backbone to produce sets of local descriptors and introduces a multi-BEV interaction module to attend to information at different heights adaptively. It is followed by an aggregation layer that produces a rotation-invariant place descriptor. We evaluated the efficacy of our method extensively on real-world data from public benchmarks as well as robotic datasets and compared it against the state-of-the-art (SOTA) methods. The results indicate that ForestLPR has consistently good performance on all evaluations and achieves an average increase of 7.38\% and 9.11\% on Recall@1 over the closest competitor on intra-sequence loop closure detection and inter-sequence re-localization, respectively, validating our hypothesis
* accepted by CVPR2025
Via

Feb 26, 2025
Abstract:This work introduces BEV-LIO(LC), a novel LiDAR-Inertial Odometry (LIO) framework that combines Bird's Eye View (BEV) image representations of LiDAR data with geometry-based point cloud registration and incorporates loop closure (LC) through BEV image features. By normalizing point density, we project LiDAR point clouds into BEV images, thereby enabling efficient feature extraction and matching. A lightweight convolutional neural network (CNN) based feature extractor is employed to extract distinctive local and global descriptors from the BEV images. Local descriptors are used to match BEV images with FAST keypoints for reprojection error construction, while global descriptors facilitate loop closure detection. Reprojection error minimization is then integrated with point-to-plane registration within an iterated Extended Kalman Filter (iEKF). In the back-end, global descriptors are used to create a KD-tree-indexed keyframe database for accurate loop closure detection. When a loop closure is detected, Random Sample Consensus (RANSAC) computes a coarse transform from BEV image matching, which serves as the initial estimate for Iterative Closest Point (ICP). The refined transform is subsequently incorporated into a factor graph along with odometry factors, improving the global consistency of localization. Extensive experiments conducted in various scenarios with different LiDAR types demonstrate that BEV-LIO(LC) outperforms state-of-the-art methods, achieving competitive localization accuracy. Our code, video and supplementary materials can be found at https://github.com/HxCa1/BEV-LIO-LC.
Via

Feb 26, 2025
Abstract:Visual SLAM is essential for mobile robots, drone navigation, and VR/AR, but traditional RGB camera systems struggle in low-light conditions, driving interest in thermal SLAM, which excels in such environments. However, thermal imaging faces challenges like low contrast, high noise, and limited large-scale annotated datasets, restricting the use of deep learning in outdoor scenarios. We present DarkSLAM, a noval deep learning-based monocular thermal SLAM system designed for large-scale localization and reconstruction in complex lighting conditions.Our approach incorporates the Efficient Channel Attention (ECA) mechanism in visual odometry and the Selective Kernel Attention (SKA) mechanism in depth estimation to enhance pose accuracy and mitigate thermal depth degradation. Additionally, the system includes thermal depth-based loop closure detection and pose optimization, ensuring robust performance in low-texture thermal scenes. Extensive outdoor experiments demonstrate that DarkSLAM significantly outperforms existing methods like SC-Sfm-Learner and Shin et al., delivering precise localization and 3D dense mapping even in challenging nighttime environments.
Via

Feb 25, 2025
Abstract:Works based on localization and mapping do not exploit the inherent semantic-relational information from the environment for faster and efficient management and optimization of the robot poses and its map elements, often leading to pose and map inaccuracies and computational inefficiencies in large scale environments. 3D scene graph representations which distributes the environment in an hierarchical manner can be exploited to enhance the management/optimization of underlying robot poses and its map. In this direction, we present our work Situational Graphs 2.0, which leverages the hierarchical structure of indoor scenes for efficient data management and optimization. Our algorithm begins by constructing a situational graph that organizes the environment into four layers: Keyframes, Walls, Rooms, and Floors. Our first novelty lies in the front-end which includes a floor detection module capable of identifying stairways and assigning a floor-level semantic-relations to the underlying layers. This floor-level semantic enables a floor-based loop closure strategy, rejecting false-positive loop closures in visually similar areas on different floors. Our second novelty is in exploiting the hierarchy for an improved optimization. It consists of: (1) local optimization, optimizing a window of recent keyframes and their connected components, (2) floor-global optimization, which focuses only on keyframes and their connections within the current floor during loop closures, and (3) room-local optimization, marginalizing redundant keyframes that share observations within the room. We validate our algorithm extensively in different real multi-floor environments. Our approach can demonstrate state-of-art-art results in large scale multi-floor environments creating hierarchical maps while bounding the computational complexity where several baseline works fail to execute efficiently.
* 8 pages, 9 figures, RAL submission
Via

Jan 31, 2025
Abstract:In this paper, we propose a novel loop closure detection algorithm that uses graph attention neural networks to encode semantic graphs to perform place recognition and then use semantic registration to estimate the 6 DoF relative pose constraint. Our place recognition algorithm has two key modules, namely, a semantic graph encoder module and a graph comparison module. The semantic graph encoder employs graph attention networks to efficiently encode spatial, semantic and geometric information from the semantic graph of the input point cloud. We then use self-attention mechanism in both node-embedding and graph-embedding steps to create distinctive graph vectors. The graph vectors of the current scan and a keyframe scan are then compared in the graph comparison module to identify a possible loop closure. Specifically, employing the difference of the two graph vectors showed a significant improvement in performance, as shown in ablation studies. Lastly, we implemented a semantic registration algorithm that takes in loop closure candidate scans and estimates the relative 6 DoF pose constraint for the LiDAR SLAM system. Extensive evaluation on public datasets shows that our model is more accurate and robust, achieving 13% improvement in maximum F1 score on the SemanticKITTI dataset, when compared to the baseline semantic graph algorithm. For the benefit of the community, we open-source the complete implementation of our proposed algorithm and custom implementation of semantic registration at https://github.com/crepuscularlight/SemanticLoopClosure
* Journal of Intelligent & Robotic Systems, 2025
Via
