Abstract:Recent advances in robotics are pushing real-world autonomy, enabling robots to perform long-term and large-scale missions. A crucial component for successful missions is the incorporation of loop closures through place recognition, which effectively mitigates accumulated pose estimation drift. Despite computational advancements, optimizing performance for real-time deployment remains challenging, especially in resource-constrained mobile robots and multi-robot systems since, conventional keyframe sampling practices in place recognition often result in retaining redundant information or overlooking relevant data, as they rely on fixed sampling intervals or work directly in the 3D space instead of the feature space. To address these concerns, we introduce the concept of sample space in place recognition and demonstrate how different sampling techniques affect the query process and overall performance. We then present a novel keyframe sampling approach for LiDAR-based place recognition, which focuses on redundancy minimization and information preservation in the hyper-dimensional descriptor space. This approach is applicable to both learning-based and handcrafted descriptors, and through the experimental validation across multiple datasets and descriptor frameworks, we demonstrate the effectiveness of our proposed method, showing it can jointly minimize redundancy and preserve essential information in real-time. The proposed approach maintains robust performance across various datasets without requiring parameter tuning, contributing to more efficient and reliable place recognition for a wide range of robotic applications.
Abstract:This article studies the commonsense object affordance concept for enabling close-to-human task planning and task optimization of embodied robotic agents in urban environments. The focus of the object affordance is on reasoning how to effectively identify object's inherent utility during the task execution, which in this work is enabled through the analysis of contextual relations of sparse information of 3D scene graphs. The proposed framework develops a Correlation Information (CECI) model to learn probability distributions using a Graph Convolutional Network, allowing to extract the commonsense affordance for individual members of a semantic class. The overall framework was experimentally validated in a real-world indoor environment, showcasing the ability of the method to level with human commonsense. For a video of the article, showcasing the experimental demonstration, please refer to the following link: https://youtu.be/BDCMVx2GiQE
Abstract:Object detection and global localization play a crucial role in robotics, spanning across a great spectrum of applications from autonomous cars to multi-layered 3D Scene Graphs for semantic scene understanding. This article proposes BOX3D, a novel multi-modal and lightweight scheme for localizing objects of interest by fusing the information from RGB camera and 3D LiDAR. BOX3D is structured around a three-layered architecture, building up from the local perception of the incoming sequential sensor data to the global perception refinement that covers for outliers and the general consistency of each object's observation. More specifically, the first layer handles the low-level fusion of camera and LiDAR data for initial 3D bounding box extraction. The second layer converts each LiDAR's scan 3D bounding boxes to the world coordinate frame and applies a spatial pairing and merging mechanism to maintain the uniqueness of objects observed from different viewpoints. Finally, BOX3D integrates the third layer that supervises the consistency of the results on the global map iteratively, using a point-to-voxel comparison for identifying all points in the global map that belong to the object. Benchmarking results of the proposed novel architecture are showcased in multiple experimental trials on public state-of-the-art large-scale dataset of urban environments.
Abstract:In this article, a novel approach for merging 3D point cloud maps in the context of egocentric multi-robot exploration is presented. Unlike traditional methods, the proposed approach leverages state-of-the-art place recognition and learned descriptors to efficiently detect overlap between maps, eliminating the need for the time-consuming global feature extraction and feature matching process. The estimated overlapping regions are used to calculate a homogeneous rigid transform, which serves as an initial condition for the GICP point cloud registration algorithm to refine the alignment between the maps. The advantages of this approach include faster processing time, improved accuracy, and increased robustness in challenging environments. Furthermore, the effectiveness of the proposed framework is successfully demonstrated through multiple field missions of robot exploration in a variety of different underground environments.
Abstract:In the field of resource-constrained robots and the need for effective place recognition in multi-robotic systems, this article introduces RecNet, a novel approach that concurrently addresses both challenges. The core of RecNet's methodology involves a transformative process: it projects 3D point clouds into depth images, compresses them using an encoder-decoder framework, and subsequently reconstructs the range image, seamlessly restoring the original point cloud. Additionally, RecNet utilizes the latent vector extracted from this process for efficient place recognition tasks. This unique approach not only achieves comparable place recognition results but also maintains a compact representation, suitable for seamless sharing among robots to reconstruct their collective maps. The evaluation of RecNet encompasses an array of metrics, including place recognition performance, structural similarity of the reconstructed point clouds, and the bandwidth transmission advantages, derived from sharing only the latent vectors. This reconstructed map paves a groundbreaking way for exploring its usability in navigation, localization, map-merging, and other relevant missions. Our proposed approach is rigorously assessed using both a publicly available dataset and field experiments, confirming its efficacy and potential for real-world applications.
Abstract:This paper presents a framework addressing the challenge of global localization in autonomous mobile robotics by integrating LiDAR-based descriptors and Wi-Fi fingerprinting in a pre-mapped environment. This is motivated by the increasing demand for reliable localization in complex scenarios, such as urban areas or underground mines, requiring robust systems able to overcome limitations faced by traditional Global Navigation Satellite System (GNSS)-based localization methods. By leveraging the complementary strengths of LiDAR and Wi-Fi sensors used to generate predictions and evaluate the confidence of each prediction as an indicator of potential degradation, we propose a redundancy-based approach that enhances the system's overall robustness and accuracy. The proposed framework allows independent operation of the LiDAR and Wi-Fi sensors, ensuring system redundancy. By combining the predictions while considering their confidence levels, we achieve enhanced and consistent performance in localization tasks.
Abstract:Change detection and irregular object extraction in 3D point clouds is a challenging task that is of high importance not only for autonomous navigation but also for updating existing digital twin models of various industrial environments. This article proposes an innovative approach for change detection in 3D point clouds using deep learned place recognition descriptors and irregular object extraction based on voxel-to-point comparison. The proposed method first aligns the bi-temporal point clouds using a map-merging algorithm in order to establish a common coordinate frame. Then, it utilizes deep learning techniques to extract robust and discriminative features from the 3D point cloud scans, which are used to detect changes between consecutive point cloud frames and therefore find the changed areas. Finally, the altered areas are sampled and compared between the two time instances to extract any obstructions that caused the area to change. The proposed method was successfully evaluated in real-world field experiments, where it was able to detect different types of changes in 3D point clouds, such as object or muck-pile addition and displacement, showcasing the effectiveness of the approach. The results of this study demonstrate important implications for various applications, including safety and security monitoring in construction sites, mapping and exploration and suggests potential future research directions in this field.
Abstract:Algorithms for autonomous navigation in environments without Global Navigation Satellite System (GNSS) coverage mainly rely on onboard perception systems. These systems commonly incorporate sensors like cameras and LiDARs, the performance of which may degrade in the presence of aerosol particles. Thus, there is a need of fusing acquired data from these sensors with data from RADARs which can penetrate through such particles. Overall, this will improve the performance of localization and collision avoidance algorithms under such environmental conditions. This paper introduces a multimodal dataset from the harsh and unstructured underground environment with aerosol particles. A detailed description of the onboard sensors and the environment, where the dataset is collected are presented to enable full evaluation of acquired data. Furthermore, the dataset contains synchronized raw data measurements from all onboard sensors in Robot Operating System (ROS) format to facilitate the evaluation of navigation, and localization algorithms in such environments. In contrast to the existing datasets, the focus of this paper is not only to capture both temporal and spatial data diversities but also to present the impact of harsh conditions on captured data. Therefore, to validate the dataset, a preliminary comparison of odometry from onboard LiDARs is presented.
Abstract:This article presents a 3D point cloud map-merging framework for egocentric heterogeneous multi-robot exploration, based on overlap detection and alignment, that is independent of a manual initial guess or prior knowledge of the robots' poses. The novel proposed solution utilizes state-of-the-art place recognition learned descriptors, that through the framework's main pipeline, offer a fast and robust region overlap estimation, hence eliminating the need for the time-consuming global feature extraction and feature matching process that is typically used in 3D map integration. The region overlap estimation provides a homogeneous rigid transform that is applied as an initial condition in the point cloud registration algorithm Fast-GICP, which provides the final and refined alignment. The efficacy of the proposed framework is experimentally evaluated based on multiple field multi-robot exploration missions in underground environments, where both ground and aerial robots are deployed, with different sensor configurations.
Abstract:Current global re-localization algorithms are built on top of localization and mapping methods and heavily rely on scan matching and direct point cloud feature extraction and therefore are vulnerable in featureless demanding environments like caves and tunnels. In this article, we propose a novel global re-localization framework that: a) does not require an initial guess, like most methods do, while b) it has the capability to offer the top-k candidates to choose from and last but not least provides an event-based re-localization trigger module for enabling, and c) supporting completely autonomous robotic missions. With the focus on subterranean environments with low features, we opt to use descriptors based on range images from 3D LiDAR scans in order to maintain the depth information of the environment. In our novel approach, we make use of a state-of-the-art data-driven descriptor extraction framework for place recognition and orientation regression and enhance it with the addition of a junction detection module that also utilizes the descriptors for classification purposes.