Abstract:Radar odometry estimation has emerged as a critical technique in the field of autonomous navigation, providing robust and reliable motion estimation under various environmental conditions. Despite its potential, the complex nature of radar signals and the inherent challenges associated with processing these signals have limited the widespread adoption of this technology. This paper aims to address these challenges by proposing novel improvements to an existing method for radar odometry estimation, designed to enhance accuracy and reliability in diverse scenarios. Our pipeline consists of filtering, motion compensation, oriented surface points computation, smoothing, one-to-many radar scan registration, and pose refinement. The developed method enforces local understanding of the scene, by adding additional information through smoothing techniques, and alignment of consecutive scans, as a refinement posterior to the one-to-many registration. We present an in-depth investigation of the contribution of each improvement to the localization accuracy, and we benchmark our system on the sequences of the main datasets for radar understanding, i.e., the Oxford Radar RobotCar, MulRan, and Boreas datasets. The proposed pipeline is able to achieve superior results, on all scenarios considered and under harsh environmental constraints.
Abstract:Loop Closure Detection (LCD) is an essential task in robotics and computer vision, serving as a fundamental component for various applications across diverse domains. These applications encompass object recognition, image retrieval, and video analysis. LCD consists in identifying whether a robot has returned to a previously visited location, referred to as a loop, and then estimating the related roto-translation with respect to the analyzed location. Despite the numerous advantages of radar sensors, such as their ability to operate under diverse weather conditions and provide a wider range of view compared to other commonly used sensors (e.g., cameras or LiDARs), integrating radar data remains an arduous task due to intrinsic noise and distortion. To address this challenge, this research introduces RadarLCD, a novel supervised deep learning pipeline specifically designed for Loop Closure Detection using the FMCW Radar (Frequency Modulated Continuous Wave) sensor. RadarLCD, a learning-based LCD methodology explicitly designed for radar systems, makes a significant contribution by leveraging the pre-trained HERO (Hybrid Estimation Radar Odometry) model. Being originally developed for radar odometry, HERO's features are used to select key points crucial for LCD tasks. The methodology undergoes evaluation across a variety of FMCW Radar dataset scenes, and it is compared to state-of-the-art systems such as Scan Context for Place Recognition and ICP for Loop Closure. The results demonstrate that RadarLCD surpasses the alternatives in multiple aspects of Loop Closure Detection.
Abstract:In the last years, Denoising Diffusion Probabilistic Models (DDPMs) obtained state-of-the-art results in many generative tasks, outperforming GANs and other classes of generative models. In particular, they reached impressive results in various image generation sub-tasks, among which conditional generation tasks such as text-guided image synthesis. Given the success of DDPMs in 2D generation, they have more recently been applied to 3D shape generation, outperforming previous approaches and reaching state-of-the-art results. However, 3D data pose additional challenges, such as the choice of the 3D representation, which impacts design choices and model efficiency. While reaching state-of-the-art results in generation quality, existing 3D DDPM works make little or no use of guidance, mainly being unconditional or class-conditional. In this paper, we present IC3D, the first Image-Conditioned 3D Diffusion model that generates 3D shapes by image guidance. It is also the first 3D DDPM model that adopts voxels as a 3D representation. To guide our DDPM, we present and leverage CISP (Contrastive Image-Shape Pre-training), a model jointly embedding images and shapes by contrastive pre-training, inspired by text-to-image DDPM works. Our generative diffusion model outperforms the state-of-the-art in 3D generation quality and diversity. Furthermore, we show that our generated shapes are preferred by human evaluators to a SoTA single-view 3D reconstruction model in terms of quality and coherence to the query image by running a side-by-side human evaluation.
Abstract:Real-time six degree-of-freedom pose estimation with ground vehicles represents a relevant and well studied topic in robotics, due to its many applications, such as autonomous driving and 3D mapping. Although some systems exist already, they are either not accurate or they struggle in real-time setting. In this paper, we propose a fast, accurate and modular LiDAR SLAM system for both batch and online estimation. We first apply downsampling and outlier removal, to filter out noise and reduce the size of the input point clouds. Filtered clouds are then used for pose tracking and floor detection, to ground-optimize the estimated trajectory. The availability of a pre-tracker, working in parallel with the filtering process, allows to obtain pre-computed odometries, to be used as aids when performing tracking. Efficient loop closure and pose optimization, achieved through a g2o pose graph, are the last steps of the proposed SLAM pipeline. We compare the performance of our system with state-of-the-art point cloud based methods, LOAM, LeGO-LOAM, A-LOAM, LeGO-LOAM-BOR and HDL, and show that the proposed system achieves equal or better accuracy and can easily handle even cases without loops. The comparison is done evaluating the estimated trajectory displacement using the KITTI and RADIATE datasets.