Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Topic:Point Cloud Registration

FACT: Multinomial Misalignment Classification for Point Cloud Registration

Apr 09, 2025

Ludvig Dillén, Per-Erik Forssén, Johan Edstedt

Abstract:We present FACT, a method for predicting alignment quality (i.e., registration error) of registered lidar point cloud pairs. This is useful e.g. for quality assurance of large, automatically registered 3D models. FACT extracts local features from a registered pair and processes them with a point transformer-based network to predict a misalignment class. We generalize prior work that study binary alignment classification of registration errors, by recasting it as multinomial misalignment classification. To achieve this, we introduce a custom regression-by-classification loss function that combines the cross-entropy and Wasserstein losses, and demonstrate that it outperforms both direct regression and prior binary classification. FACT successfully classifies point-cloud pairs registered with both the classical ICP and GeoTransformer, while other choices, such as standard point-cloud-quality metrics and registration residuals are shown to be poor choices for predicting misalignment. On a synthetically perturbed point-cloud task introduced by the CorAl method, we show that FACT achieves substantially better performance than CorAl. Finally, we demonstrate how FACT can assist experts in correcting misaligned point-cloud maps. Our code is available at https://github.com/LudvigDillen/FACT_for_PCMC.

* Accepted at SCIA 2025 (the Scandinavian Conference on Image Analysis 2025)

Via

Access Paper or Ask Questions

Investigating Vision-Language Model for Point Cloud-based Vehicle Classification

Apr 10, 2025

Yiqiao Li, Jie Wei, Camille Kamga

Abstract:Heavy-duty trucks pose significant safety challenges due to their large size and limited maneuverability compared to passenger vehicles. A deeper understanding of truck characteristics is essential for enhancing the safety perspective of cooperative autonomous driving. Traditional LiDAR-based truck classification methods rely on extensive manual annotations, which makes them labor-intensive and costly. The rapid advancement of large language models (LLMs) trained on massive datasets presents an opportunity to leverage their few-shot learning capabilities for truck classification. However, existing vision-language models (VLMs) are primarily trained on image datasets, which makes it challenging to directly process point cloud data. This study introduces a novel framework that integrates roadside LiDAR point cloud data with VLMs to facilitate efficient and accurate truck classification, which supports cooperative and safe driving environments. This study introduces three key innovations: (1) leveraging real-world LiDAR datasets for model development, (2) designing a preprocessing pipeline to adapt point cloud data for VLM input, including point cloud registration for dense 3D rendering and mathematical morphological techniques to enhance feature representation, and (3) utilizing in-context learning with few-shot prompting to enable vehicle classification with minimally labeled training data. Experimental results demonstrate encouraging performance of this method and present its potential to reduce annotation efforts while improving classification accuracy.

* 5 pages,3 figures, 1 table, CVPR DriveX workshop

Via

Access Paper or Ask Questions

A Pointcloud Registration Framework for Relocalization in Subterranean Environments

Apr 09, 2025

David Akhihiero, Jason N. Gross

Abstract:Relocalization, the process of re-establishing a robot's position within an environment, is crucial for ensuring accurate navigation and task execution when external positioning information, such as GPS, is unavailable or has been lost. Subterranean environments present significant challenges for relocalization due to limited external positioning information, poor lighting that affects camera localization, irregular and often non-distinct surfaces, and dust, which can introduce noise and occlusion in sensor data. In this work, we propose a robust, computationally friendly framework for relocalization through point cloud registration utilizing a prior point cloud map. The framework employs Intrinsic Shape Signatures (ISS) to select feature points in both the target and prior point clouds. The Fast Point Feature Histogram (FPFH) algorithm is utilized to create descriptors for these feature points, and matching these descriptors yields correspondences between the point clouds. A 3D transformation is estimated using the matched points, which initializes a Normal Distribution Transform (NDT) registration. The transformation result from NDT is further refined using the Iterative Closest Point (ICP) registration algorithm. This framework enhances registration accuracy even in challenging conditions, such as dust interference and significant initial transformations between the target and source, making it suitable for autonomous robots operating in underground mines and tunnels. This framework was validated with experiments in simulated and real-world mine datasets, demonstrating its potential for improving relocalization.

Via

Access Paper or Ask Questions

EasyREG: Easy Depth-Based Markerless Registration and Tracking using Augmented Reality Device for Surgical Guidance

Apr 13, 2025

Yue Yang, Christoph Leuze, Brian Hargreaves, Bruce Daniel, Fred Baik

Abstract:The use of Augmented Reality (AR) devices for surgical guidance has gained increasing traction in the medical field. Traditional registration methods often rely on external fiducial markers to achieve high accuracy and real-time performance. However, these markers introduce cumbersome calibration procedures and can be challenging to deploy in clinical settings. While commercial solutions have attempted real-time markerless tracking using the native RGB cameras of AR devices, their accuracy remains questionable for medical guidance, primarily due to occlusions and significant outliers between the live sensor data and the preoperative target anatomy point cloud derived from MRI or CT scans. In this work, we present a markerless framework that relies only on the depth sensor of AR devices and consists of two modules: a registration module for high-precision, outlier-robust target anatomy localization, and a tracking module for real-time pose estimation. The registration module integrates depth sensor error correction, a human-in-the-loop region filtering technique, and a robust global alignment with curvature-aware feature sampling, followed by local ICP refinement, for markerless alignment of preoperative models with patient anatomy. The tracking module employs a fast and robust registration algorithm that uses the initial pose from the registration module to estimate the target pose in real-time. We comprehensively evaluated the performance of both modules through simulation and real-world measurements. The results indicate that our markerless system achieves superior performance for registration and comparable performance for tracking to industrial solutions. The two-module design makes our system a one-stop solution for surgical procedures where the target anatomy moves or stays static during surgery.

Via

Access Paper or Ask Questions

Implementation of a Zed 2i Stereo Camera for High-Frequency Shoreline Change and Coastal Elevation Monitoring

Apr 08, 2025

José A. Pilartes-Congo, Matthew Kastl, Michael J. Starek, Marina Vicens-Miquel, Philippe Tissot

Abstract:The increasing population, thus financial interests, in coastal areas have increased the need to monitor coastal elevation and shoreline change. Though several resources exist to obtain this information, they often lack the required temporal resolution for short-term monitoring (e.g., every hour). To address this issue, this study implements a low-cost ZED 2i stereo camera system and close-range photogrammetry to collect images for generating 3D point clouds, digital surface models (DSMs) of beach elevation, and georectified imagery at a localized scale and high temporal resolution. The main contributions of this study are (i) intrinsic camera calibration, (ii) georectification and registration of acquired imagery and point cloud, (iii) generation of the DSM of the beach elevation, and (iv) a comparison of derived products against those from uncrewed aircraft system structure-from-motion photogrammetry. Preliminary results show that despite its limitations, the ZED 2i can provide the desired mapping products at localized and high temporal scales. The system achieved a mean reprojection error of 0.20 px, a point cloud registration of 27 cm, a vertical error of 37.56 cm relative to ground truth, and georectification root mean square errors of 2.67 cm and 2.81 cm for x and y.

* IGARSS 2023 - 2023 IEEE International Geoscience and Remote Sensing Symposium
* Published in IGARSS 2023 - 2023 IEEE International Geoscience and Remote Sensing Symposium

Via

Access Paper or Ask Questions

Bridge 2D-3D: Uncertainty-aware Hierarchical Registration Network with Domain Alignment

Apr 02, 2025

Zhixin Cheng, Jiacheng Deng, Xinjun Li, Baoqun Yin, Tianzhu Zhang

Abstract:The method for image-to-point cloud registration typically determines the rigid transformation using a coarse-to-fine pipeline. However, directly and uniformly matching image patches with point cloud patches may lead to focusing on incorrect noise patches during matching while ignoring key ones. Moreover, due to the significant differences between image and point cloud modalities, it may be challenging to bridge the domain gap without specific improvements in design. To address the above issues, we innovatively propose the Uncertainty-aware Hierarchical Matching Module (UHMM) and the Adversarial Modal Alignment Module (AMAM). Within the UHMM, we model the uncertainty of critical information in image patches and facilitate multi-level fusion interactions between image and point cloud features. In the AMAM, we design an adversarial approach to reduce the domain gap between image and point cloud. Extensive experiments and ablation studies on RGB-D Scene V2 and 7-Scenes benchmarks demonstrate the superiority of our method, making it a state-of-the-art approach for image-to-point cloud registration tasks.

* AAAI2025accept

Via

Access Paper or Ask Questions

Robust Human Registration with Body Part Segmentation on Noisy Point Clouds

Apr 04, 2025

Kai Lascheit, Daniel Barath, Marc Pollefeys, Leonidas Guibas, Francis Engelmann

Abstract:Registering human meshes to 3D point clouds is essential for applications such as augmented reality and human-robot interaction but often yields imprecise results due to noise and background clutter in real-world data. We introduce a hybrid approach that incorporates body-part segmentation into the mesh fitting process, enhancing both human pose estimation and segmentation accuracy. Our method first assigns body part labels to individual points, which then guide a two-step SMPL-X fitting: initial pose and orientation estimation using body part centroids, followed by global refinement of the point cloud alignment. Additionally, we demonstrate that the fitted human mesh can refine body part labels, leading to improved segmentation. Evaluations on the cluttered and noisy real-world datasets InterCap, EgoBody, and BEHAVE show that our approach significantly outperforms prior methods in both pose estimation and segmentation accuracy. Code and results are available on our project website: https://segfit.github.io

Via

Access Paper or Ask Questions

CaLiV: LiDAR-to-Vehicle Calibration of Arbitrary Sensor Setups via Object Reconstruction

Mar 31, 2025

Ilir Tahiraj, Markus Edinger, Dominik Kulmer, Markus Lienkamp

Abstract:In autonomous systems, sensor calibration is essential for a safe and efficient navigation in dynamic environments. Accurate calibration is a prerequisite for reliable perception and planning tasks such as object detection and obstacle avoidance. Many existing LiDAR calibration methods require overlapping fields of view, while others use external sensing devices or postulate a feature-rich environment. In addition, Sensor-to-Vehicle calibration is not supported by the vast majority of calibration algorithms. In this work, we propose a novel target-based technique for extrinsic Sensor-to-Sensor and Sensor-to-Vehicle calibration of multi-LiDAR systems called CaLiV. This algorithm works for non-overlapping FoVs, as well as arbitrary calibration targets, and does not require any external sensing devices. First, we apply motion to produce FoV overlaps and utilize a simple unscented Kalman filter to obtain vehicle poses. Then, we use the Gaussian mixture model-based registration framework GMMCalib to align the point clouds in a common calibration frame. Finally, we reduce the task of recovering the sensor extrinsics to a minimization problem. We show that both translational and rotational Sensor-to-Sensor errors can be solved accurately by our method. In addition, all Sensor-to-Vehicle rotation angles can also be calibrated with high accuracy. We validate the simulation results in real-world experiments. The code is open source and available on https://github.com/TUMFTM/CaLiV.

Via

Access Paper or Ask Questions

MT-PCR: Leveraging Modality Transformation for Large-Scale Point Cloud Registration with Limited Overlap

Mar 17, 2025

Yilong Wu, Yifan Duan, Yuxi Chen, Xinran Zhang, Yedong Shen, Jianmin Ji, Yanyong Zhang, Lu Zhang

Abstract:Large-scale scene point cloud registration with limited overlap is a challenging task due to computational load and constrained data acquisition. To tackle these issues, we propose a point cloud registration method, MT-PCR, based on Modality Transformation. MT-PCR leverages a BEV capturing the maximal overlap information to improve the accuracy and utilizes images to provide complementary spatial features. Specifically, MT-PCR converts 3D point clouds to BEV images and eastimates correspondence by 2D image keypoints extraction and matching. Subsequently, the 2D correspondence estimates are then transformed back to 3D point clouds using inverse mapping. We have applied MT-PCR to Terrestrial Laser Scanning and Aerial Laser Scanning point cloud registration on the GrAco dataset, involving 8 low-overlap, square-kilometer scale registration scenarios. Experiments and comparisons with commonly used methods demonstrate that MT-PCR can achieve superior accuracy and robustness in large-scale scenes with limited overlap.

* 8 pages, 5 figures, ICRA2025

Via

Access Paper or Ask Questions

R2LDM: An Efficient 4D Radar Super-Resolution Framework Leveraging Diffusion Model

Mar 21, 2025

Boyuan Zheng, Shouyi Lu, Renbo Huang, Minqing Huang, Fan Lu, Wei Tian, Guirong Zhuo, Lu Xiong

Abstract:We introduce R2LDM, an innovative approach for generating dense and accurate 4D radar point clouds, guided by corresponding LiDAR point clouds. Instead of utilizing range images or bird's eye view (BEV) images, we represent both LiDAR and 4D radar point clouds using voxel features, which more effectively capture 3D shape information. Subsequently, we propose the Latent Voxel Diffusion Model (LVDM), which performs the diffusion process in the latent space. Additionally, a novel Latent Point Cloud Reconstruction (LPCR) module is utilized to reconstruct point clouds from high-dimensional latent voxel features. As a result, R2LDM effectively generates LiDAR-like point clouds from paired raw radar data. We evaluate our approach on two different datasets, and the experimental results demonstrate that our model achieves 6- to 10-fold densification of radar point clouds, outperforming state-of-the-art baselines in 4D radar point cloud super-resolution. Furthermore, the enhanced radar point clouds generated by our method significantly improve downstream tasks, achieving up to 31.7% improvement in point cloud registration recall rate and 24.9% improvement in object detection accuracy.

Via

Access Paper or Ask Questions

Topic:Point Cloud Registration

Papers and Code