Title: Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving

May 08, 2024

Lingdong Kong, Xiang Xu, Jiawei Ren, Wenwei Zhang, Liang Pan, Kai Chen, Wei Tsang Ooi, Ziwei Liu

Abstract: Efficient data utilization is crucial for advancing 3D scene understanding in autonomous driving, where reliance on heavily human-annotated LiDAR point clouds challenges fully supervised methods. Addressing this, our study extends into semi-supervised learning for LiDAR semantic segmentation, leveraging the intrinsic spatial priors of driving scenes and multi-sensor complements to augment the efficacy of unlabeled datasets. We introduce LaserMix++, an evolved framework that integrates laser beam manipulations from disparate LiDAR scans and incorporates LiDAR-camera correspondences to further assist data-efficient learning. Our framework is tailored to enhance 3D scene consistency regularization by incorporating multi-modality, including 1) multi-modal LaserMix operation for fine-grained cross-sensor interactions; 2) camera-to-LiDAR feature distillation that enhances LiDAR feature learning; and 3) language-driven knowledge guidance generating auxiliary supervisions using open-vocabulary models. The versatility of LaserMix++ enables applications across LiDAR representations, establishing it as a universally applicable solution. Our framework is rigorously validated through theoretical analysis and extensive experiments on popular driving perception datasets. Results demonstrate that LaserMix++ markedly outperforms fully supervised alternatives, achieving comparable accuracy with five times fewer annotations and significantly improving the supervised-only baselines. This substantial advancement underscores the potential of semi-supervised approaches in reducing the reliance on extensive labeled data in LiDAR-based 3D scene understanding systems.

* Preprint; 17 pages, 6 figures, 8 tables; Code at https://github.com/ldkong1205/LaserMix
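
The abstract mentions "laser beam manipulations from disparate LiDAR scans" without spelling out the operation. As a rough illustration only, one way such mixing could look is to split each scan into bands by inclination angle and interleave bands from two scans. The sketch below is not taken from the paper or its repository: the function names, the point layout `(x, y, z, tag)`, the number of bands, and the default angle range are all hypothetical choices made for this example.

```python
import math

def inclination(x, y, z):
    """Angle of a point above the sensor's horizontal plane, in radians."""
    return math.atan2(z, math.hypot(x, y))

def band_index(point, num_bands, phi_min, phi_max):
    """Map a point to one of num_bands equal-width inclination bands."""
    phi = inclination(point[0], point[1], point[2])
    frac = (phi - phi_min) / (phi_max - phi_min)
    # Clamp so points outside [phi_min, phi_max] fall into the edge bands.
    return min(num_bands - 1, max(0, int(frac * num_bands)))

def mix_by_inclination(scan_a, scan_b, num_bands=4,
                       phi_min=math.radians(-25.0),  # assumed lower field-of-view bound
                       phi_max=math.radians(3.0)):   # assumed upper field-of-view bound
    """Keep even-numbered bands from scan_a and odd-numbered bands from scan_b.

    Each scan is a list of (x, y, z, tag) tuples; the tag (for example a
    class label) travels with its point into the mixed scan.
    """
    from_a = [p for p in scan_a
              if band_index(p, num_bands, phi_min, phi_max) % 2 == 0]
    from_b = [p for p in scan_b
              if band_index(p, num_bands, phi_min, phi_max) % 2 == 1]
    return from_a + from_b
```

With four bands, the mixed scan takes the lowest and third bands from the first scan and the second and highest bands from the second scan. How LaserMix++ actually chooses its partitions, and how it combines them with camera data, is not described in the abstract; the linked repository is the place to check.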
