Abstract:Perception systems of autonomous vehicles are susceptible to occlusion, especially when examined from a vehicle-centric perspective. Such occlusion can lead to overlooked object detections, e.g., larger vehicles such as trucks or buses may create blind spots where cyclists or pedestrians could be obscured, accentuating the safety concerns associated with such perception system limitations. To mitigate these challenges, the vehicle-to-everything (V2X) paradigm suggests employing an infrastructure-side perception system (IPS) to complement autonomous vehicles with a broader perceptual scope. Nevertheless, the scarcity of real-world 3D infrastructure-side datasets constrains the advancement of V2X technologies. To bridge these gaps, this paper introduces a new 3D infrastructure-side collaborative perception dataset, abbreviated as inscope. Notably, InScope is the first dataset dedicated to addressing occlusion challenges by strategically deploying multiple-position Light Detection and Ranging (LiDAR) systems on the infrastructure side. Specifically, InScope encapsulates a 20-day capture duration with 303 tracking trajectories and 187,787 3D bounding boxes annotated by experts. Through analysis of benchmarks, four different benchmarks are presented for open traffic scenarios, including collaborative 3D object detection, multisource data fusion, data domain transfer, and 3D multiobject tracking tasks. Additionally, a new metric is designed to quantify the impact of occlusion, facilitating the evaluation of detection degradation ratios among various algorithms. The Experimental findings showcase the enhanced performance of leveraging InScope to assist in detecting and tracking 3D multiobjects in real-world scenarios, particularly in tracking obscured, small, and distant objects. The dataset and benchmarks are available at https://github.com/xf-zh/InScope.
Abstract:LiDAR odometry is one of the essential parts of LiDAR simultaneous localization and mapping (SLAM). However, existing LiDAR odometry tends to match a new scan simply iteratively with previous fixed-pose scans, gradually accumulating errors. Furthermore, as an effective joint optimization mechanism, bundle adjustment (BA) cannot be directly introduced into real-time odometry due to the intensive computation of large-scale global landmarks. Therefore, this letter designs a new strategy named a landmark map for bundle adjustment odometry (LMBAO) in LiDAR SLAM to solve these problems. First, BA-based odometry is further developed with an active landmark maintenance strategy for a more accurate local registration and avoiding cumulative errors. Specifically, this paper keeps entire stable landmarks on the map instead of just their feature points in the sliding window and deletes the landmarks according to their active grade. Next, the sliding window length is reduced, and marginalization is performed to retain the scans outside the window but corresponding to active landmarks on the map, greatly simplifying the computation and improving the real-time properties. In addition, experiments on three challenging datasets show that our algorithm achieves real-time performance in outdoor driving and outperforms state-of-the-art LiDAR SLAM algorithms, including Lego-LOAM and VLOM.
Abstract:Previous studies have shown the great potential of capsule networks for the spatial contextual feature extraction from {hyperspectral images (HSIs)}. However, the sampling locations of the convolutional kernels of capsules are fixed and cannot be adaptively changed according to the inconsistent semantic information of HSIs. Based on this observation, this paper proposes an adaptive spatial pattern capsule network (ASPCNet) architecture by developing an adaptive spatial pattern (ASP) unit, that can rotate the sampling location of convolutional kernels on the basis of an enlarged receptive field. Note that this unit can learn more discriminative representations of HSIs with fewer parameters. Specifically, two cascaded ASP-based convolution operations (ASPConvs) are applied to input images to learn relatively high-level semantic features, transmitting hierarchical structures among capsules more accurately than the use of the most fundamental features. Furthermore, the semantic features are fed into ASP-based conv-capsule operations (ASPCaps) to explore the shapes of objects among the capsules in an adaptive manner, further exploring the potential of capsule networks. Finally, the class labels of image patches centered on test samples can be determined according to the fully connected capsule layer. Experiments on three public datasets demonstrate that ASPCNet can yield competitive performance with higher accuracies than state-of-the-art methods.