Abstract:This paper introduces a software architecture for real-time object detection using machine learning (ML) in an augmented reality (AR) environment. Our approach uses the recent state-of-the-art YOLOv8 network that runs onboard on the Microsoft HoloLens 2 head-mounted display (HMD). The primary motivation behind this research is to enable the application of advanced ML models for enhanced perception and situational awareness with a wearable, hands-free AR platform. We show the image processing pipeline for the YOLOv8 model and the techniques used to make it real-time on the resource-limited edge computing platform of the headset. The experimental results demonstrate that our solution achieves real-time processing without needing offloading tasks to the cloud or any other external servers while retaining satisfactory accuracy regarding the usual mAP metric and measured qualitative performance
Abstract:The 3D LiDAR place recognition aims to estimate a coarse localization in a previously seen environment based on a single scan from a rotating 3D LiDAR sensor. The existing solutions to this problem include hand-crafted point cloud descriptors (e.g., ScanContext, M2DP, LiDAR IRIS) and deep learning-based solutions (e.g., PointNetVLAD, PCAN, LPDNet, DAGC, MinkLoc3D), which are often only evaluated on accumulated 2D scans from the Oxford RobotCar dataset. We introduce MinkLoc3D-SI, a sparse convolution-based solution that utilizes spherical coordinates of 3D points and processes the intensity of 3D LiDAR measurements, improving the performance when a single 3D LiDAR scan is used. Our method integrates the improvements typical for hand-crafted descriptors (like ScanContext) with the most efficient 3D sparse convolutions (MinkLoc3D). Our experiments show improved results on single scans from 3D LiDARs (USyd Campus dataset) and great generalization ability (KITTI dataset). Using intensity information on accumulated 2D scans (RobotCar Intensity dataset) improves the performance, even though spherical representation doesn't produce a noticeable improvement. As a result, MinkLoc3D-SI is suited for single scans obtained from a 3D LiDAR, making it applicable in autonomous vehicles.
Abstract:Loop closure based on camera images provides excellent results on benchmarking datasets, but might struggle in real-world adverse weather conditions like direct sun, rain, fog, or just darkness at night. In automotive applications, the sensory setups include 3D LiDARs that provide information complementary to cameras. The presented article focuses on the evaluation of camera-based, LiDAR-based, and joint camera-LiDAR-based loop closures applying a similar processing pipeline consisting of a neural network under varying weather conditions using the newly available USyd dataset. The experiments performed on the same trajectories in diverse weather conditions over 50 weeks prove that a 16-line 3D LiDAR can be used to supplement image-based loop closure to increase loop closure performance. This proves that there is a need for more research into loop closures performed with multi-sensory setups.