Kamil Żywanowski

Real-Time Onboard Object Detection for Augmented Reality: Enhancing Head-Mounted Display with YOLOv8

Jun 06, 2023

Mikołaj Łysakowski, Kamil Żywanowski, Adam Banaszczyk, Michał R. Nowicki, Piotr Skrzypczyński, Sławomir K. Tadeja

Abstract: This paper introduces a software architecture for real-time object detection using machine learning (ML) in an augmented reality (AR) environment. Our approach uses the recent state-of-the-art YOLOv8 network, which runs onboard the Microsoft HoloLens 2 head-mounted display (HMD). The primary motivation behind this research is to enable the application of advanced ML models for enhanced perception and situational awareness on a wearable, hands-free AR platform. We show the image processing pipeline for the YOLOv8 model and the techniques used to make it run in real time on the resource-limited edge computing platform of the headset. The experimental results demonstrate that our solution achieves real-time processing without offloading tasks to the cloud or any other external server, while retaining satisfactory accuracy in terms of the usual mAP metric and the measured qualitative performance.

MinkLoc3D-SI: 3D LiDAR place recognition with sparse convolutions, spherical coordinates, and intensity

Dec 27, 2021

Kamil Żywanowski, Adam Banaszczyk, Michał R. Nowicki, Jacek Komorowski

Abstract: 3D LiDAR place recognition aims to estimate a coarse localization in a previously seen environment based on a single scan from a rotating 3D LiDAR sensor. The existing solutions to this problem include hand-crafted point cloud descriptors (e.g., ScanContext, M2DP, LiDAR IRIS) and deep learning-based solutions (e.g., PointNetVLAD, PCAN, LPDNet, DAGC, MinkLoc3D), which are often only evaluated on accumulated 2D scans from the Oxford RobotCar dataset. We introduce MinkLoc3D-SI, a sparse convolution-based solution that utilizes spherical coordinates of 3D points and processes the intensity of 3D LiDAR measurements, improving the performance when a single 3D LiDAR scan is used. Our method integrates the improvements typical for hand-crafted descriptors (like ScanContext) with the most efficient 3D sparse convolutions (MinkLoc3D). Our experiments show improved results on single scans from 3D LiDARs (USyd Campus dataset) and great generalization ability (KITTI dataset). Using intensity information on accumulated 2D scans (RobotCar Intensity dataset) improves the performance, even though the spherical representation does not produce a noticeable improvement. As a result, MinkLoc3D-SI is suited for single scans obtained from a 3D LiDAR, making it applicable in autonomous vehicles.
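
To make the "spherical coordinates plus intensity" idea concrete, here is a minimal sketch of converting LiDAR points from Cartesian (x, y, z, intensity) to (r, theta, phi, intensity). It is not the authors' implementation: the function names, the angle conventions (theta as azimuth, phi as elevation above the horizontal plane), and the use of radians are assumptions made for illustration, and the quantization step that a sparse convolutional network would need is omitted.

```python
import math


def cartesian_to_spherical(x, y, z):
    """Convert one Cartesian point to (r, theta, phi).

    Assumed convention (not taken from the paper):
      r     - distance from the sensor origin
      theta - azimuth in the horizontal plane, radians, from atan2(y, x)
      phi   - elevation above the horizontal plane, radians
    """
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(y, x)
    phi = math.atan2(z, math.hypot(x, y))
    return r, theta, phi


def to_spherical_with_intensity(points):
    """Map an iterable of (x, y, z, intensity) to a list of (r, theta, phi, intensity).

    The intensity value is carried through unchanged as an extra feature.
    """
    return [cartesian_to_spherical(x, y, z) + (intensity,)
            for x, y, z, intensity in points]


if __name__ == "__main__":
    # Hypothetical sample points: (x, y, z, intensity)
    scan = [(1.0, 0.0, 0.0, 0.5), (0.0, 1.0, 0.0, 0.2), (0.0, 0.0, 2.0, 0.9)]
    for point in to_spherical_with_intensity(scan):
        print(point)
```

A representation like this follows the geometry of a rotating sensor, whose beams are spaced evenly in angle rather than in Cartesian space, which is one plausible reading of why the abstract reports gains on single 3D scans.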

Comparison of camera-based and 3D LiDAR-based loop closures across weather conditions

Sep 08, 2020

Kamil Żywanowski, Adam Banaszczyk, Michał Nowicki

Abstract: Loop closure based on camera images provides excellent results on benchmarking datasets but might struggle in real-world adverse conditions such as direct sun, rain, fog, or darkness at night. In automotive applications, the sensory setups include 3D LiDARs that provide information complementary to cameras. This article evaluates camera-based, LiDAR-based, and joint camera-LiDAR-based loop closures under varying weather conditions, applying a similar neural-network-based processing pipeline to each and using the newly available USyd dataset. The experiments, performed on the same trajectories in diverse weather conditions over 50 weeks, show that a 16-line 3D LiDAR can supplement image-based loop closure to increase loop closure performance. This indicates a need for more research into loop closures performed with multi-sensory setups.

* Accepted for ICARCV conference
