Marek Gorgon

LiFT: Lightweight, FPGA-tailored 3D object detection based on LiDAR data

Jan 19, 2025

Konrad Lis, Tomasz Kryjak, Marek Gorgon

Abstract: This paper presents LiFT, a lightweight, fully quantized 3D object detection algorithm for LiDAR data, optimized for real-time inference on FPGA platforms. Through an in-depth analysis of FPGA-specific limitations, we identify a set of FPGA-induced constraints that shape the algorithm's design. These include a computational complexity limit of 30 GMACs (billion multiply-accumulate operations), INT8 quantization for weights and activations, 2D cell-based processing instead of 3D voxels, and minimal use of skip connections. To meet these constraints while maximizing performance, LiFT combines novel mechanisms with state-of-the-art techniques such as reparameterizable convolutions and a fully sparse architecture. Key innovations include the Dual-bound Pillar Feature Net, which boosts performance without increasing complexity, and an efficient scheme for INT8 quantization of input features. With a computational cost of just 20.73 GMACs, LiFT stands out as one of the few algorithms targeting minimal-complexity 3D object detection. Among comparable methods, LiFT ranks first, achieving an mAP of 51.84% and an NDS of 61.01% on the challenging NuScenes validation dataset. The code will be available at https://github.com/vision-agh/lift.

* The paper has been accepted for the DASIP 2025 workshop in conjunction with the HiPEAC 2025 conference in Barcelona
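
The INT8 constraint named in the LiFT abstract can be illustrated with a generic symmetric, per-tensor quantizer. This is only a minimal sketch of the general idea: the abstract does not describe LiFT's actual scheme for input features, so the function names, the choice of a symmetric scale, and the sample values below are all assumptions made for illustration.

```python
def symmetric_scale(values):
    """Pick a per-tensor scale so the largest magnitude maps to 127 (assumed scheme)."""
    max_abs = max(abs(v) for v in values)
    return max_abs / 127 if max_abs > 0 else 1.0


def quantize_int8(values, scale):
    """Map floats to integer codes clamped to the signed 8-bit range [-128, 127]."""
    codes = []
    for v in values:
        q = round(v / scale)
        codes.append(max(-128, min(127, q)))
    return codes


def dequantize_int8(codes, scale):
    """Recover approximate float values from the INT8 codes."""
    return [c * scale for c in codes]


# Hypothetical feature values, not taken from the paper.
features = [0.0, 0.5, -1.0, 2.54]
scale = symmetric_scale(features)
codes = quantize_int8(features, scale)
restored = dequantize_int8(codes, scale)
```

For values inside the clamping range, the round-trip error of such a quantizer is bounded by half the scale, which is why the choice of scale matters for wide-ranging inputs such as point coordinates.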

PowerYOLO: Mixed Precision Model for Hardware Efficient Object Detection with Event Data

Jul 11, 2024

Dominika Przewlocka-Rus, Tomasz Kryjak, Marek Gorgon

Abstract: The performance of object detection systems in automotive solutions must be as high as possible, with minimal response time and, due to the often battery-powered operation, low energy consumption. When designing such solutions, we therefore face challenges typical of embedded vision systems: the problem of fitting algorithms of high memory and computational complexity into small low-power devices. In this paper we propose PowerYOLO, a mixed-precision solution that targets three essential elements of such applications. First, we propose a system based on a Dynamic Vision Sensor (DVS), a novel sensor that offers low power requirements and operates well in conditions with variable illumination. These features may make event cameras preferable to frame cameras in some applications. Second, to ensure high accuracy and low memory and computational complexity, we propose to use 4-bit width Powers-of-Two (PoT) quantisation for convolution weights of the YOLO detector, with all other parameters quantised linearly. Finally, we exploit the PoT scheme and replace multiplication with bit-shifting to increase the efficiency of hardware acceleration of such a solution, with a special convolution-batch normalisation fusion scheme. The use of this specific sensor with PoT quantisation and special batch normalisation fusion leads to a unique system with an almost 8x reduction in memory complexity and vast computational simplifications, relative to a standard approach. This efficient system achieves a high accuracy of mAP 0.301 on the GEN1 DVS dataset, marking the new state-of-the-art for such a compressed model.

* The paper has been accepted for the 27th Euromicro Conference Series on Digital System Design (DSD) 2024
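
The PowerYOLO abstract's idea of replacing multiplication with bit-shifting can be sketched as follows. This is a hedged illustration, not the paper's implementation: the 4-bit layout (1 sign bit plus a 3-bit exponent), rounding in the log domain, the function names, and the restriction to non-negative integer activations are all assumptions.

```python
import math


def pot_quantize(weight, exp_bits=3):
    """Approximate weight as sign * 2**(-shift).

    Hypothetical layout: 1 sign bit + exp_bits exponent bits, so shift
    lies in [0, 2**exp_bits - 1]. A zero weight is encoded with sign 0.
    """
    if weight == 0:
        return 0, 0
    sign = 1 if weight > 0 else -1
    max_shift = 2 ** exp_bits - 1
    shift = round(-math.log2(abs(weight)))  # nearest power of two in the log domain
    shift = max(0, min(max_shift, shift))
    return sign, shift


def pot_multiply(activation, sign, shift):
    """Multiply a non-negative integer activation by sign * 2**(-shift) using a right shift."""
    return sign * (activation >> shift)


# Hypothetical weights and activation, chosen only for illustration.
sign_a, shift_a = pot_quantize(0.23)   # 0.23 is closest to 2**-2 = 0.25
sign_b, shift_b = pot_quantize(-0.5)   # exactly -(2**-1)
out_a = pot_multiply(64, sign_a, shift_a)
out_b = pot_multiply(64, sign_b, shift_b)
```

In hardware, the shift replaces a multiplier, which is the source of the computational simplification the abstract refers to. Storing 4 bits per weight instead of 32 is also consistent with the reported reduction in memory of almost 8x.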

Optimisation of the PointPillars network for 3D object detection in point clouds

Jul 01, 2020

Joanna Stanisz, Konrad Lis, Tomasz Kryjak, Marek Gorgon

Abstract: In this paper we present our research on the optimisation of a deep neural network for 3D object detection in a point cloud. Techniques such as quantisation and pruning, available in the Brevitas and PyTorch tools, were used. We performed the experiments for the PointPillars network, which offers a reasonable compromise between detection accuracy and computational complexity. The aim of this work was to propose a variant of the network which we will ultimately implement in an FPGA device. This will allow for real-time LiDAR data processing with low energy consumption. The obtained results indicate that even a significant quantisation, from 32-bit floating point to 2-bit integer in the main part of the algorithm, results in only a 5%-9% decrease in detection accuracy, while allowing for an almost 16-fold reduction in the size of the model.

* 7 pages, 2 figures, submitted to SPA 2020 conference
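
The "almost 16-fold" figure in the PointPillars abstract follows from the bit widths: 32 / 2 = 16, and the ratio falls slightly short when a small part of the network stays at a higher precision. The sketch below works through that arithmetic. The parameter counts and the 8-bit width for the remaining layers are made-up assumptions, since the abstract gives neither.

```python
def model_size_bits(layer_params, layer_bits):
    """Total weight storage in bits, given per-layer parameter counts and bit widths."""
    return sum(n * b for n, b in zip(layer_params, layer_bits))


# Hypothetical parameter counts: small first and last layers, large main part.
params = [1_000, 4_000_000, 1_000]

fp32_bits = model_size_bits(params, [32, 32, 32])
# Assumed mixed precision: main part at 2 bits, the rest kept at 8 bits.
mixed_bits = model_size_bits(params, [8, 2, 8])

reduction = fp32_bits / mixed_bits
```

With these assumed counts the reduction comes out just under 16, which shows how a few higher-precision layers account for the "almost" in the abstract.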

Vision based hardware-software real-time control system for autonomous landing of an UAV

Apr 24, 2020

Krzysztof Blachut, Hubert Szolc, Mateusz Wasala, Tomasz Kryjak, Marek Gorgon

Abstract: In this paper we present a vision-based hardware-software control system enabling autonomous landing of a multirotor unmanned aerial vehicle (UAV). It allows the detection of a marked landing pad in real time in a 1280 x 720 @ 60 fps video stream. In addition, a LiDAR sensor is used to measure the altitude above ground. A heterogeneous Zynq SoC device is used as the computing platform. The solution was tested on a number of sequences and the landing pad was detected with 96% accuracy. This research shows that a reprogrammable heterogeneous computing system is a good solution for UAVs because it enables real-time data stream processing with relatively low energy consumption.

* 7 pages, 9 figures, submitted to MMAR 2020 conference
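
To give a sense of what real-time processing of the 1280 x 720 @ 60 fps stream in the UAV abstract implies, the short calculation below derives the active pixel rate and the time budget per frame. It is a back-of-the-envelope sketch that counts only visible pixels; blanking intervals of the actual video interface, which the abstract does not specify, are ignored.

```python
WIDTH, HEIGHT, FPS = 1280, 720, 60  # stream parameters quoted in the abstract

pixels_per_frame = WIDTH * HEIGHT            # active pixels in one frame
pixels_per_second = pixels_per_frame * FPS   # sustained rate the pipeline must handle
frame_budget_ms = 1000 / FPS                 # time available to process one frame

print(f"{pixels_per_second / 1e6:.3f} Mpixel/s, {frame_budget_ms:.2f} ms per frame")
```

A streaming design that consumes one pixel per clock cycle therefore needs a clock of at least about 55.3 MHz for the active pixels alone.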

Foreground object segmentation in RGB-D data implemented on GPU

Feb 01, 2020

Piotr Janus, Tomasz Kryjak, Marek Gorgon

Abstract: This paper presents a GPU implementation of two foreground object segmentation algorithms: Gaussian Mixture Model (GMM) and Pixel Based Adaptive Segmenter (PBAS), modified to support RGB-D data. The simultaneous use of colour (RGB) and depth (D) data improves segmentation accuracy, especially in cases of colour camouflage, illumination changes and shadows. Three GPUs were used to accelerate calculations: embedded NVIDIA Jetson TX2 (Maxwell architecture), mobile NVIDIA GeForce GTX 1050m (Pascal architecture) and efficient NVIDIA RTX 2070 (Turing architecture). Segmentation accuracy comparable to previously published works was obtained. Moreover, the use of a GPU platform enabled real-time image processing. In addition, the system has been adapted to work with two RGB-D sensors: RealSense D415 and D435 from Intel.

* 12 pages, 4 figures, submitted to KKA 2020 conference
