Jong-Chan Kim

Demand Layering for Real-Time DNN Inference with Minimized Memory Usage

Oct 08, 2022

Mingoo Ji, Saehanseul Yi, Changjin Koo, Sol Ahn, Dongjoo Seo, Nikil Dutt, Jong-Chan Kim

Abstract: When executing a deep neural network (DNN), its model parameters are loaded into GPU memory before execution, incurring a significant GPU memory burden. There are studies that reduce GPU memory usage by exploiting CPU memory as a swap device. However, this approach is not applicable in most embedded systems with integrated GPUs where CPU and GPU share a common memory. In this regard, we present Demand Layering, which employs a fast solid-state drive (SSD) as a co-running partner of a GPU and exploits the layer-by-layer execution of DNNs. In our approach, a DNN is loaded and executed in a layer-by-layer manner, minimizing the memory usage to the order of a single layer. Also, we developed a pipeline architecture that hides most additional delays caused by the interleaved parameter loadings alongside layer executions. Our implementation shows a 96.5% memory reduction with just 14.8% delay overhead on average for representative DNNs. Furthermore, by exploiting the memory-delay tradeoff, near-zero delay overhead (under 1 ms) can be achieved with a slightly increased memory usage (still an 88.4% reduction), showing the great potential of Demand Layering.
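The pipelining idea in this abstract (load the next layer's parameters while the current layer executes) can be illustrated with a minimal Python sketch. This is not the authors' implementation: `run_layered`, `load_layer`, `run_layer`, and `prefetch_depth` are hypothetical names, and a background thread with a bounded queue stands in for the SSD-to-GPU pipeline.

```python
import queue
import threading


def run_layered(layer_ids, load_layer, run_layer, x, prefetch_depth=1):
    """Run layers in order while a background thread loads upcoming ones.

    load_layer(layer_id) -> params   (hypothetical: e.g. read from storage)
    run_layer(params, x) -> x        (hypothetical: execute one layer)
    The bounded queue caps how many loaded-but-unused layers are held,
    so a larger prefetch_depth trades memory for less waiting.
    """
    layer_ids = list(layer_ids)
    buf = queue.Queue(maxsize=prefetch_depth)

    def loader():
        for lid in layer_ids:
            # put() blocks while the buffer is full, throttling the loader.
            buf.put(load_layer(lid))

    worker = threading.Thread(target=loader, daemon=True)
    worker.start()

    for _ in layer_ids:
        params = buf.get()        # waits only if loading is the bottleneck
        x = run_layer(params, x)  # overlaps with the loader's next read
        del params                # drop the reference once the layer is done

    worker.join()
    return x


if __name__ == "__main__":
    weights = {0: 2, 1: 3, 2: 5}  # toy "parameters": one scalar per layer
    result = run_layered([0, 1, 2], weights.__getitem__, lambda w, v: w * v, 1)
    print(result)
```

In this sketch, raising `prefetch_depth` keeps more layers resident at once in exchange for fewer stalls, which mirrors the memory-delay tradeoff the abstract describes. Error handling in the loader thread is omitted for brevity.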

* 14 pages, 16 figures. Accepted to the 43rd IEEE Real-Time Systems Symposium (RTSS), 2022

Cyclops: Open Platform for Scale Truck Platooning

Mar 03, 2022

Hyeongyu Lee, Jaegeun Park, Changjin Koo, Jong-Chan Kim, Yongsoon Eun

Abstract: Cyclops, introduced in this paper, is an open research platform for anyone who wants to validate novel ideas and approaches in the area of self-driving heavy-duty vehicle platooning. The platform consists of multiple 1/14 scale semi-trailer trucks, a scale proving ground, and associated computing, communication and control modules that enable self-driving on the proving ground. The perception system of each vehicle is composed of a lidar-based object tracking system and a lane detection/control system. The former maintains the gap to the leading vehicle, and the latter keeps the vehicle within the lane by steering control. The lane detection system is optimized for truck platooning, where the field of view of the front-facing camera is severely limited due to the small gap to the leading vehicle. This platform is particularly well suited to validating mitigation strategies for safety-critical situations. Indeed, a simplex structure is adopted in the embedded module for testing various fail-safe operations. We illustrate a scenario where a camera sensor in the perception system fails but the vehicle operates at a reduced capacity until it comes to a graceful stop. Details of Cyclops, including 3D CAD designs and algorithm source code, are released for those who want to build similar testbeds.

Energy-Efficient Adaptive System Reconfiguration for Dynamic Deadlines in Autonomous Driving

Jun 03, 2021

Saehanseul Yi, Tae-Wook Kim, Jong-Chan Kim, Nikil Dutt

Abstract: The increasing computing demands of autonomous driving applications make energy optimizations critical for reducing battery capacity and vehicle weight. Current energy optimization methods typically target traditional real-time systems with static deadlines, resulting in conservative energy savings that are unable to exploit additional energy optimizations due to dynamic deadlines arising from the vehicle's change in velocity and driving context. We present an adaptive system optimization and reconfiguration approach that dynamically adapts the scheduling parameters and processor speeds to satisfy dynamic deadlines while consuming as little energy as possible. Our experimental results with an autonomous driving task set from Bosch and real-world driving data show energy reductions up to 46.4% on average in typical dynamic driving scenarios compared with traditional static energy optimization methods, demonstrating great potential for dynamic energy optimization gains by exploiting dynamic deadlines.
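As a rough illustration of why a looser deadline permits a lower processor speed, the sketch below picks the slowest speed level that still meets a given deadline. It is a generic frequency-scaling example, not the paper's algorithm: the function name, the normalized speed levels, and the assumption that execution time scales inversely with speed are all illustrative.

```python
def slowest_feasible_speed(wcet_at_max, deadline, speeds):
    """Return the lowest normalized speed (0 < s <= 1) that meets the deadline.

    wcet_at_max: worst-case execution time at full speed (s = 1.0).
    Assumes execution time scales as wcet_at_max / s, a common simplification.
    Returns None when even the fastest listed speed misses the deadline.
    """
    for s in sorted(speeds):
        if wcet_at_max / s <= deadline:
            return s
    return None


if __name__ == "__main__":
    levels = [0.4, 0.6, 0.8, 1.0]  # hypothetical speed levels
    # A relaxed deadline (e.g. at low vehicle speed) allows a slower setting.
    print(slowest_feasible_speed(30.0, 100.0, levels))
    # A tighter deadline forces a faster setting.
    print(slowest_feasible_speed(30.0, 60.0, levels))
```

Re-running such a selection whenever the deadline changes is the basic intuition behind exploiting dynamic deadlines; the paper's approach additionally adapts scheduling parameters, which this sketch does not model.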

* IEEE ISORC 2021

R-TOD: Real-Time Object Detector with Minimized End-to-End Delay for Autonomous Driving

Oct 23, 2020

Wonseok Jang, Hansaem Jeong, Kyungtae Kang, Nikil Dutt, Jong-Chan Kim

Abstract: For realizing safe autonomous driving, the end-to-end delays of real-time object detection systems should be thoroughly analyzed and minimized. However, despite the recent development of neural networks with minimized inference delays, surprisingly little attention has been paid to their end-to-end delays from an object's appearance until its detection is reported. With this motivation, this paper aims to provide a more comprehensive understanding of the end-to-end delay, through which precise best- and worst-case delay predictions are formulated, and three optimization methods are implemented: (i) on-demand capture, (ii) zero-slack pipeline, and (iii) contention-free pipeline. Our experimental results show a 76% reduction in the end-to-end delay of Darknet YOLO (You Only Look Once) v3 (from 1070 ms to 261 ms), thereby demonstrating the great potential of exploiting the end-to-end delay analysis for autonomous driving. Furthermore, as we only modify the system architecture and do not change the neural network architecture itself, our approach incurs no penalty on the detection accuracy.
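The reported reduction can be checked directly from the two delays quoted in the abstract. The helper name below is illustrative only.

```python
def relative_reduction(before, after):
    """Fractional reduction from `before` to `after`."""
    return (before - after) / before


if __name__ == "__main__":
    # Delays quoted in the abstract: 1070 ms before, 261 ms after.
    print(f"{relative_reduction(1070, 261):.1%}")  # 75.6%, i.e. about 76%
```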

* 14 pages, 16 figures. Accepted to the 41st IEEE Real-Time Systems Symposium (RTSS), 2020
