Abstract: The Pseudo-LiDAR representation has significantly narrowed the gap between vision-based and active LiDAR-based 3D object detection. However, current research focuses almost exclusively on pushing the accuracy of Pseudo-LiDAR by leveraging complex and time-consuming neural networks, and seldom explores the deeper characteristics of the Pseudo-LiDAR representation for opportunities to improve it. In this paper, we dive deep into the Pseudo-LiDAR representation and argue that 3D object detection performance does not fully depend on high-precision stereo depth estimation. We demonstrate that even unreliable depth estimates, given proper data processing and refinement, can achieve comparable 3D object detection accuracy. Building on this finding, we further show that fast but less accurate stereo matching algorithms can be used in a Pseudo-LiDAR system to achieve low-latency responsiveness. In our experiments, we develop a system with a less powerful stereo matching predictor and adopt the proposed refinement schemes to improve accuracy. Evaluation on the KITTI benchmark shows that the presented system achieves accuracy competitive with state-of-the-art approaches at only 23 ms of computation, making it a suitable candidate for deployment in real in-vehicle applications.
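At the heart of any Pseudo-LiDAR pipeline is the conversion of an estimated depth map into a 3D point cloud via the pinhole camera model; for stereo, depth is first recovered from disparity as z = f·b/d. Below is a minimal sketch of this back-projection, assuming standard camera intrinsics (fx, fy, cx, cy); function and parameter names are illustrative, not taken from the paper:

```python
import numpy as np

def depth_to_pseudo_lidar(depth, fx, fy, cx, cy):
    """Back-project a dense depth map (H x W, meters) into an
    (H*W) x 3 pseudo-LiDAR point cloud in the camera frame."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    # Pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

def disparity_to_depth(disparity, fx, baseline, eps=1e-6):
    """Stereo depth from disparity: z = fx * baseline / d."""
    return fx * baseline / np.maximum(disparity, eps)
```

Note that this transform is a fixed, cheap operation, so end-to-end latency is dominated by the stereo matcher itself; this is what makes the accuracy/speed trade-off described above possible.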
Abstract: Over the last few years, deep neural networks (DNNs) have achieved great success in computer vision and other fields. However, performance and power constraints still make it challenging to deploy DNNs on mobile devices due to their high computational complexity. Binary neural networks (BNNs) have been demonstrated as a promising way to meet these constraints by using bit-wise operations to replace most arithmetic operations. Currently, existing GPU-accelerated implementations of BNNs are tailored only for desktop platforms. Due to architectural differences, merely porting such implementations to mobile devices yields suboptimal performance, or is impossible in some cases. In this paper, we propose PhoneBit, a GPU-accelerated BNN inference engine for Android-based mobile devices that fully exploits the computing power of BNNs on mobile GPUs. PhoneBit provides a set of operator-level optimizations, including a locality-friendly data layout, bit packing with vectorization, and layer integration for efficient binary convolution. We also provide detailed implementation and parallelization optimizations for PhoneBit to optimally utilize the memory bandwidth and computing power of mobile GPUs. We evaluate PhoneBit with binary versions of AlexNet, YOLOv2 Tiny, and VGG16. Our experimental results show that PhoneBit achieves significant speedup and energy efficiency compared with state-of-the-art frameworks for mobile devices.
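The bit-wise arithmetic that a BNN engine like PhoneBit accelerates reduces each floating-point multiply-accumulate to an XNOR followed by a popcount over packed sign bits. Here is a minimal CPU-side sketch of that idea in NumPy; the packing layout and names are illustrative assumptions, not PhoneBit's actual GPU kernels:

```python
import numpy as np

def binarize_and_pack(x):
    """Map a float vector to sign bits (1 for >= 0, 0 for < 0)
    and pack them 8-per-byte into a uint8 array."""
    return np.packbits((x >= 0).astype(np.uint8))

def xnor_popcount_dot(a_packed, b_packed, n):
    """Dot product of two {+1, -1} vectors of length n from their
    packed sign bits: dot = 2 * (number of agreeing bits) - n."""
    agree = ~(a_packed ^ b_packed)                 # XNOR: 1 where signs match
    matches = int(np.unpackbits(agree)[:n].sum())  # popcount, ignoring pad bits
    return 2 * matches - n

# The bit-wise result equals the exact dot product of the sign vectors.
a, b = np.random.randn(64), np.random.randn(64)
sa, sb = np.where(a >= 0, 1, -1), np.where(b >= 0, 1, -1)
assert xnor_popcount_dot(binarize_and_pack(a), binarize_and_pack(b), 64) == sa @ sb
```

On a GPU, the same XNOR/popcount pair maps onto wide integer instructions over vectorized words, which is the source of the bit-level speedups such engines exploit.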