Hsin-Hsuan Sung

BitGNN: Unleashing the Performance Potential of Binary Graph Neural Networks on GPUs

May 04, 2023

Jou-An Chen, Hsin-Hsuan Sung, Xipeng Shen, Sutanay Choudhury, Ang Li

Abstract: Recent studies have shown that binary Graph Neural Networks (GNNs) promise to reduce the computation cost of GNNs through binarized tensors. Prior work, however, has mainly focused on algorithm design and training techniques, leaving open how to fully realize that performance potential on accelerator hardware. This work redesigns the binary GNN inference backend from an efficiency perspective. It fills the gap by proposing a series of abstractions and techniques that map binary GNNs and their computations to fit the nature of bit manipulations on GPUs. Results on real-world graphs with GCNs, GraphSAGE, and GraphSAINT show that the proposed techniques outperform state-of-the-art binary GNN implementations by 8-22X while maintaining the same accuracy. BitGNN code is publicly available.

* To appear in the International Conference on Supercomputing (ICS '23)
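
The abstract refers to binarized tensors and bit manipulations without showing what such an operation looks like. As a rough illustration only, and not the BitGNN implementation, the sketch below shows the general XOR-and-popcount trick commonly used for dot products of +1/-1 vectors: each value is stored as one bit, and the dot product is recovered from the number of differing bits. The function names `pack_signs` and `binary_dot` are hypothetical and chosen for this example; a GPU backend would apply the same idea to machine words rather than Python integers.

```python
def pack_signs(values):
    """Pack a sequence of +1/-1 values into one integer, one bit per value.

    Assumption for this sketch: +1 is stored as bit 1 and -1 as bit 0.
    """
    word = 0
    for i, v in enumerate(values):
        if v > 0:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n.

    Matching positions contribute +1 and differing positions contribute -1,
    so the result is n - 2 * (number of differing bits).
    """
    differing = bin(a_bits ^ b_bits).count("1")  # popcount of the XOR
    return n - 2 * differing


a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]

packed = binary_dot(pack_signs(a), pack_signs(b), len(a))
reference = sum(x * y for x, y in zip(a, b))
print(packed, reference)
```

Both printed values agree, since the packed form only changes how the same sum is computed: one XOR and one bit count replace eight multiplications and additions.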

Enabling Level-4 Autonomous Driving on a Single $1k Off-the-Shelf Card

Oct 12, 2021

Hsin-Hsuan Sung, Yuanchao Xu, Jiexiong Guan, Wei Niu, Shaoshan Liu, Bin Ren, Yanzhi Wang, Xipeng Shen

Abstract: Autonomous driving is of great interest in both research and industry. High cost has been one of the major roadblocks slowing its development and adoption in practice. This paper shows, for the first time, that it is possible to run level-4 (i.e., fully autonomous driving) software on a single off-the-shelf card (Jetson AGX Xavier) costing less than $1k, an order of magnitude less than state-of-the-art systems, while meeting all latency requirements. The success comes from resolving several important issues shared by existing practices through a series of measures and innovations. The study overturns common perceptions of the computing resources required by level-4 autonomous driving, points out a promising path for the industry to lower costs, and suggests a number of research opportunities for rethinking the architecture, software design, and optimization of autonomous driving.

* under conference review

Achieving Real-Time LiDAR 3D Object Detection on a Mobile Device

Dec 26, 2020

Pu Zhao, Wei Niu, Geng Yuan, Yuxuan Cai, Hsin-Hsuan Sung, Wujie Wen, Sijia Liu, Xipeng Shen, Bin Ren, Yanzhi Wang (+1 more)

Abstract: 3D object detection is an important task, especially in autonomous driving. However, it is challenging to achieve real-time performance with the limited computation and memory resources of edge-computing devices in self-driving cars. To address this, we propose a compiler-aware unified framework that incorporates network enhancement and pruning search with reinforcement learning techniques to enable real-time inference of 3D object detection on resource-limited edge-computing devices. Specifically, a generator Recurrent Neural Network (RNN) automatically provides a unified scheme for both network enhancement and pruning search, without human expertise or assistance. The evaluated performance of each unified scheme can be fed back to train the generator RNN. The experimental results demonstrate that the proposed framework is the first to achieve real-time 3D object detection on a mobile device (Samsung Galaxy S20 phone) with competitive detection performance.
