Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:Compact Multi-level Sparse Neural Networks with Input Independent Dynamic Rerouting

Dec 21, 2021

Minghai Qin, Tianyun Zhang, Fei Sun, Yen-Kuang Chen, Makan Fardad, Yanzhi Wang, Yuan Xie

Figure 1 for Compact Multi-level Sparse Neural Networks with Input Independent Dynamic Rerouting

Figure 2 for Compact Multi-level Sparse Neural Networks with Input Independent Dynamic Rerouting

Figure 3 for Compact Multi-level Sparse Neural Networks with Input Independent Dynamic Rerouting

Figure 4 for Compact Multi-level Sparse Neural Networks with Input Independent Dynamic Rerouting

Share this with someone who'll enjoy it:

Abstract:Deep neural networks (DNNs) have shown to provide superb performance in many real life applications, but their large computation cost and storage requirement have prevented them from being deployed to many edge and internet-of-things (IoT) devices. Sparse deep neural networks, whose majority weight parameters are zeros, can substantially reduce the computation complexity and memory consumption of the models. In real-use scenarios, devices may suffer from large fluctuations of the available computation and memory resources under different environment, and the quality of service (QoS) is difficult to maintain due to the long tail inferences with large latency. Facing the real-life challenges, we propose to train a sparse model that supports multiple sparse levels. That is, a hierarchical structure of weights are satisfied such that the locations and the values of the non-zero parameters of the more-sparse sub-model area subset of the less-sparse sub-model. In this way, one can dynamically select the appropriate sparsity level during inference, while the storage cost is capped by the least sparse sub-model. We have verified our methodologies on a variety of DNN models and tasks, including the ResNet-50, PointNet++, GNMT, and graph attention networks. We obtain sparse sub-models with an average of 13.38% weights and 14.97% FLOPs, while the accuracies are as good as their dense counterparts. More-sparse sub-models with 5.38% weights and 4.47% of FLOPs, which are subsets of the less-sparse ones, can be obtained with only 3.25% relative accuracy loss.

View paper on

Share this with someone who'll enjoy it:

Title:Compact Multi-level Sparse Neural Networks with Input Independent Dynamic Rerouting

Paper and Code