Guiying Li

Hardware-Aware DNN Compression for Homogeneous Edge Devices

Jan 25, 2025

Kunlong Zhang, Guiying Li, Ning Lu, Peng Yang, Ke Tang

Abstract: Deploying deep neural networks (DNNs) across homogeneous edge devices (devices with the same manufacturer-assigned SKU) often assumes that they all perform identically. However, once a device model is widely deployed, the performance of individual devices diverges after a period of use because of differences in user configuration, environmental conditions, manufacturing variance, battery degradation, and so on. Existing DNN compression methods do not consider this scenario and cannot guarantee good compression results on all homogeneous edge devices. To address this, we propose Homogeneous-Device Aware Pruning (HDAP), a hardware-aware DNN compression framework explicitly designed for homogeneous edge devices, which aims to achieve optimal average performance of the compressed model across all devices. Because hardware-aware evaluation on thousands or millions of homogeneous edge devices is prohibitively time-consuming, HDAP partitions the devices into several clusters, which dramatically reduces the number of devices to evaluate, and uses surrogate-based evaluation instead of real-time hardware evaluation. Experiments on ResNet50 and MobileNetV1 with the ImageNet dataset show that HDAP consistently achieves lower average inference latency than state-of-the-art methods, with substantial speedups (e.g., a 2.86$\times$ speedup at 1.0G FLOPs for ResNet50) on the homogeneous device clusters. HDAP offers an effective solution for scalable, high-performance DNN deployment on homogeneous edge devices.
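
The clustering idea in this abstract can be illustrated with a minimal sketch. It is not the HDAP implementation: plain k-means, the `profiles` array of per-device measurements, and the `measure` callback are all assumptions made for illustration. The point is only that one measurement per cluster, weighted by cluster size, can stand in for a measurement on every device.

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an illustrative choice); returns (centers, labels)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]

    def assign(c):
        # Label each device with the index of its nearest center.
        dists = np.linalg.norm(points[:, None, :] - c[None, :, :], axis=2)
        return dists.argmin(axis=1)

    for _ in range(iters):
        labels = assign(centers)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's center where it is
                centers[j] = points[labels == j].mean(axis=0)
    return centers, assign(centers)

def cluster_weighted_latency(profiles, k, measure):
    """Estimate fleet-wide average latency from one measurement per cluster.

    profiles: (n_devices, n_features) array describing each device (hypothetical).
    measure:  callback taking a device index and returning a latency (hypothetical).
    """
    profiles = np.asarray(profiles, dtype=float)
    centers, labels = kmeans(profiles, k)
    total = 0.0
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if members.size == 0:
            continue
        # The representative is the member closest to the cluster center.
        gaps = np.linalg.norm(profiles[members] - centers[j], axis=1)
        representative = members[gaps.argmin()]
        total += members.size * measure(representative)
    return total / len(profiles)
```

In this sketch the number of `measure` calls is at most `k`, regardless of how many devices there are; a surrogate model could replace `measure` in the same place.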

Reducing Idleness in Financial Cloud via Multi-objective Evolutionary Reinforcement Learning based Load Balancer

May 05, 2023

Peng Yang, Laoming Zhang, Haifeng Liu, Guiying Li

Abstract: In recent years, various companies have started to shift their data services from traditional data centers to the cloud. One of the major motivations is to save operating costs with the aid of cloud elasticity. This paper discusses an emerging need in financial services: reducing the number of idle servers that retain very few user connections, without disconnecting those users from the server side. The paper treats this need as a bi-objective online load balancing problem. A scalable neural-network-based policy is designed to route user requests to a varying number of servers for elasticity, and an evolutionary multi-objective training framework is proposed to optimize the weights of the policy. Not only is the new idleness objective reduced by over 130% more than with traditional industrial solutions, but the original load balancing objective is also slightly improved. Extensive simulations help reveal the detailed applicability of the proposed method to the emerging problem of reducing idleness in financial services.

* 17 pages, 13 figures
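
The bi-objective framing in this abstract rests on Pareto dominance: one policy is preferred over another only if it is no worse on both objectives and strictly better on at least one. The sketch below shows that comparison in isolation. It assumes both objectives (labeled here as idleness and load imbalance) are minimized and is not the training framework from the paper.

```python
def dominates(a, b):
    """True if score tuple a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def pareto_front(scores):
    """Keep only the score tuples that no other tuple dominates."""
    return [s for s in scores if not any(dominates(o, s) for o in scores)]

# Each tuple is (idleness, load_imbalance) for one candidate policy (made-up values).
candidates = [(3, 1), (1, 3), (2, 2), (3, 3)]
front = pareto_front(candidates)  # (3, 3) is dominated by (2, 2) and is dropped
```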

Enabling surrogate-assisted evolutionary reinforcement learning via policy embedding

Jan 31, 2023

Lan Tang, Xiaxi Li, Jinyuan Zhang, Guiying Li, Peng Yang, Ke Tang

Abstract: Evolutionary Reinforcement Learning (ERL), which applies Evolutionary Algorithms (EAs) to optimize the weight parameters of Deep Neural Network (DNN) based policies, has been widely regarded as an alternative to traditional reinforcement learning methods. However, evaluating the iteratively generated population usually requires a large amount of computational time and can be prohibitively expensive, which may restrict the applicability of ERL. Surrogates are often used to reduce the computational burden of evaluation in EAs. Unfortunately, in ERL, each individual policy usually represents millions of DNN weight parameters. This high-dimensional policy representation makes it very challenging to apply surrogates to ERL to speed up training. This paper proposes the PE-SAERL framework, which for the first time enables surrogate-assisted evolutionary reinforcement learning via policy embedding (PE). Empirical results on 5 Atari games show that the proposed method performs more efficiently than four state-of-the-art algorithms. Training is accelerated by up to 7x on the tested games compared to its counterpart without the surrogate and PE.

* This paper is submitted to bicta-2022
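
The general pattern described in this abstract (embed a high-dimensional policy into a low-dimensional space, fit a cheap surrogate there, and use it to decide which candidates deserve a real evaluation) can be sketched as follows. This is not PE-SAERL itself: the fixed random projection and the ridge-regression surrogate are stand-ins chosen for brevity, and all function names are hypothetical.

```python
import numpy as np

def make_embedding(dim, embed_dim, seed=0):
    """Return a function mapping weight vectors to a low-dimensional embedding.

    A fixed random projection is an assumption here, not the paper's embedding.
    """
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((dim, embed_dim)) / np.sqrt(embed_dim)
    return lambda weights: np.asarray(weights) @ proj

def fit_ridge(z, y, reg=1e-3):
    """Closed-form ridge regression: solve (Z^T Z + reg * I) coef = Z^T y."""
    gram = z.T @ z + reg * np.eye(z.shape[1])
    return np.linalg.solve(gram, z.T @ y)

def prescreen(candidates, embed, coef, keep):
    """Indices of the `keep` candidates with the highest predicted fitness."""
    predicted = embed(candidates) @ coef
    return np.argsort(predicted)[::-1][:keep]
```

Only the candidates returned by `prescreen` would then be evaluated in the real environment, and their true scores could be appended to the archive used to refit the surrogate.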

Stochastic Gradient Descent for Nonconvex Learning without Bounded Gradient Assumptions

Mar 10, 2019

Yunwen Lei, Ting Hu, Guiying Li, Ke Tang

Abstract: Stochastic gradient descent (SGD) is a popular and efficient method with wide applications in training deep neural nets and other nonconvex models. While the behavior of SGD is well understood in the convex learning setting, the existing theoretical results for SGD applied to nonconvex objective functions are far from mature. For example, existing results require a nontrivial assumption that gradients are uniformly bounded at all iterates encountered in the learning process, which is hard to verify in practical implementations. In this paper, we establish a rigorous theoretical foundation for SGD in nonconvex learning by showing that this boundedness assumption can be removed without affecting convergence rates. In particular, we establish sufficient conditions for almost sure convergence as well as optimal convergence rates for SGD applied to both general nonconvex objective functions and gradient-dominated objective functions. Linear convergence is further derived in the case of zero variance.
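
For orientation, the objects this abstract refers to can be written in standard textbook notation. The symbols below ($w_t$, $\eta_t$, $z_t$, $G$, $\mu$, $F$) are assumptions for illustration; the paper may state its assumptions in a more general form.

```latex
% SGD update with step size \eta_t and a sampled example z_t
w_{t+1} = w_t - \eta_t \, \nabla f(w_t; z_t)

% The uniform bounded-gradient assumption that the abstract says can be removed
\sup_{t} \, \| \nabla f(w_t; z_t) \| \le G

% Gradient domination in its common Polyak-Lojasiewicz form,
% for the objective F(w) = \mathbb{E}_z [ f(w; z) ] with infimum F^*
F(w) - F^{*} \le \frac{1}{2\mu} \, \| \nabla F(w) \|^{2}, \qquad \mu > 0
```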

Relief R-CNN: Utilizing Convolutional Features for Fast Object Detection

Apr 26, 2017

Guiying Li, Junlong Liu, Chunhui Jiang, Liangpeng Zhang, Minlong Lin, Ke Tang

Abstract: R-CNN style methods are among the state-of-the-art object detection methods and consist of region proposal generation followed by deep CNN classification. However, the proposal generation phase in this paradigm is usually time-consuming, which slows down overall detection at test time. This paper suggests that the value discrepancies among features in deep convolutional feature maps contain plenty of useful spatial information, and proposes a simple approach to extract this information for fast region proposal generation at test time. The proposed method, Relief R-CNN (R2-CNN), adopts a novel region proposal generator in a trained R-CNN style model. The new generator produces proposals directly from convolutional features using simple rules, resulting in much faster proposal generation and lower demand for computational resources. Empirical studies show that R2-CNN achieves the fastest detection speed among all the compared algorithms at test time, with comparable accuracy.

* 9 pages, 2 figures, accepted by ISNN 2017
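
To make the idea of generating proposals directly from convolutional features concrete, here is a toy sketch. The abstract does not spell out the rules R2-CNN uses, so the rule below (threshold a single feature map at its mean, then box each 4-connected region of strong responses) and the `stride` value are assumptions for illustration only.

```python
from collections import deque
import numpy as np

def propose_boxes(feature_map, stride=16):
    """Return (x1, y1, x2, y2) image-space boxes around strong-response regions.

    feature_map: 2-D array of activations from one channel (hypothetical input).
    stride:      assumed ratio between image pixels and feature-map cells.
    """
    mask = feature_map > feature_map.mean()  # cells that stand out from the rest
    seen = np.zeros(mask.shape, dtype=bool)
    height, width = mask.shape
    boxes = []
    for r0 in range(height):
        for c0 in range(width):
            if not mask[r0, c0] or seen[r0, c0]:
                continue
            # Breadth-first flood fill over one 4-connected region.
            queue = deque([(r0, c0)])
            seen[r0, c0] = True
            rmin = rmax = r0
            cmin = cmax = c0
            while queue:
                r, c = queue.popleft()
                rmin, rmax = min(rmin, r), max(rmax, r)
                cmin, cmax = min(cmin, c), max(cmax, c)
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    inside = 0 <= nr < height and 0 <= nc < width
                    if inside and mask[nr, nc] and not seen[nr, nc]:
                        seen[nr, nc] = True
                        queue.append((nr, nc))
            # Scale the cell-space bounding box up to image coordinates.
            boxes.append((cmin * stride, rmin * stride,
                          (cmax + 1) * stride, (rmax + 1) * stride))
    return boxes
```

Because this runs on features the detector has already computed, it needs no separate proposal network, which is the kind of saving the abstract describes.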
