Abstract: Deep neural networks must store millions or even billions of weights in memory after training, which makes memory-intensive models challenging to deploy on embedded devices. Weight sharing is a popular compression approach that uses fewer distinct weight values and shares them across specific connections in the network. In this paper, we propose a compression framework based on a multi-objective evolutionary algorithm (MOEA) that is independent of neural network architecture, dimension, task, and dataset. We use uniformly sized bins to quantize network weights into a single codebook (lookup table) for efficient weight representation. Using the MOEA, we search for Pareto-optimal $k$ bins by optimizing two objectives. Then, we apply an iterative merge technique to the non-dominated Pareto frontier solutions, combining neighboring bins without degrading performance in order to decrease the number of bins and increase the compression ratio. Our approach is model- and layer-independent, meaning that weights from any layer are mixed in the clusters, and the uniform quantization method used in this work has $O(N)$ complexity, compared with $O(Nkt)$ for non-uniform quantization methods such as k-means. In addition, we use the cluster centers as the shared weight values instead of retraining the shared weights, which is computationally expensive. The advantage of evolutionary multi-objective optimization is that it yields non-dominated Pareto frontier solutions with respect to performance and shared weights. The experimental results show that we can reduce neural network memory by $13.72 \sim 14.98\times$ on CIFAR-10, $11.61 \sim 12.99\times$ on CIFAR-100, and $7.44 \sim 8.58\times$ on ImageNet, showcasing the effectiveness of the proposed deep neural network compression framework.
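The two building blocks named in this abstract, uniform-bin weight sharing and non-dominated filtering, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `uniform_weight_sharing` and `non_dominated` are hypothetical, the shared value is assumed to be the geometric midpoint of each bin, and both objectives are assumed to be minimized (for example, error and number of bins).

```python
import numpy as np

def uniform_weight_sharing(weights, k):
    """Quantize a flat weight vector into k equal-width bins.

    Returns (codebook, indices): codebook[j] is the midpoint of bin j
    (an assumption about what "cluster center" means), and indices[i]
    is the bin assigned to weights[i].
    """
    w = np.asarray(weights, dtype=np.float64).ravel()
    lo, hi = w.min(), w.max()
    if hi == lo:
        # Degenerate case: every weight is identical, one shared value suffices.
        return np.array([lo]), np.zeros(w.size, dtype=np.int64)
    width = (hi - lo) / k
    # Single pass over the weights: O(N), with no iterative refinement.
    indices = np.minimum(((w - lo) / width).astype(np.int64), k - 1)
    codebook = lo + (np.arange(k) + 0.5) * width
    return codebook, indices

def non_dominated(points):
    """Keep the points that no other point dominates.

    Each point is a tuple of two objectives, both minimized.
    """
    front = []
    for i, p in enumerate(points):
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and (q[0] < p[0] or q[1] < p[1])
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(p)
    return front
```

With `codebook[indices]` the full weight vector can be reconstructed from the lookup table, and the reconstruction error of any single weight is bounded by half the bin width. The quadratic-time filter is chosen for readability only; an MOEA would normally use its own non-dominated sorting routine.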
Abstract: Deep learning relies mainly on gradient-based optimization to train Deep Neural Network (DNN) models. Although robust and widely used, gradient-based optimization algorithms are prone to getting stuck in local minima. State-of-the-art DNN models have millions or even billions of parameters, including weights and biases, which makes training them a very large-scale optimization problem in terms of search space. Tuning such a large number of parameters is challenging and can lead to vanishing or exploding gradients and overfitting; moreover, the loss functions used do not exactly represent the targeted performance metrics. Meta-heuristic algorithms are a practical way to explore large and complex solution spaces. However, because DNNs have thousands to millions of parameters or more, even robust meta-heuristic algorithms such as Differential Evolution (DE) struggle to explore and converge efficiently in such high-dimensional search spaces, leading to very slow convergence and high memory demand. To tackle this curse of dimensionality, blocking was recently proposed as a technique that reduces the dimensionality of the search space by grouping dimensions into blocks. In this study, we introduce Histogram-based Blocking Differential Evolution (HBDE), a novel approach that hybridizes gradient-based and gradient-free algorithms to optimize parameters. Experimental results demonstrate that HBDE can reduce the number of parameters in the ResNet-18 model from 11M to 3K during the metaheuristic training/optimization phase. HBDE outperforms the baseline gradient-based algorithms and its parent gradient-free DE algorithm on the CIFAR-10 and CIFAR-100 datasets, showcasing its effectiveness with reduced computational demands for the very first time.
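The abstract does not specify how the histogram-based blocks are built, so the following is only a rough sketch of the general idea under stated assumptions: parameters (for instance, from a gradient-trained model) are grouped by equal-width histogram bin, each bin becomes one search dimension, and a standard DE/rand/1/bin generation searches over the block values. The names `make_blocks`, `expand`, and `de_step`, and the settings `f=0.5` and `cr=0.9`, are hypothetical and not taken from the paper.

```python
import numpy as np

def make_blocks(weights, n_blocks):
    """Group weights into n_blocks (>= 2) equal-width histogram bins.

    Returns (centers, block_ids): one representative value per block and
    the block index of every weight.
    """
    w = np.asarray(weights, dtype=np.float64).ravel()
    edges = np.histogram_bin_edges(w, bins=n_blocks)
    # Digitizing against the interior edges yields ids in [0, n_blocks - 1].
    block_ids = np.digitize(w, edges[1:-1])
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, block_ids

def expand(block_values, block_ids):
    """Map a low-dimensional block vector back to the full parameter vector."""
    return np.asarray(block_values)[block_ids]

def de_step(pop, fitness, objective, rng, f=0.5, cr=0.9):
    """One DE/rand/1/bin generation (minimization, in-place replacement).

    Requires a population of at least 4 individuals.
    """
    n, d = pop.shape
    for i in range(n):
        others = [j for j in range(n) if j != i]
        a, b, c = pop[rng.choice(others, size=3, replace=False)]
        mutant = a + f * (b - c)
        cross = rng.random(d) < cr
        cross[rng.integers(d)] = True  # guarantee at least one mutated gene
        trial = np.where(cross, mutant, pop[i])
        trial_fit = objective(trial)
        if trial_fit <= fitness[i]:  # greedy selection keeps the better one
            pop[i], fitness[i] = trial, trial_fit
    return pop, fitness
```

In such a setup the objective would evaluate the network with `expand(candidate, block_ids)` as its parameters, so the search runs over `n_blocks` dimensions rather than over every individual weight.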