Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Nuria Molner

Combining Relevance and Magnitude for Resource-Aware DNN Pruning

May 21, 2024

Carla Fabiana Chiasserini, Francesco Malandrino, Nuria Molner, Zhiqiang Zhao

Abstract:Pruning neural networks, i.e., removing some of their parameters whilst retaining their accuracy, is one of the main ways to reduce the latency of a machine learning pipeline, especially in resource- and/or bandwidth-constrained scenarios. In this context, the pruning technique, i.e., how to choose the parameters to remove, is critical to the system performance. In this paper, we propose a novel pruning approach, called FlexRel and predicated upon combining training-time and inference-time information, namely, parameter magnitude and relevance, in order to improve the resulting accuracy whilst saving both computational resources and bandwidth. Our performance evaluation shows that FlexRel is able to achieve higher pruning factors, saving over 35% bandwidth for typical accuracy targets.

Via

Access Paper or Ask Questions

Choose, not Hoard: Information-to-Model Matching for Artificial Intelligence in O-RAN

Aug 01, 2022

Jorge Martín-Pérez, Nuria Molner, Francesco Malandrino, Carlos Jesús Bernardos, Antonio de la Oliva, David Gomez-Barquero

Figure 1 for Choose, not Hoard: Information-to-Model Matching for Artificial Intelligence in O-RAN

Figure 2 for Choose, not Hoard: Information-to-Model Matching for Artificial Intelligence in O-RAN

Figure 3 for Choose, not Hoard: Information-to-Model Matching for Artificial Intelligence in O-RAN

Figure 4 for Choose, not Hoard: Information-to-Model Matching for Artificial Intelligence in O-RAN

Abstract:Open Radio Access Network (O-RAN) is an emerging paradigm, whereby virtualized network infrastructure elements from different vendors communicate via open, standardized interfaces. A key element therein is the RAN Intelligent Controller (RIC), an Artificial Intelligence (AI)-based controller. Traditionally, all data available in the network has been used to train a single AI model to use at the RIC. In this paper we introduce, discuss, and evaluate the creation of multiple AI model instances at different RICs, leveraging information from some (or all) locations for their training. This brings about a flexible relationship between gNBs, the AI models used to control them, and the data such models are trained with. Experiments with real-world traces show how using multiple AI model instances that choose training data from specific locations improve the performance of traditional approaches.

Via

Access Paper or Ask Questions

Network Support for High-performance Distributed Machine Learning

Feb 05, 2021

Francesco Malandrino, Carla Fabiana Chiasserini, Nuria Molner, Antonio De La Oliva

Figure 1 for Network Support for High-performance Distributed Machine Learning

Figure 2 for Network Support for High-performance Distributed Machine Learning

Figure 3 for Network Support for High-performance Distributed Machine Learning

Figure 4 for Network Support for High-performance Distributed Machine Learning

Abstract:The traditional approach to distributed machine learning is to adapt learning algorithms to the network, e.g., reducing updates to curb overhead. Networks based on intelligent edge, instead, make it possible to follow the opposite approach, i.e., to define the logical network topology em around the learning task to perform, so as to meet the desired learning performance. In this paper, we propose a system model that captures such aspects in the context of supervised machine learning, accounting for both learning nodes (that perform computations) and information nodes (that provide data). We then formulate the problem of selecting (i) which learning and information nodes should cooperate to complete the learning task, and (ii) the number of iterations to perform, in order to minimize the learning cost while meeting the target prediction error and execution time. After proving important properties of the above problem, we devise an algorithm, named DoubleClimb, that can find a 1+1/|I|-competitive solution (with I being the set of information nodes), with cubic worst-case complexity. Our performance evaluation, leveraging a real-world network topology and considering both classification and regression tasks, also shows that DoubleClimb closely matches the optimum, outperforming state-of-the-art alternatives.

Via

Access Paper or Ask Questions