Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ronald Phlypo

GIPSA-lab

Meta-Sparsity: Learning Optimal Sparse Structures in Multi-task Networks through Meta-learning

Jan 21, 2025

Richa Upadhyay, Ronald Phlypo, Rajkumar Saini, Marcus Liwicki

Abstract:This paper presents meta-sparsity, a framework for learning model sparsity, basically learning the parameter that controls the degree of sparsity, that allows deep neural networks (DNNs) to inherently generate optimal sparse shared structures in multi-task learning (MTL) setting. This proposed approach enables the dynamic learning of sparsity patterns across a variety of tasks, unlike traditional sparsity methods that rely heavily on manual hyperparameter tuning. Inspired by Model Agnostic Meta-Learning (MAML), the emphasis is on learning shared and optimally sparse parameters in multi-task scenarios by implementing a penalty-based, channel-wise structured sparsity during the meta-training phase. This method improves the model's efficacy by removing unnecessary parameters and enhances its ability to handle both seen and previously unseen tasks. The effectiveness of meta-sparsity is rigorously evaluated by extensive experiments on two datasets, NYU-v2 and CelebAMask-HQ, covering a broad spectrum of tasks ranging from pixel-level to image-level predictions. The results show that the proposed approach performs well across many tasks, indicating its potential as a versatile tool for creating efficient and adaptable sparse neural networks. This work, therefore, presents an approach towards learning sparsity, contributing to the efforts in the field of sparse neural networks and suggesting new directions for research towards parsimonious models.

Via

Access Paper or Ask Questions

Giving each task what it needs -- leveraging structured sparsity for tailored multi-task learning

Jun 05, 2024

Richa Upadhyay, Ronald Phlypo, Rajkumar Saini, Marcus Liwicki

Figure 1 for Giving each task what it needs -- leveraging structured sparsity for tailored multi-task learning

Figure 2 for Giving each task what it needs -- leveraging structured sparsity for tailored multi-task learning

Figure 3 for Giving each task what it needs -- leveraging structured sparsity for tailored multi-task learning

Figure 4 for Giving each task what it needs -- leveraging structured sparsity for tailored multi-task learning

Abstract:Every task demands distinct feature representations, ranging from low-level to high-level attributes, so it is vital to address the specific needs of each task, especially in the Multi-task Learning (MTL) framework. This work, therefore, introduces Layer-Optimized Multi-Task (LOMT) models that utilize structured sparsity to refine feature selection for individual tasks and enhance the performance of all tasks in a multi-task scenario. Structured or group sparsity systematically eliminates parameters from trivial channels and, eventually, entire layers within a convolution neural network during training. Consequently, the remaining layers provide the most optimal features for a given task. In this two-step approach, we subsequently leverage this sparsity-induced optimal layer information to build the LOMT models by connecting task-specific decoders to these strategically identified layers, deviating from conventional approaches that uniformly connect decoders at the end of the network. This tailored architecture optimizes the network, focusing on essential features while reducing redundancy. We validate the efficacy of the proposed approach on two datasets, ie NYU-v2 and CelebAMask-HD datasets, for multiple heterogeneous tasks. A detailed performance analysis of the LOMT models, in contrast to the conventional MTL models, reveals that the LOMT models outperform for most task combinations. The excellent qualitative and quantitative outcomes highlight the effectiveness of employing structured sparsity for optimal layer (or feature) selection.

Via

Access Paper or Ask Questions

Less is More -- Towards parsimonious multi-task models using structured sparsity

Aug 31, 2023

Richa Upadhyay, Ronald Phlypo, Rajkumar Saini, Marcus Liwicki

Abstract:Model sparsification in deep learning promotes simpler, more interpretable models with fewer parameters. This not only reduces the model's memory footprint and computational needs but also shortens inference time. This work focuses on creating sparse models optimized for multiple tasks with fewer parameters. These parsimonious models also possess the potential to match or outperform dense models in terms of performance. In this work, we introduce channel-wise l1/l2 group sparsity in the shared convolutional layers parameters (or weights) of the multi-task learning model. This approach facilitates the removal of extraneous groups i.e., channels (due to l1 regularization) and also imposes a penalty on the weights, further enhancing the learning efficiency for all tasks (due to l2 regularization). We analyzed the results of group sparsity in both single-task and multi-task settings on two widely-used Multi-Task Learning (MTL) datasets: NYU-v2 and CelebAMask-HQ. On both datasets, which consist of three different computer vision tasks each, multi-task models with approximately 70% sparsity outperform their dense equivalents. We also investigate how changing the degree of sparsification influences the model's performance, the overall sparsity percentage, the patterns of sparsity, and the inference time.

* Under review

Via

Access Paper or Ask Questions

Multi-Task Meta Learning: learn how to adapt to unseen tasks

Oct 13, 2022

Richa Upadhyay, Prakash Chandra Chhipa, Ronald Phlypo, Rajkumar Saini, Marcus Liwicki

Figure 1 for Multi-Task Meta Learning: learn how to adapt to unseen tasks

Figure 2 for Multi-Task Meta Learning: learn how to adapt to unseen tasks

Figure 3 for Multi-Task Meta Learning: learn how to adapt to unseen tasks

Figure 4 for Multi-Task Meta Learning: learn how to adapt to unseen tasks

Abstract:This work aims to integrate two learning paradigms Multi-Task Learning (MTL) and meta learning, to bring together the best of both worlds, i.e., simultaneous learning of multiple tasks, an element of MTL and promptly adapting to new tasks with fewer data, a quality of meta learning. We propose Multi-task Meta Learning (MTML), an approach to enhance MTL compared to single task learning by employing meta learning. The fundamental idea of this work is to train a multi-task model, such that when an unseen task is introduced, it can learn in fewer steps whilst offering a performance at least as good as conventional single task learning on the new task or inclusion within the MTL. By conducting various experiments, we demonstrate this paradigm on two datasets and four tasks: NYU-v2 and the taskonomy dataset for which we perform semantic segmentation, depth estimation, surface normal estimation, and edge detection. MTML achieves state-of-the-art results for most of the tasks, and MTL also performs reasonably well for all tasks compared to single task learning.

Via

Access Paper or Ask Questions

Sharing to learn and learning to share - Fitting together Meta-Learning, Multi-Task Learning, and Transfer Learning : A meta review

Nov 23, 2021

Richa Upadhyay, Ronald Phlypo, Rajkumar Saini, Marcus Liwicki

Figure 1 for Sharing to learn and learning to share - Fitting together Meta-Learning, Multi-Task Learning, and Transfer Learning : A meta review

Figure 2 for Sharing to learn and learning to share - Fitting together Meta-Learning, Multi-Task Learning, and Transfer Learning : A meta review

Figure 3 for Sharing to learn and learning to share - Fitting together Meta-Learning, Multi-Task Learning, and Transfer Learning : A meta review

Figure 4 for Sharing to learn and learning to share - Fitting together Meta-Learning, Multi-Task Learning, and Transfer Learning : A meta review

Abstract:Integrating knowledge across different domains is an essential feature of human learning. Learning paradigms like transfer learning, meta learning, and multi-task learning reflect the human learning process by exploiting the prior knowledge for new tasks, encouraging faster learning and good generalization for new tasks. This article gives a detailed view of these learning paradigms along with a comparative analysis. The weakness of a learning algorithm turns out to be the strength of another, and thereby merging them is a prevalent trait in the literature. This work delivers a literature review of the articles, which fuses two algorithms to accomplish multiple tasks. A global generic learning network, an ensemble of meta learning, transfer learning, and multi-task learning, is also introduced here, along with some open research questions and directions for future research.

* 16 pages, 8 figures

Via

Access Paper or Ask Questions

Mixed-norm Regularization for Brain Decoding

Mar 14, 2014

Rémi Flamary, Nisrine Jrad, Ronald Phlypo, Marco Congedo, Alain Rakotomamonjy

Figure 1 for Mixed-norm Regularization for Brain Decoding

Figure 2 for Mixed-norm Regularization for Brain Decoding

Figure 3 for Mixed-norm Regularization for Brain Decoding

Figure 4 for Mixed-norm Regularization for Brain Decoding

Abstract:This work investigates the use of mixed-norm regularization for sensor selection in Event-Related Potential (ERP) based Brain-Computer Interfaces (BCI). The classification problem is cast as a discriminative optimization framework where sensor selection is induced through the use of mixed-norms. This framework is extended to the multi-task learning situation where several similar classification tasks related to different subjects are learned simultaneously. In this case, multi-task learning helps in leveraging data scarcity issue yielding to more robust classifiers. For this purpose, we have introduced a regularizer that induces both sensor selection and classifier similarities. The different regularization approaches are compared on three ERP datasets showing the interest of mixed-norm regularization in terms of sensor selection. The multi-task approaches are evaluated when a small number of learning examples are available yielding to significant performance improvements especially for subjects performing poorly.

* Computational and Mathematical Methods in Medicine (2014) http://www.hindawi.com/journals/cmmm/

Via

Access Paper or Ask Questions

Independent Vector Analysis: Identification Conditions and Performance Bounds

Mar 29, 2013

Matthew Anderson, Geng-Shen Fu, Ronald Phlypo, Tülay Adalı

Figure 1 for Independent Vector Analysis: Identification Conditions and Performance Bounds

Figure 2 for Independent Vector Analysis: Identification Conditions and Performance Bounds

Figure 3 for Independent Vector Analysis: Identification Conditions and Performance Bounds

Figure 4 for Independent Vector Analysis: Identification Conditions and Performance Bounds

Abstract:Recently, an extension of independent component analysis (ICA) from one to multiple datasets, termed independent vector analysis (IVA), has been the subject of significant research interest. IVA has also been shown to be a generalization of Hotelling's canonical correlation analysis. In this paper, we provide the identification conditions for a general IVA formulation, which accounts for linear, nonlinear, and sample-to-sample dependencies. The identification conditions are a generalization of previous results for ICA and for IVA when samples are independently and identically distributed. Furthermore, a principal aim of IVA is the identification of dependent sources between datasets. Thus, we provide the additional conditions for when the arbitrary ordering of the sources within each dataset is common. Performance bounds in terms of the Cramer-Rao lower bound are also provided for the demixing matrices and interference to source ratio. The performance of two IVA algorithms are compared to the theoretical bounds.

* 14 pages, 5 figures, in review for IEEE Trans. on Signal Processing

Via

Access Paper or Ask Questions