Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Massih-Reza Amini

AMA

Multi-Label Contrastive Learning : A Comprehensive Study

Nov 27, 2024

Alexandre Audibert, Aurélien Gauffre, Massih-Reza Amini

Figure 1 for Multi-Label Contrastive Learning : A Comprehensive Study

Figure 2 for Multi-Label Contrastive Learning : A Comprehensive Study

Figure 3 for Multi-Label Contrastive Learning : A Comprehensive Study

Figure 4 for Multi-Label Contrastive Learning : A Comprehensive Study

Abstract:Multi-label classification, which involves assigning multiple labels to a single input, has emerged as a key area in both research and industry due to its wide-ranging applications. Designing effective loss functions is crucial for optimizing deep neural networks for this task, as they significantly influence model performance and efficiency. Traditional loss functions, which often maximize likelihood under the assumption of label independence, may struggle to capture complex label relationships. Recent research has turned to supervised contrastive learning, a method that aims to create a structured representation space by bringing similar instances closer together and pushing dissimilar ones apart. Although contrastive learning offers a promising approach, applying it to multi-label classification presents unique challenges, particularly in managing label interactions and data structure. In this paper, we conduct an in-depth study of contrastive learning loss for multi-label classification across diverse settings. These include datasets with both small and large numbers of labels, datasets with varying amounts of training data, and applications in both computer vision and natural language processing. Our empirical results indicate that the promising outcomes of contrastive learning are attributable not only to the consideration of label interactions but also to the robust optimization scheme of the contrastive loss. Furthermore, while the supervised contrastive loss function faces challenges with datasets containing a small number of labels and ranking-based metrics, it demonstrates excellent performance, particularly in terms of Macro-F1, on datasets with a large number of labels.

* 28 pages, 1 figure

Via

Access Paper or Ask Questions

A Unified Contrastive Loss for Self-Training

Sep 11, 2024

Aurelien Gauffre, Julien Horvat, Massih-Reza Amini

Abstract:Self-training methods have proven to be effective in exploiting abundant unlabeled data in semi-supervised learning, particularly when labeled data is scarce. While many of these approaches rely on a cross-entropy loss function (CE), recent advances have shown that the supervised contrastive loss function (SupCon) can be more effective. Additionally, unsupervised contrastive learning approaches have also been shown to capture high quality data representations in the unsupervised setting. To benefit from these advantages in a semi-supervised setting, we propose a general framework to enhance self-training methods, which replaces all instances of CE losses with a unique contrastive loss. By using class prototypes, which are a set of class-wise trainable parameters, we recover the probability distributions of the CE setting and show a theoretical equivalence with it. Our framework, when applied to popular self-training methods, results in significant performance improvements across three different datasets with a limited number of labeled data. Additionally, we demonstrate further improvements in convergence speed, transfer ability, and hyperparameter stability. The code is available at \url{https://github.com/AurelienGauffre/semisupcon/}.

* Machine Learning and Knowledge Discovery in Databases. Research Track and Demo Track. ECML PKDD 2024 vol 14948

Via

Access Paper or Ask Questions

Unified Framework for Neural Network Compression via Decomposition and Optimal Rank Selection

Sep 05, 2024

Ali Aghababaei-Harandi, Massih-Reza Amini

Figure 1 for Unified Framework for Neural Network Compression via Decomposition and Optimal Rank Selection

Figure 2 for Unified Framework for Neural Network Compression via Decomposition and Optimal Rank Selection

Figure 3 for Unified Framework for Neural Network Compression via Decomposition and Optimal Rank Selection

Figure 4 for Unified Framework for Neural Network Compression via Decomposition and Optimal Rank Selection

Abstract:Despite their high accuracy, complex neural networks demand significant computational resources, posing challenges for deployment on resource-constrained devices such as mobile phones and embedded systems. Compression algorithms have been developed to address these challenges by reducing model size and computational demands while maintaining accuracy. Among these approaches, factorization methods based on tensor decomposition are theoretically sound and effective. However, they face difficulties in selecting the appropriate rank for decomposition. This paper tackles this issue by presenting a unified framework that simultaneously applies decomposition and optimal rank selection, employing a composite compression loss within defined rank constraints. Our approach includes an automatic rank search in a continuous space, efficiently identifying optimal rank configurations without the use of training data, making it computationally efficient. Combined with a subsequent fine-tuning step, our approach maintains the performance of highly compressed models on par with their original counterparts. Using various benchmark datasets, we demonstrate the efficacy of our method through a comprehensive analysis.

Via

Access Paper or Ask Questions

Classification Tree-based Active Learning: A Wrapper Approach

Apr 15, 2024

Ashna Jose, Emilie Devijver, Massih-Reza Amini, Noel Jakse, Roberta Poloni

Abstract:Supervised machine learning often requires large training sets to train accurate models, yet obtaining large amounts of labeled data is not always feasible. Hence, it becomes crucial to explore active learning methods for reducing the size of training sets while maintaining high accuracy. The aim is to select the optimal subset of data for labeling from an initial unlabeled set, ensuring precise prediction of outcomes. However, conventional active learning approaches are comparable to classical random sampling. This paper proposes a wrapper active learning method for classification, organizing the sampling process into a tree structure, that improves state-of-the-art algorithms. A classification tree constructed on an initial set of labeled samples is considered to decompose the space into low-entropy regions. Input-space based criteria are used thereafter to sub-sample from these regions, the total number of points to be labeled being decomposed into each region. This adaptation proves to be a significant enhancement over existing active learning methods. Through experiments conducted on various benchmark data sets, the paper demonstrates the efficacy of the proposed framework by being effective in constructing accurate classification models, even when provided with a severely restricted labeled data set.

Via

Access Paper or Ask Questions

Exploring Contrastive Learning for Long-Tailed Multi-Label Text Classification

Apr 12, 2024

Alexandre Audibert, Aurélien Gauffre, Massih-Reza Amini

Abstract:Learning an effective representation in multi-label text classification (MLTC) is a significant challenge in NLP. This challenge arises from the inherent complexity of the task, which is shaped by two key factors: the intricate connections between labels and the widespread long-tailed distribution of the data. To overcome this issue, one potential approach involves integrating supervised contrastive learning with classical supervised loss functions. Although contrastive learning has shown remarkable performance in multi-class classification, its impact in the multi-label framework has not been thoroughly investigated. In this paper, we conduct an in-depth study of supervised contrastive learning and its influence on representation in MLTC context. We emphasize the importance of considering long-tailed data distributions to build a robust representation space, which effectively addresses two critical challenges associated with contrastive learning that we identify: the "lack of positives" and the "attraction-repulsion imbalance". Building on this insight, we introduce a novel contrastive loss function for MLTC. It attains Micro-F1 scores that either match or surpass those obtained with other frequently employed loss functions, and demonstrates a significant improvement in Macro-F1 scores across three multi-label datasets.

* 14 pages, 2 figures

Via

Access Paper or Ask Questions

Pool-Based Active Learning with Proper Topological Regions

Oct 02, 2023

Lies Hadjadj, Emilie Devijver, Remi Molinier, Massih-Reza Amini

Abstract:Machine learning methods usually rely on large sample size to have good performance, while it is difficult to provide labeled set in many applications. Pool-based active learning methods are there to detect, among a set of unlabeled data, the ones that are the most relevant for the training. We propose in this paper a meta-approach for pool-based active learning strategies in the context of multi-class classification tasks based on Proper Topological Regions. PTR, based on topological data analysis (TDA), are relevant regions used to sample cold-start points or within the active learning scheme. The proposed method is illustrated empirically on various benchmark datasets, being competitive to the classical methods from the literature.

Via

Access Paper or Ask Questions

Deep Learning with Partially Labeled Data for Radio Map Reconstruction

Jun 07, 2023

Alkesandra Malkova, Massih-Reza Amini, Benoit Denis, Christophe Villien

Figure 1 for Deep Learning with Partially Labeled Data for Radio Map Reconstruction

Figure 2 for Deep Learning with Partially Labeled Data for Radio Map Reconstruction

Figure 3 for Deep Learning with Partially Labeled Data for Radio Map Reconstruction

Figure 4 for Deep Learning with Partially Labeled Data for Radio Map Reconstruction

Abstract:In this paper, we address the problem of Received Signal Strength map reconstruction based on location-dependent radio measurements and utilizing side knowledge about the local region; for example, city plan, terrain height, gateway position. Depending on the quantity of such prior side information, we employ Neural Architecture Search to find an optimized Neural Network model with the best architecture for each of the supposed settings. We demonstrate that using additional side information enhances the final accuracy of the Received Signal Strength map reconstruction on three datasets that correspond to three major cities, particularly in sub-areas near the gateways where larger variations of the average received signal power are typically observed.

* 42 pages, 39 figures

Via

Access Paper or Ask Questions

Generalization bounds for learning under graph-dependence: A survey

Mar 25, 2022

Rui-Ray Zhang, Massih-Reza Amini

Figure 1 for Generalization bounds for learning under graph-dependence: A survey

Figure 2 for Generalization bounds for learning under graph-dependence: A survey

Figure 3 for Generalization bounds for learning under graph-dependence: A survey

Figure 4 for Generalization bounds for learning under graph-dependence: A survey

Abstract:Traditional statistical learning theory relies on the assumption that data are identically and independently generated from a given distribution (i.i.d.). The independently distributed assumption, on the other hand, fails to hold in many real applications. In this survey, we consider learning settings in which examples are dependent and their dependence relationship can be characterized by a graph. We collect various graph-dependent concentration bounds, which are then used to derive Rademacher and stability generalization bounds for learning from graph-dependent data. We illustrate this paradigm with three learning tasks and provide some research directions for future work. To the best of our knowledge, this is the first survey on this subject.

Via

Access Paper or Ask Questions

Learning over No-Preferred and Preferred Sequence of Items for Robust Recommendation (Extended Abstract)

Feb 26, 2022

Aleksandra Burashnikova, Yury Maximov, Marianne Clausel, Charlotte Laclau, Franck Iutzeler, Massih-Reza Amini

Figure 1 for Learning over No-Preferred and Preferred Sequence of Items for Robust Recommendation (Extended Abstract)

Figure 2 for Learning over No-Preferred and Preferred Sequence of Items for Robust Recommendation (Extended Abstract)

Figure 3 for Learning over No-Preferred and Preferred Sequence of Items for Robust Recommendation (Extended Abstract)

Figure 4 for Learning over No-Preferred and Preferred Sequence of Items for Robust Recommendation (Extended Abstract)

Abstract:This paper is an extended version of [Burashnikova et al., 2021, arXiv: 2012.06910], where we proposed a theoretically supported sequential strategy for training a large-scale Recommender System (RS) over implicit feedback, mainly in the form of clicks. The proposed approach consists in minimizing pairwise ranking loss over blocks of consecutive items constituted by a sequence of non-clicked items followed by a clicked one for each user. We present two variants of this strategy where model parameters are updated using either the momentum method or a gradient-based approach. To prevent updating the parameters for an abnormally high number of clicks over some targeted items (mainly due to bots), we introduce an upper and a lower threshold on the number of updates for each user. These thresholds are estimated over the distribution of the number of blocks in the training set. They affect the decision of RS by shifting the distribution of items that are shown to the users. Furthermore, we provide a convergence analysis of both algorithms and demonstrate their practical efficiency over six large-scale collections with respect to various ranking measures.

* 7 pages, 2 tables; extended abstract accepted to IJCAI 2022. arXiv admin note: substantial text overlap with arXiv:2012.06910, arXiv:1902.08495

Via

Access Paper or Ask Questions

Self-Training: A Survey

Feb 24, 2022

Massih-Reza Amini, Vasilii Feofanov, Loic Pauletto, Emilie Devijver, Yury Maximov

Abstract:In recent years, semi-supervised algorithms have received a lot of interest in both academia and industry. Among the existing techniques, self-training methods have arguably received more attention in the last few years. These models are designed to search the decision boundary on low density regions without making extra assumptions on the data distribution, and use the unsigned output score of a learned classifier, or its margin, as an indicator of confidence. The working principle of self-training algorithms is to learn a classifier iteratively by assigning pseudo-labels to the set of unlabeled training samples with a margin greater than a certain threshold. The pseudo-labeled examples are then used to enrich the labeled training data and train a new classifier in conjunction with the labeled training set. We present self-training methods for binary and multiclass classification and their variants which were recently developed using Neural Networks. Finally, we discuss our ideas for future research in self-training. To the best of our knowledge, this is the first thorough and complete survey on this subject.

* 16 pages

Via

Access Paper or Ask Questions