Jordan Massiah

PARS: Pseudo-Label Aware Robust Sample Selection for Learning with Noisy Labels

Jan 26, 2022

Arushi Goel, Yunlong Jiao, Jordan Massiah

Abstract: Acquiring accurate labels on large-scale datasets is both time-consuming and expensive. To reduce the dependency of deep learning models on clean labeled data, several recent research efforts have focused on learning with noisy labels. These methods typically fall into three design categories for learning a noise-robust model: sample selection approaches, noise-robust loss functions, and label correction methods. In this paper, we propose PARS: Pseudo-Label Aware Robust Sample Selection, a hybrid approach that combines the best of all three worlds in a joint-training framework to achieve robustness to noisy labels. Specifically, PARS exploits all training samples using both the raw/noisy labels and estimated/refurbished pseudo-labels via self-training, divides samples into an ambiguous and a noisy subset via loss analysis, and designs label-dependent noise-aware loss functions for both sets of filtered labels. Results show that PARS significantly outperforms the state of the art in extensive studies on the noisy CIFAR-10 and CIFAR-100 datasets, particularly in challenging high-noise and low-resource settings. In particular, PARS achieved an absolute 12% improvement in test accuracy on the CIFAR-100 dataset with 90% symmetric label noise, and an absolute 27% improvement in test accuracy when, as an additional restriction, only 1/5 of the noisy labels are available during training. On a real-world noisy dataset, Clothing1M, PARS achieves results competitive with the state of the art.
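
The loss-based split mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state the selection criterion, so the fixed `threshold`, the function names `split_by_loss` and `pseudo_label`, and the toy probabilities are all assumptions made for illustration.

```python
import math


def cross_entropy(probs, label):
    """Per-sample cross-entropy of predicted class probabilities against a label."""
    return -math.log(max(probs[label], 1e-12))


def pseudo_label(probs):
    """Pseudo-label taken as the model's most probable class (an assumption)."""
    return max(range(len(probs)), key=probs.__getitem__)


def split_by_loss(pred_probs, noisy_labels, threshold):
    """Partition sample indices by their loss against the given (noisy) labels.

    Assumption: a fixed loss threshold. Samples whose loss exceeds it go to the
    "noisy" subset; the rest go to the "ambiguous" subset.
    """
    ambiguous, noisy = [], []
    for i, (probs, label) in enumerate(zip(pred_probs, noisy_labels)):
        loss = cross_entropy(probs, label)
        (noisy if loss > threshold else ambiguous).append(i)
    return ambiguous, noisy


# Toy predictions for three samples over three classes (hypothetical values).
pred_probs = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.40, 0.35, 0.25],
]
noisy_labels = [0, 0, 1]

ambiguous, noisy = split_by_loss(pred_probs, noisy_labels, threshold=1.5)
pseudo_labels = [pseudo_label(p) for p in pred_probs]
```

In this toy setup, the second sample is confidently predicted as class 1 but labeled 0, so its loss is high and it lands in the noisy subset, where a pseudo-label could stand in for the raw label.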

* 16 pages

Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations

Sep 28, 2021

Fangyu Liu, Yunlong Jiao, Jordan Massiah, Emine Yilmaz, Serhii Havrylov

Abstract: In NLP, a large volume of tasks involve pairwise comparison between two sequences (e.g. sentence similarity and paraphrase identification). Predominantly, two formulations are used for sentence-pair tasks: bi-encoders and cross-encoders. Bi-encoders produce fixed-dimensional sentence representations and are computationally efficient; however, they usually underperform cross-encoders. Cross-encoders can leverage their attention heads to exploit inter-sentence interactions for better performance, but they require task fine-tuning and are computationally more expensive. In this paper, we present a completely unsupervised sentence representation model termed Trans-Encoder that combines the two learning paradigms into an iterative joint framework to simultaneously learn enhanced bi- and cross-encoders. Specifically, starting from a pre-trained Language Model (PLM), we first convert it to an unsupervised bi-encoder, and then alternate between the bi- and cross-encoder task formulations. In each alternation, one task formulation produces pseudo-labels, which are used as learning signals for the other task formulation. We then propose an extension that conducts this self-distillation approach on multiple PLMs in parallel and uses the average of their pseudo-labels for mutual distillation. Trans-Encoder creates, to the best of our knowledge, the first completely unsupervised cross-encoder and also a state-of-the-art unsupervised bi-encoder for sentence similarity. Both the bi-encoder and cross-encoder formulations of Trans-Encoder outperform recently proposed state-of-the-art unsupervised sentence encoders such as Mirror-BERT and SimCSE by up to 5% on the sentence similarity benchmarks.
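
The pseudo-label averaging step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that a bi-encoder's pseudo-label for a sentence pair is the cosine similarity of the two sentence embeddings, and the function names and toy embeddings are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def bi_encoder_pseudo_labels(embeddings_a, embeddings_b):
    """Score each sentence pair by the similarity of its two embeddings.

    Assumption: cosine similarity serves as the pseudo-label.
    """
    return [cosine(u, v) for u, v in zip(embeddings_a, embeddings_b)]


def average_pseudo_labels(per_model_scores):
    """Average the pseudo-labels of several models, pair by pair."""
    n_models = len(per_model_scores)
    return [sum(scores) / n_models for scores in zip(*per_model_scores)]


# Two hypothetical models, each embedding the same two sentence pairs.
model_1 = bi_encoder_pseudo_labels([[1.0, 0.0], [1.0, 0.0]],
                                   [[1.0, 0.0], [0.0, 1.0]])
model_2 = bi_encoder_pseudo_labels([[1.0, 1.0], [3.0, 4.0]],
                                   [[1.0, 1.0], [4.0, 3.0]])

# Averaged targets that could then supervise the other task formulation.
targets = average_pseudo_labels([model_1, model_2])
```

Averaging smooths out disagreement between models: in the second pair the two toy models disagree sharply, and the averaged target sits between their scores.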

* 16 pages
