Dominik Ślęzak

Robust Assignment of Labels for Active Learning with Sparse and Noisy Annotations

Jul 25, 2023

Daniel Kałuża, Andrzej Janusz, Dominik Ślęzak

Abstract: Supervised classification algorithms are used to solve a growing number of real-life problems around the globe. Their performance is closely tied to the quality of the labels used in training. Unfortunately, acquiring good-quality annotations is infeasible or too expensive in practice for many tasks. To tackle this challenge, active learning algorithms are commonly employed to select only the most relevant data for labeling. However, this works only when the quality and quantity of labels acquired from experts are sufficient. In many applications, a trade-off is necessary between having multiple annotators label the same samples to increase label quality and annotating new samples to increase the total number of labeled instances. In this paper, we address the issue of faulty data annotations in the context of active learning. In particular, we propose two novel annotation unification algorithms that utilize unlabeled parts of the sample space. The proposed methods require little to no intersection between samples annotated by different experts. Our experiments on four public datasets indicate that the proposed methods are robust and outperform state-of-the-art algorithms and simple majority voting in both the estimation of annotator reliability and the assignment of actual labels.

* 9 pages, 2 figures, 4 tables. Extended version, with appendices, of the paper accepted for the 26th European Conference on Artificial Intelligence (ECAI 2023)
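
For orientation, the simple majority voting baseline named in the abstract can be sketched as follows. This is not either of the paper's proposed algorithms; the function names, the `(sample_id, annotator_id, label)` triple format, and the agreement-based reliability score are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def majority_vote(annotations):
    """Pick the most frequent label per sample.

    `annotations` is an iterable of (sample_id, annotator_id, label)
    triples; this input format is a hypothetical choice for the sketch.
    """
    votes = defaultdict(list)
    for sample_id, _annotator_id, label in annotations:
        votes[sample_id].append(label)
    # Counter.most_common keeps first-seen order among tied labels.
    return {s: Counter(labels).most_common(1)[0][0] for s, labels in votes.items()}


def annotator_agreement(annotations, consensus):
    """Naive reliability proxy: share of an annotator's labels matching the vote."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for sample_id, annotator_id, label in annotations:
        totals[annotator_id] += 1
        hits[annotator_id] += int(label == consensus[sample_id])
    return {a: hits[a] / totals[a] for a in totals}


# Toy data: sparse overlap, sample "s3" has a single annotation.
annotations = [
    ("s1", "ann_a", 1), ("s1", "ann_b", 1), ("s1", "ann_c", 0),
    ("s2", "ann_a", 0), ("s2", "ann_c", 0),
    ("s3", "ann_b", 1),
]
consensus = majority_vote(annotations)
reliability = annotator_agreement(annotations, consensus)
```

In the toy data, sample `s3` has only one annotation, so the vote simply repeats that annotator's label and the agreement score cannot detect an error there. This is the sparse-overlap situation the abstract describes.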

Schema matching using Gaussian mixture models with Wasserstein distance

Nov 28, 2021

Mateusz Przyborowski, Mateusz Pabiś, Andrzej Janusz, Dominik Ślęzak

Abstract: Gaussian mixture models are a powerful tool, used mostly for clustering but, with proper preparation, also for feature extraction, pattern recognition, image segmentation, and machine learning in general. When faced with the problem of schema matching, different mixture models computed on different pieces of data can retain crucial information about the structure of the dataset. The Wasserstein distance can be very useful for measuring or comparing results from mixture models; however, it is not easy to calculate for mixture distributions. In this paper, we derive one possible approximation of the Wasserstein distance between Gaussian mixture models and reduce it to a linear problem. Furthermore, application examples involving real-world data are shown.
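
As background, one well-known way to obtain a linear problem of this kind is to combine the closed-form 2-Wasserstein distance between single Gaussians with a discrete transport problem over the mixture weights. Whether the paper's derivation matches this construction is an assumption; the notation ($w_i$, $v_j$, $m_i$, $n_j$, $\Sigma_i$, $\Lambda_j$, $\pi_{ij}$) is introduced here for illustration only.

```latex
% Closed form for two Gaussians:
W_2^2\bigl(\mathcal{N}(m_1,\Sigma_1),\,\mathcal{N}(m_2,\Sigma_2)\bigr)
  = \lVert m_1 - m_2 \rVert_2^2
  + \operatorname{tr}\!\Bigl(\Sigma_1 + \Sigma_2
  - 2\bigl(\Sigma_1^{1/2}\,\Sigma_2\,\Sigma_1^{1/2}\bigr)^{1/2}\Bigr)

% Mixture-level approximation for
% \mu = \sum_i w_i\,\mathcal{N}(m_i,\Sigma_i) and
% \nu = \sum_j v_j\,\mathcal{N}(n_j,\Lambda_j):
\min_{\pi \ge 0}\; \sum_{i,j} \pi_{ij}\,
  W_2^2\bigl(\mathcal{N}(m_i,\Sigma_i),\,\mathcal{N}(n_j,\Lambda_j)\bigr)
\quad \text{s.t.} \quad
  \sum_j \pi_{ij} = w_i, \qquad \sum_i \pi_{ij} = v_j
```

The objective and constraints are linear in $\pi_{ij}$, so a standard linear programming solver applies. The optimal value is an upper bound on the exact squared 2-Wasserstein distance between the two mixtures.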
