Lies Hadjadj

Pool-Based Active Learning with Proper Topological Regions

Oct 02, 2023

Lies Hadjadj, Emilie Devijver, Remi Molinier, Massih-Reza Amini

Abstract: Machine learning methods usually rely on large sample sizes to perform well, yet labeled data are difficult to obtain in many applications. Pool-based active learning methods detect, among a set of unlabeled data, the examples most relevant for training. In this paper, we propose a meta-approach for pool-based active learning strategies in multi-class classification tasks, based on Proper Topological Regions (PTR). PTR, derived from topological data analysis (TDA), are relevant regions used to sample cold-start points or to sample within the active learning scheme. The proposed method is illustrated empirically on various benchmark datasets and is competitive with classical methods from the literature.
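For readers unfamiliar with the setting, the following is a minimal sketch of a generic pool-based active learning loop, not the PTR meta-approach itself. Everything in it is an assumption made for illustration: the nearest-centroid classifier, the margin-style uncertainty score, and the names `centroids`, `uncertainty`, `active_learn`, and `oracle` are hypothetical and do not come from the paper.

```python
import math

def centroids(X, y):
    """Mean point of each class among the labeled examples."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] = counts.get(yi, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def euclidean(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def uncertainty(x, cents):
    """Higher score = smaller gap between the two nearest class centroids."""
    d = sorted(euclidean(x, c) for c in cents.values())
    return -(d[1] - d[0]) if len(d) > 1 else 0.0

def active_learn(X_lab, y_lab, pool, oracle, budget):
    """Query the most uncertain pool point, ask the oracle for its label,
    add it to the labeled set, and repeat until the budget is spent."""
    X, y, pool = list(X_lab), list(y_lab), list(pool)
    for _ in range(min(budget, len(pool))):
        cents = centroids(X, y)  # stand-in for retraining a classifier
        query = max(pool, key=lambda x: uncertainty(x, cents))
        pool.remove(query)
        X.append(query)
        y.append(oracle(query))  # hypothetical labeling oracle
    return X, y, pool
```

In this sketch the initial labeled set is simply given. A cold-start sampler, such as the region-based one the abstract describes, would be one way to choose it, and the uncertainty score is the step a meta-approach could wrap or replace.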


Self-Training of Halfspaces with Generalization Guarantees under Massart Mislabeling Noise Model

Dec 02, 2021

Lies Hadjadj, Massih-Reza Amini, Sana Louhichi, Alexis Deschamps

Abstract: We investigate the generalization properties of a self-training algorithm with halfspaces. The approach learns a list of halfspaces iteratively from labeled and unlabeled training data, in which each iteration consists of two steps: exploration and pruning. In the exploration phase, the halfspace is found sequentially by maximizing the unsigned-margin among unlabeled examples and then assigning pseudo-labels to those that have a distance higher than the current threshold. The pseudo-labeled examples are then added to the training set, and a new classifier is learned. This process is repeated until no more unlabeled examples remain for pseudo-labeling. In the pruning phase, pseudo-labeled samples that have a distance to the last halfspace greater than the associated unsigned-margin are then discarded. We prove that the misclassification error of the resulting sequence of classifiers is bounded and show that the resulting semi-supervised approach never degrades performance compared to the classifier learned using only the initial labeled training set. Experiments carried out on a variety of benchmarks demonstrate the efficiency of the proposed approach compared to state-of-the-art methods.
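The exploration step described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: a plain perceptron through the origin stands in for the halfspace learner, the threshold is a fixed value supplied by the caller rather than one found by maximizing the unsigned margin, and the pruning phase is omitted. The function names `train_perceptron`, `signed_distance`, and `self_train` are hypothetical.

```python
import math

def train_perceptron(X, y, epochs=50):
    """Plain perceptron through the origin; labels are +1 / -1.
    Assumption: a stand-in for the paper's halfspace learner."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * sum(a * b for a, b in zip(w, xi)) <= 0:
                w = [a + yi * b for a, b in zip(w, xi)]
    return w

def signed_distance(w, x):
    """Signed distance from x to the hyperplane w . x = 0."""
    norm = math.sqrt(sum(a * a for a in w)) or 1.0
    return sum(a * b for a, b in zip(w, x)) / norm

def self_train(X_lab, y_lab, X_unl, threshold):
    """Pseudo-label unlabeled points whose unsigned distance exceeds the
    threshold, retrain, and stop when no such point remains."""
    X, y, pool = list(X_lab), list(y_lab), list(X_unl)
    halfspaces = []
    while True:
        w = train_perceptron(X, y)
        halfspaces.append(w)
        confident = [x for x in pool if abs(signed_distance(w, x)) > threshold]
        if not confident:
            break
        for x in confident:
            X.append(x)
            y.append(1 if signed_distance(w, x) > 0 else -1)
        pool = [x for x in pool if abs(signed_distance(w, x)) <= threshold]
    return halfspaces, pool
```

The loop terminates because every iteration either stops or removes at least one point from the finite pool. Points that never clear the threshold are returned unlabeled.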
