Léo Gautheron

LHC

Metric Learning from Imbalanced Data

Sep 04, 2019

Léo Gautheron, Emilie Morvant, Amaury Habrard, Marc Sebban

Abstract: A key element of any machine learning algorithm is the use of a function that measures the dis/similarity between data points. Given a task, such a function can be optimized with a metric learning algorithm. Although this research field has received a lot of attention during the past decade, very few approaches have focused on learning a metric in an imbalanced scenario where the number of positive examples is much smaller than the number of negatives. Here, we address this challenging task by designing a new Mahalanobis metric learning algorithm (IML) which deals with class imbalance. The empirical study performed shows the efficiency of IML.
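
For readers unfamiliar with the term, a Mahalanobis metric is a distance of the form d(x, y) = sqrt((x - y)^T M (x - y)) with M positive semi-definite. The sketch below only evaluates such a distance for a given factor L (with M = L^T L); it does not implement IML or any learning step, and the function name and example matrices are illustrative assumptions.

```python
import math

def mahalanobis(x, y, L):
    """Return sqrt((x - y)^T M (x - y)) with M = L^T L, computed as ||L (x - y)||.

    Parametrizing by L (an assumption of this sketch) keeps M positive
    semi-definite by construction.
    """
    diff = [a - b for a, b in zip(x, y)]
    # Apply the linear map L to the difference vector.
    projected = [sum(l_ij * d_j for l_ij, d_j in zip(row, diff)) for row in L]
    return math.sqrt(sum(p * p for p in projected))

# Hypothetical example matrices: the identity recovers the Euclidean distance,
# while `stretch` doubles the weight of the first coordinate.
identity = [[1.0, 0.0], [0.0, 1.0]]
stretch = [[2.0, 0.0], [0.0, 1.0]]

d_euclid = mahalanobis([0.0, 0.0], [3.0, 4.0], identity)
d_stretch = mahalanobis([0.0, 0.0], [3.0, 4.0], stretch)
```

A metric learning algorithm would choose L (or M) from labeled data so that same-class points end up close and different-class points end up far apart.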

Learning Landmark-Based Ensembles with Random Fourier Features and Gradient Boosting

Jun 14, 2019

Léo Gautheron, Pascal Germain, Amaury Habrard, Emilie Morvant, Marc Sebban, Valentina Zantedeschi

Abstract: We propose a Gradient Boosting algorithm for learning an ensemble of kernel functions adapted to the task at hand. Unlike state-of-the-art Multiple Kernel Learning techniques that make use of a pre-computed dictionary of kernel functions to select from, at each iteration we fit a kernel by approximating it as a weighted sum of Random Fourier Features (RFF) and by optimizing their barycenter. This allows us to obtain a more versatile method that is easier to set up and likely to perform better. Our study builds on a recent result showing one can learn a kernel from RFF by computing the minimum of a PAC-Bayesian bound on the kernel alignment generalization loss, which is obtained efficiently from a closed-form solution. We conduct an experimental analysis to highlight the advantages of our method w.r.t. both Boosting-based and kernel-learning state-of-the-art methods.
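
As background for the Random Fourier Features mentioned above, the following is a minimal sketch of the standard RFF approximation of a Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2). It is not the boosting procedure of the paper: there is no weighting, no barycenter optimization, and no PAC-Bayesian bound. The helper name `make_rff`, the value of `gamma`, and the sample points are assumptions made for illustration.

```python
import math
import random

def make_rff(dim, n_features, gamma, seed=0):
    """Build a feature map z such that z(x) . z(y) approximates exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    # For this kernel, frequencies are drawn from N(0, 2 * gamma * I).
    scale = math.sqrt(2.0 * gamma)
    W = [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(n_features)]
    # Random phases drawn uniformly from [0, 2 * pi].
    b = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    norm = math.sqrt(2.0 / n_features)

    def z(x):
        return [
            norm * math.cos(sum(w_k * x_k for w_k, x_k in zip(w, x)) + b_j)
            for w, b_j in zip(W, b)
        ]

    return z

gamma = 1.0                      # hypothetical kernel bandwidth parameter
x, y = [0.2, -0.1], [0.5, 0.3]   # hypothetical data points

z = make_rff(dim=2, n_features=5000, gamma=gamma)
approx = sum(p * q for p, q in zip(z(x), z(y)))
exact = math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(x, y)))
```

With more random features the inner product `approx` concentrates around `exact`; the abstract's method goes further by learning how the features are weighted rather than treating them uniformly.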

Feature Selection for Unsupervised Domain Adaptation using Optimal Transport

Jun 28, 2018

Léo Gautheron, Ievgen Redko, Carole Lartizien

Abstract: In this paper, we propose a new feature selection method for unsupervised domain adaptation based on the emerging optimal transportation theory. We build upon a recent theoretical analysis of optimal transport in domain adaptation and show that it can directly suggest a feature selection procedure leveraging the shift between the domains. Based on this, we propose a novel algorithm that aims to sort features by their similarity across the source and target domains, where the order is obtained by analyzing the coupling matrix representing the solution of the proposed optimal transportation problem. We evaluate our method on a well-known benchmark data set and illustrate its capability of selecting correlated features leading to better classification performance. Furthermore, we show that the proposed algorithm can be used as a pre-processing step for existing domain adaptation techniques, yielding a substantial reduction in computation time while maintaining comparable results. Finally, we validate our algorithm on clinical imaging databases for a computer-aided diagnosis task, with promising results.
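
To make the idea of ranking features from a coupling matrix concrete, here is a rough sketch under several assumptions that the abstract does not state: features are treated as points (one coordinate per sample), both domains have the same number of samples, the transport problem is solved with entropic regularization (Sinkhorn iterations), and features are ordered by the diagonal of the coupling. All function names, parameters, and data are hypothetical; this is one plausible reading, not the authors' algorithm.

```python
import math

def feature_cost(Xs, Xt):
    """Squared Euclidean cost between source feature i and target feature j.

    Xs and Xt are lists of samples (rows); this sketch assumes they have the
    same number of rows and columns.
    """
    cols_s, cols_t = list(zip(*Xs)), list(zip(*Xt))
    return [[sum((p - q) ** 2 for p, q in zip(cs, ct)) for ct in cols_t]
            for cs in cols_s]

def sinkhorn(cost, reg=0.1, n_iter=200):
    """Entropic optimal transport between uniform marginals; returns the coupling."""
    d_s, d_t = len(cost), len(cost[0])
    a, b = [1.0 / d_s] * d_s, [1.0 / d_t] * d_t
    c_max = max(max(row) for row in cost) or 1.0  # normalize costs to avoid underflow
    K = [[math.exp(-c / (c_max * reg)) for c in row] for row in cost]
    u, v = [1.0] * d_s, [1.0] * d_t
    for _ in range(n_iter):
        u = [a[i] / sum(K[i][j] * v[j] for j in range(d_t)) for i in range(d_s)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(d_s)) for j in range(d_t)]
    return [[u[i] * K[i][j] * v[j] for j in range(d_t)] for i in range(d_s)]

def rank_features(Xs, Xt):
    """Order feature indices by decreasing diagonal mass of the coupling (assumed criterion)."""
    gamma = sinkhorn(feature_cost(Xs, Xt))
    order = sorted(range(len(gamma)), key=lambda i: gamma[i][i], reverse=True)
    return order, gamma

# Hypothetical toy data: three samples, three features per domain.
Xs = [[0.0, 1.0, 5.0], [1.0, 0.0, 6.0], [0.5, 0.5, 5.5]]
Xt = [[0.1, 1.1, 9.0], [0.9, 0.1, 2.0], [0.5, 0.4, 7.0]]
ranking, gamma = rank_features(Xs, Xt)
```

A feature whose source version transports most of its mass onto its own target version is interpreted here as similar across domains; keeping only the top-ranked features is what would give the pre-processing speed-up described above.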
