Julien Reygner

CERMICS, GdR MASCOT-NUM

Label noise (stochastic) gradient descent implicitly solves the Lasso for quadratic parametrisation

Jun 20, 2022

Loucas Pillaud-Vivien, Julien Reygner, Nicolas Flammarion

Abstract: Understanding the implicit bias of training algorithms is of crucial importance in order to explain the success of overparametrised neural networks. In this paper, we study the role of the label noise in the training dynamics of a quadratically parametrised model through its continuous time version. We explicitly characterise the solution chosen by the stochastic flow and prove that it implicitly solves a Lasso program. To complete our analysis, we provide nonasymptotic convergence guarantees for the dynamics as well as conditions for support recovery. We also give experimental results which support our theoretical claims. Our findings highlight the fact that structured noise can induce better generalisation and help explain the better performance of stochastic dynamics observed in practice.
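As a rough illustration of the kind of dynamics the abstract describes, here is a minimal discrete-time sketch of gradient descent with fresh label noise at every step. Everything specific in it is an assumption made for illustration and is not taken from the abstract: the parametrisation `beta = u*u - v*v`, the squared loss, the Gaussian noise, and the names `label_noise_gd`, `step`, `noise_std` and `init`. The paper itself analyses a continuous-time version, and this sketch does not demonstrate the Lasso result.

```python
import random


def label_noise_gd(X, y, step=0.1, noise_std=0.0, n_steps=500, init=0.5, seed=0):
    """Gradient descent on a quadratically parametrised linear model
    (assumed form: beta = u*u - v*v, elementwise), where the labels are
    perturbed by fresh Gaussian noise at every step."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    u = [init] * d
    v = [init] * d
    for _ in range(n_steps):
        beta = [u[j] ** 2 - v[j] ** 2 for j in range(d)]
        # Resample the label noise at each iteration.
        y_noisy = [y[i] + noise_std * rng.gauss(0.0, 1.0) for i in range(n)]
        resid = [
            sum(X[i][j] * beta[j] for j in range(d)) - y_noisy[i]
            for i in range(n)
        ]
        # Gradient of the loss (1 / 2n) * sum(resid^2) with respect to beta.
        grad = [sum(X[i][j] * resid[i] for i in range(n)) / n for j in range(d)]
        # Chain rule: d(beta_j)/d(u_j) = 2 * u_j and d(beta_j)/d(v_j) = -2 * v_j.
        for j in range(d):
            u_j, v_j = u[j], v[j]
            u[j] = u_j - step * 2.0 * u_j * grad[j]
            v[j] = v_j + step * 2.0 * v_j * grad[j]
    return [u[j] ** 2 - v[j] ** 2 for j in range(d)]
```

Setting `noise_std=0.0` gives plain gradient descent on the same parametrisation, which makes a convenient baseline when comparing against runs with label noise.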

Reweighting samples under covariate shift using a Wasserstein distance criterion

Oct 19, 2020

Julien Reygner, Adrien Touboul

Abstract: Considering two random variables with different laws to which we only have access through finite-size i.i.d. samples, we address how to reweight the first sample so that its empirical distribution converges towards the true law of the second sample as the size of both samples goes to infinity. We study an optimal reweighting that minimizes the Wasserstein distance between the empirical measures of the two samples, and leads to an expression of the weights in terms of Nearest Neighbors. The consistency and some asymptotic convergence rates in terms of expected Wasserstein distance are derived, and do not need the assumption of absolute continuity of one random variable with respect to the other. These results have some application in Uncertainty Quantification for decoupled estimation and in bounding the generalization error for the Nearest Neighbor Regression under covariate shift.
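The abstract states that the Wasserstein-optimal weights can be expressed in terms of nearest neighbors. One natural reading, given here as an assumption rather than as the paper's exact estimator, is that each point of the first sample receives a weight equal to the fraction of points in the second sample for which it is the nearest neighbor. The sketch below implements that reading with Euclidean distance; the function name `nearest_neighbor_weights` and the tie-breaking rule (lowest index wins) are illustrative choices.

```python
def nearest_neighbor_weights(source, target):
    """Weight each source point by the fraction of target points whose
    nearest source point (in Euclidean distance) it is.

    `source` and `target` are sequences of equal-length coordinate tuples.
    Ties are broken in favor of the lowest source index (an assumption)."""
    counts = [0] * len(source)
    for y in target:
        nearest = min(
            range(len(source)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(source[i], y)),
        )
        counts[nearest] += 1
    return [c / len(target) for c in counts]


source = [(0.0,), (1.0,), (10.0,)]
target = [(0.1,), (0.9,), (1.2,), (0.2,)]
weights = nearest_neighbor_weights(source, target)
print(weights)  # [0.5, 0.5, 0.0]
```

Here the source point at 10.0 is never the nearest neighbor of any target point, so it receives zero weight, and the weights always sum to one by construction.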
