Fedor Noskov

Dimension-free bounds in high-dimensional linear regression via error-in-operator approach

Feb 21, 2025

Fedor Noskov, Nikita Puchkin, Vladimir Spokoiny

Abstract: We consider the problem of high-dimensional linear regression with random design. We propose a novel approach, referred to as error-in-operator, which does not estimate the design covariance $\Sigma$ directly but incorporates it into empirical risk minimization. We provide an expansion of the excess prediction risk and derive non-asymptotic dimension-free bounds on the leading term and the remainder. This allows us to show that auxiliary variables do not increase the effective dimension of the problem, provided that the parameters of the procedure are tuned properly. We also discuss computational aspects of our method and illustrate its performance with numerical experiments.

* 100 pages

Efficient Conformal Prediction under Data Heterogeneity

Dec 25, 2023

Vincent Plassier, Nikita Kotelevskii, Aleksandr Rubashevskii, Fedor Noskov, Maksim Velikanov, Alexander Fishkov, Samuel Horvath, Martin Takac, Eric Moulines, Maxim Panov

Abstract: Conformal Prediction (CP) stands out as a robust framework for uncertainty quantification, which is crucial for ensuring the reliability of predictions. However, common CP methods rely heavily on data exchangeability, a condition often violated in practice. Existing approaches for tackling non-exchangeability lead to methods that are not computable beyond the simplest examples. This work introduces a new efficient approach to CP that produces provably valid confidence sets for fairly general non-exchangeable data distributions. We illustrate the general theory with applications to the challenging setting of federated learning under data heterogeneity between agents. Our method allows constructing provably valid personalized prediction sets for agents in a fully federated way. The effectiveness of the proposed method is demonstrated in a series of experiments on real-world datasets.

* 28 pages
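
For context on the exchangeability assumption the abstract refers to, the following is a minimal sketch of standard split conformal prediction with absolute-residual scores. It is the textbook baseline, not the method proposed in the paper; the function names `conformal_quantile` and `split_conformal_interval` and the toy predictor in the usage note are assumptions made for illustration.

```python
import math


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    If that rank exceeds n, there are too few calibration points for the
    requested level, and the only valid threshold is +infinity.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]


def split_conformal_interval(predict, calib_x, calib_y, x_new, alpha=0.1):
    """Prediction interval for x_new from a held-out calibration set.

    `predict` is any fitted point predictor (hypothetical here). Marginal
    coverage of at least 1 - alpha holds only if the calibration points and
    the test point are exchangeable.
    """
    # Nonconformity score: absolute residual on each calibration point.
    scores = [abs(y - predict(x)) for x, y in zip(calib_x, calib_y)]
    q = conformal_quantile(scores, alpha)
    center = predict(x_new)
    return center - q, center + q
```

As a usage illustration, with the toy predictor `predict = lambda x: 2.0 * x`, calibration pairs `(0, 0.5)`, `(1, 1.0)`, `(2, 5.0)` and `alpha=0.5`, the residuals are 0.5, 1.0 and 1.0, the threshold is the second smallest score (1.0), and the interval at `x_new = 3` is `(5.0, 7.0)`. When agents hold data from different distributions, the exchangeability assumption behind this guarantee fails, which is the setting the paper addresses.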

Selective Nonparametric Regression via Testing

Sep 28, 2023

Fedor Noskov, Alexander Fishkov, Maxim Panov

Abstract: Prediction with the possibility of abstention (or selective prediction) is an important problem for error-critical machine learning applications. While well-studied in the classification setup, selective approaches to regression are much less developed. In this work, we consider the nonparametric heteroskedastic regression problem and develop an abstention procedure via testing the hypothesis on the value of the conditional variance at a given point. Unlike existing methods, the proposed procedure accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor. We prove non-asymptotic bounds on the risk of the resulting estimator and show the existence of several different convergence regimes. Theoretical analysis is illustrated with a series of experiments on simulated and real-world data.

Optimal Estimation in Mixed-Membership Stochastic Block Models

Jul 26, 2023

Fedor Noskov, Maxim Panov

Abstract: Community detection is one of the most critical problems in modern network science. Its applications can be found in various fields, from protein modeling to social network analysis. Recently, many papers have appeared studying the problem of overlapping community detection, where each node of a network may belong to several communities. In this work, we consider the Mixed-Membership Stochastic Block Model (MMSB), first proposed by Airoldi et al. (2008). MMSB provides quite a general setting for modeling overlapping community structure in graphs. The central question of this paper is how to reconstruct relations between communities given an observed network. We compare different approaches and establish the minimax lower bound on the estimation error. Then, we propose a new estimator that matches this lower bound. Theoretical results are proved under fairly general conditions on the considered model. Finally, we illustrate the theory in a series of experiments.

NUQ: Nonparametric Uncertainty Quantification for Deterministic Neural Networks

Feb 07, 2022

Nikita Kotelevskii, Aleksandr Artemenkov, Kirill Fedyanin, Fedor Noskov, Alexander Fishkov, Aleksandr Petiushko, Maxim Panov

Abstract: This paper proposes a fast and scalable method for uncertainty quantification of machine learning models' predictions. First, we show a principled way to measure the uncertainty of predictions for a classifier based on the Nadaraya-Watson nonparametric estimate of the conditional label distribution. Importantly, the approach makes it possible to explicitly disentangle aleatoric and epistemic uncertainties. The resulting method works directly in the feature space. However, one can apply it to any neural network by considering an embedding of the data induced by the network. We demonstrate the strong performance of the method in uncertainty estimation tasks on a variety of real-world image datasets, such as MNIST, SVHN, CIFAR-100 and several versions of ImageNet.
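
To make the Nadaraya-Watson estimate of the conditional label distribution concrete, here is a minimal sketch of the generic kernel-weighted class frequency estimator. It only illustrates that estimate and does not reproduce the uncertainty measures of the paper; the Gaussian kernel, the fixed `bandwidth`, the uniform fallback, and the function name `nw_label_distribution` are assumptions chosen for illustration.

```python
import math


def nw_label_distribution(x, train_x, train_y, n_classes, bandwidth=1.0):
    """Nadaraya-Watson estimate of p(y = c | x) for c = 0, ..., n_classes - 1.

    Each training point votes for its own label with weight
    K(x - x_i) = exp(-||x - x_i||^2 / (2 * bandwidth^2)); the votes are then
    normalized to sum to one. Inputs are sequences of floats, for example
    embeddings taken from a neural network.
    """
    weights = [0.0] * n_classes
    for xi, yi in zip(train_x, train_y):
        dist_sq = sum((a - b) ** 2 for a, b in zip(x, xi))
        weights[yi] += math.exp(-dist_sq / (2.0 * bandwidth ** 2))
    total = sum(weights)
    if total == 0.0:
        # Assumption: with no kernel mass near x, fall back to a uniform guess.
        return [1.0 / n_classes] * n_classes
    return [w / total for w in weights]
```

For example, with training points `(0.0, 0.0)` labeled 0 and `(2.0, 0.0)` labeled 1, a query at the midpoint `(1.0, 0.0)` receives equal kernel weight from both classes and the estimate is `[0.5, 0.5]`, while a query at `(0.0, 0.0)` puts most of the mass on class 0.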
