Wolfgang Polonik

UC Davis

Gaussian and Bootstrap Approximation for Matching-based Average Treatment Effect Estimators

Dec 22, 2024

Zhaoyang Shi, Chinmoy Bhattacharjee, Krishnakumar Balasubramanian, Wolfgang Polonik

Abstract: We establish Gaussian approximation bounds for covariate and rank-matching-based Average Treatment Effect (ATE) estimators. By analyzing these estimators through the lens of stabilization theory, we employ the Malliavin-Stein method to derive our results. Our bounds precisely quantify the impact of key problem parameters, including the number of matches and treatment balance, on the accuracy of the Gaussian approximation. Additionally, we develop multiplier bootstrap procedures to estimate the limiting distribution in a fully data-driven manner, and we leverage the derived Gaussian approximation results to further obtain bootstrap approximation bounds. Our work not only introduces a novel theoretical framework for commonly used ATE estimators, but also provides data-driven methods for constructing non-asymptotically valid confidence intervals.
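
For readers unfamiliar with covariate matching, the following is a minimal sketch of a textbook M-nearest-neighbour matching ATE estimator. It is not taken from the paper: the function name `matching_ate`, the Euclidean distance, and the simple averaging of matched outcomes are all assumptions made for illustration, and the paper's estimators (including the rank-based variant) may differ in detail.

```python
import numpy as np

def matching_ate(X, Y, W, M=1):
    """Illustrative M-nearest-neighbour matching estimate of the ATE.

    X: (n, d) covariates, Y: (n,) outcomes, W: (n,) treatment flags in {0, 1}.
    Each unit's unobserved potential outcome is imputed by averaging the
    outcomes of its M closest units (Euclidean distance) in the other group.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    W = np.asarray(W, dtype=int)
    treated = np.flatnonzero(W == 1)
    control = np.flatnonzero(W == 0)
    if len(treated) < M or len(control) < M:
        raise ValueError("each group needs at least M units")

    y1 = Y.copy()  # outcome under treatment (observed or imputed)
    y0 = Y.copy()  # outcome under control (observed or imputed)
    for i in range(len(Y)):
        pool = control if W[i] == 1 else treated
        dist = np.linalg.norm(X[pool] - X[i], axis=1)
        matches = pool[np.argsort(dist)[:M]]
        imputed = Y[matches].mean()
        if W[i] == 1:
            y0[i] = imputed
        else:
            y1[i] = imputed
    return float(np.mean(y1 - y0))

# Toy data with a constant treatment effect of 2 and no covariate effect.
X_demo = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
W_demo = np.arange(20) % 2
Y_demo = 5.0 + 2.0 * W_demo
print(matching_ate(X_demo, Y_demo, W_demo, M=2))
```

The number of matches `M` and the balance between the two groups are the kind of problem parameters whose effect on the Gaussian approximation the abstract says the bounds quantify.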

Multivariate Gaussian Approximation for Random Forest via Region-based Stabilization

Mar 26, 2024

Zhaoyang Shi, Chinmoy Bhattacharjee, Krishnakumar Balasubramanian, Wolfgang Polonik

Abstract: We derive Gaussian approximation bounds for random forest predictions based on a set of training points given by a Poisson process, under fairly mild regularity assumptions on the data generating process. Our approach is based on the key observation that the random forest predictions satisfy a certain geometric property called region-based stabilization. In the process of developing our results for the random forest, we also establish a probabilistic result, which might be of independent interest, on multivariate Gaussian approximation bounds for general functionals of Poisson processes that are region-based stabilizing. This general result makes use of the Malliavin-Stein method, and is potentially applicable to various related statistical problems.

Nonsmooth Nonparametric Regression via Fractional Laplacian Eigenmaps

Feb 22, 2024

Zhaoyang Shi, Krishnakumar Balasubramanian, Wolfgang Polonik

Abstract: We develop nonparametric regression methods for the case when the true regression function is not necessarily smooth. More specifically, our approach uses the fractional Laplacian and is designed to handle the case when the true regression function lies in an $L_2$-fractional Sobolev space with order $s\in (0,1)$. This function class is a Hilbert space lying between the space of square-integrable functions and the first-order Sobolev space consisting of differentiable functions. It contains fractional power functions, piecewise constant or polynomial functions, and bump functions as canonical examples. For the proposed approach, we prove upper bounds on the in-sample mean-squared estimation error of order $n^{-\frac{2s}{2s+d}}$, where $d$ is the dimension, $s$ is the aforementioned order parameter and $n$ is the number of observations. We also provide preliminary empirical results validating the practical performance of the developed estimators.
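
To get a feel for the stated rate $n^{-\frac{2s}{2s+d}}$, the short sketch below simply evaluates it for a few values of $s$ and $d$. The helper name `error_rate` is hypothetical, and constants and any logarithmic factors in the actual bound are ignored, so the numbers only indicate how the order of the bound scales.

```python
def error_rate(n: int, s: float, d: int) -> float:
    """Evaluate n^(-2s / (2s + d)), the order of the bound in the abstract.

    Constants are ignored; s is restricted to (0, 1) as in the abstract.
    """
    if not 0.0 < s < 1.0:
        raise ValueError("s must lie in (0, 1)")
    if n < 1 or d < 1:
        raise ValueError("n and d must be positive")
    return n ** (-2.0 * s / (2.0 * s + d))

# With s = 0.5 and d = 1 the exponent is -1/2, so n = 10,000 gives about 0.01.
print(error_rate(10_000, 0.5, 1))

# The rate degrades as the dimension grows or the smoothness order shrinks.
for d in (1, 2, 5):
    for s in (0.25, 0.5, 0.75):
        print(d, s, error_rate(10_000, s, d))
```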

Adaptive and non-adaptive minimax rates for weighted Laplacian-eigenmap based nonparametric regression

Oct 31, 2023

Zhaoyang Shi, Krishnakumar Balasubramanian, Wolfgang Polonik

Abstract: We show both adaptive and non-adaptive minimax rates of convergence for a family of weighted Laplacian-Eigenmap based nonparametric regression methods, when the true regression function belongs to a Sobolev space and the sampling density is bounded from above and below. The adaptation methodology is based on extensions of Lepski's method and is over both the smoothness parameter ($s\in\mathbb{N}_{+}$) and the norm parameter ($M>0$) determining the constraints on the Sobolev space. Our results extend the non-adaptive result in \cite{green2021minimax}, established for a specific normalized graph Laplacian, to a wide class of weighted Laplacian matrices used in practice, including the unnormalized Laplacian and random walk Laplacian.
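
As a point of reference, the two Laplacians named at the end of the abstract have standard definitions for a graph with symmetric weight matrix $A$ and degree matrix $D$: the unnormalized Laplacian $D - A$ and the random walk Laplacian $I - D^{-1}A$. The sketch below builds both with NumPy. The function name `graph_laplacians` is hypothetical, and the paper's weighted family is more general than these two textbook cases.

```python
import numpy as np

def graph_laplacians(A):
    """Return (unnormalized, random-walk) Laplacians of a weighted graph.

    A: symmetric (n, n) matrix of nonnegative edge weights with no
    isolated vertices (every row sum must be positive).
    """
    A = np.asarray(A, dtype=float)
    if A.ndim != 2 or A.shape[0] != A.shape[1]:
        raise ValueError("A must be a square matrix")
    deg = A.sum(axis=1)
    if np.any(deg <= 0):
        raise ValueError("every vertex needs positive degree")
    n = A.shape[0]
    L_unnormalized = np.diag(deg) - A            # D - A
    L_random_walk = np.eye(n) - A / deg[:, None]  # I - D^{-1} A
    return L_unnormalized, L_random_walk

# Path graph on three vertices: 0 - 1 - 2.
A_demo = np.array([[0, 1, 0],
                   [1, 0, 1],
                   [0, 1, 0]])
L_un, L_rw = graph_laplacians(A_demo)
print(L_un)
print(L_rw)
```

In a Laplacian-eigenmap regression, the leading eigenvectors of such a matrix serve as the basis onto which the responses are projected; which Laplacian is used changes that basis, which is why extending the rates across the family matters.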

A Flexible Approach for Normal Approximation of Geometric and Topological Statistics

Oct 19, 2022

Zhaoyang Shi, Krishnakumar Balasubramanian, Wolfgang Polonik

Abstract: We derive normal approximation results for a class of stabilizing functionals of binomial or Poisson point processes that are not necessarily expressible as sums of certain score functions. Our approach is based on a flexible notion of the add-one cost operator, which helps one to deal with the second-order cost operator via suitable first-order operators. We combine this flexible notion with the theory of strong stabilization to establish our results. We illustrate the applicability of our results by establishing normal approximation results for certain geometric and topological statistics arising frequently in practice. Several existing results also emerge as special cases of our approach.

Topologically penalized regression on manifolds

Oct 26, 2021

Olympio Hacquard, Krishnakumar Balasubramanian, Gilles Blanchard, Wolfgang Polonik, Clément Levrard

Abstract: We study a regression problem on a compact manifold M. In order to take advantage of the underlying geometry and topology of the data, the regression task is performed on the basis of the first several eigenfunctions of the Laplace-Beltrami operator of the manifold, which are regularized with topological penalties. The proposed penalties are based on the topology of the sub-level sets of either the eigenfunctions or the estimated function. The overall approach is shown to yield promising and competitive performance on various applications to both synthetic and real data sets. We also provide theoretical guarantees on the regression function estimates, on both their prediction error and their smoothness (in a topological sense). Taken together, these results support the relevance of our approach in the case where the targeted function is "topologically smooth".

Algorithms for ridge estimation with convergence guarantees

Apr 26, 2021

Wanli Qiao, Wolfgang Polonik

Abstract: The extraction of filamentary structure from a point cloud is discussed. The filaments are modeled as ridge lines or higher dimensional ridges of an underlying density. We propose two novel algorithms, and provide theoretical guarantees for their convergence. We consider the new algorithms as alternatives to the Subspace Constraint Mean Shift (SCMS) algorithm that do not suffer from a shortcoming of the SCMS that is also revealed in this paper.
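
For context, here is a minimal sketch of the baseline SCMS iteration that the abstract refers to, not of the paper's two new algorithms. It assumes a Gaussian kernel density estimate with bandwidth `h`, uses the Hessian of the density itself (one common variant), and targets one-dimensional ridges. The names `scms_step` and `scms` are hypothetical.

```python
import numpy as np

def scms_step(x, data, h):
    """One SCMS update for a 1-D ridge of a Gaussian KDE (illustrative)."""
    diffs = data - x                                   # (n, D)
    w = np.exp(-0.5 * np.sum(diffs ** 2, axis=1) / h ** 2)
    shift = (w[:, None] * diffs).sum(axis=0) / w.sum()  # mean-shift vector
    dim = data.shape[1]
    # Hessian of the KDE at x, up to a positive constant factor.
    hess = (diffs.T * w) @ diffs / h ** 2 - w.sum() * np.eye(dim)
    _, vecs = np.linalg.eigh(hess)                     # ascending eigenvalues
    V = vecs[:, : dim - 1]                             # dim-1 smallest eigenvalues
    # Move only within the subspace spanned by those eigenvectors.
    return x + V @ (V.T @ shift)

def scms(x0, data, h, max_iter=200, tol=1e-10):
    """Iterate scms_step from x0 until the update is smaller than tol."""
    x = np.asarray(x0, dtype=float)
    data = np.asarray(data, dtype=float)
    for _ in range(max_iter):
        x_new = scms_step(x, data, h)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Two parallel rows of points; by symmetry the density ridge lies on y = 0.
t = np.linspace(-5.0, 5.0, 41)
cloud = np.vstack([np.column_stack([t, np.full_like(t, 0.1)]),
                   np.column_stack([t, np.full_like(t, -0.1)])])
print(scms([0.0, 0.3], cloud, h=0.5))
```

The projection onto the low-curvature-complement subspace is what distinguishes SCMS from plain mean shift, which would instead drift toward a mode of the density.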

* 41 pages, 8 figures

Autism Spectrum Disorder Classification using Graph Kernels on Multidimensional Time Series

Nov 29, 2016

Rushil Anirudh, Jayaraman J. Thiagarajan, Irene Kim, Wolfgang Polonik

Abstract: We present an approach to model time series data from resting state fMRI for autism spectrum disorder (ASD) severity classification. We propose to adopt kernel machines and employ graph kernels that define a kernel dot product between two graphs. This enables us to take advantage of spatio-temporal information to capture the dynamics of the brain network, as opposed to aggregating them in the spatial or temporal dimension. In addition to the conventional similarity graphs, we explore the use of the L1 graph using sparse coding, and the persistent homology of time delay embeddings, in the proposed pipeline for ASD classification. In our experiments on two datasets from the ABIDE collection, we demonstrate a consistent and significant advantage in using graph kernels over traditional linear or nonlinear kernels for a variety of time series features.

* Under review as a conference paper to BHI '17
