Abstract: This paper proposes an asymptotic theory for online inference of the stochastic gradient descent (SGD) iterates with dropout regularization in linear regression. Specifically, we establish the geometric-moment contraction (GMC) for constant step-size SGD dropout iterates to show the existence of a unique stationary distribution of the dropout recursive function. By the GMC property, we provide quenched central limit theorems (CLTs) for the difference between dropout and $\ell^2$-regularized iterates, regardless of initialization. The CLT for the difference between the Ruppert-Polyak averaged SGD (ASGD) with dropout and $\ell^2$-regularized iterates is also presented. Based on these asymptotic normality results, we further introduce an online estimator for the long-run covariance matrix of ASGD dropout to facilitate inference in a recursive manner that is efficient in both computation and memory. Numerical experiments demonstrate that, for sufficiently large samples, the proposed confidence intervals for ASGD with dropout nearly achieve the nominal coverage probability.
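To make the recursion concrete, here is a minimal sketch of constant step-size SGD for linear regression with a Bernoulli dropout mask on the covariates, together with Ruppert-Polyak (ASGD) averaging. The function name and the exact form of the dropout mask (e.g., whether it is rescaled by the keep probability) are illustrative assumptions, not the paper's stated recursion.

```python
import numpy as np

def sgd_dropout_linear(X, y, lr=0.01, keep_prob=0.9, seed=0):
    """Constant step-size SGD for linear regression with dropout applied to
    the covariates at each step, plus a running Ruppert-Polyak (ASGD) average.
    Sketch only: the paper's dropout recursion may differ in mask scaling."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    beta = np.zeros(d)          # SGD dropout iterate
    beta_bar = np.zeros(d)      # running ASGD average
    for k in range(n):
        mask = rng.binomial(1, keep_prob, size=d)   # dropout mask D_k
        x_k = mask * X[k]                           # dropped-out covariates
        grad = (x_k @ beta - y[k]) * x_k            # stochastic gradient
        beta = beta - lr * grad                     # constant step size
        beta_bar += (beta - beta_bar) / (k + 1)     # online averaging
    return beta, beta_bar
```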
Abstract: Uncertainty quantification for estimates obtained from stochastic optimization in an online setting has recently gained popularity. This paper introduces a novel inference method focused on constructing confidence intervals with efficient computation and fast convergence to the nominal coverage level. Specifically, we propose to use a small number of independent multi-runs to acquire distributional information and construct a t-based confidence interval. Our method requires minimal additional computation and memory beyond the standard updating of estimates, making the inference process almost cost-free. We provide a rigorous theoretical guarantee for the confidence interval, demonstrating that the coverage is approximately exact with an explicit convergence rate and allowing for inference at high confidence levels. In particular, a new Gaussian approximation result is developed for the online estimators to characterize the coverage properties of our confidence intervals in terms of relative errors. Our method also allows for leveraging parallel computing to further accelerate calculations using multiple cores. It is easy to implement and can be integrated with existing stochastic algorithms without the need for complicated modifications.
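A minimal sketch of the multi-run idea, assuming the final estimate from each of K independent online runs is available for a single coordinate of interest; the helper name and the use of the plain sample standard deviation are illustrative assumptions rather than the paper's exact construction.

```python
import numpy as np
from scipy import stats

def t_interval_from_runs(run_estimates, level=0.95):
    """t-based confidence interval built from a small number of independent
    runs of the same online algorithm.  `run_estimates` holds one scalar
    estimate per run (e.g., one coordinate of the final iterate)."""
    run_estimates = np.asarray(run_estimates, dtype=float)
    k = run_estimates.size
    center = run_estimates.mean()
    se = run_estimates.std(ddof=1) / np.sqrt(k)      # standard error across runs
    q = stats.t.ppf(0.5 + level / 2, df=k - 1)       # t critical value
    return center - q * se, center + q * se
```

Because each run only requires the standard recursive updates, the extra cost over a single run is a constant factor that parallelizes trivially across cores.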
Abstract: Stochastic Gradient Descent (SGD) is one of the simplest and most popular algorithms in modern statistical and machine learning due to its computational and memory efficiency. Various averaging schemes have been proposed to accelerate the convergence of SGD in different settings. In this paper, we explore a general averaging scheme for SGD. Specifically, we establish the asymptotic normality of a broad range of weighted averaged SGD solutions and provide asymptotically valid online inference approaches. Furthermore, we propose an adaptive averaging scheme that exhibits both the optimal statistical rate and favorable non-asymptotic convergence, drawing insights from the weighting that is optimal for the linear model in terms of non-asymptotic mean squared error (MSE).
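As one concrete instance of a weighted averaging scheme, the sketch below maintains a polynomially weighted average of SGD iterates for a linear model, with weights w_k proportional to k^gamma (gamma = 0 recovers uniform ASGD). The weighting, step-size schedule, and function name are assumptions for illustration; the paper's general and adaptive schemes may differ.

```python
import numpy as np

def weighted_asgd_linear(X, y, gamma=1.0, lr0=0.5, a=0.505, seed=0):
    """SGD for linear regression with a polynomially weighted running average
    of the iterates.  Sketch of one possible weighting, not the paper's
    adaptive scheme."""
    n, d = X.shape
    theta = np.zeros(d)
    weighted_sum, total_weight = np.zeros(d), 0.0
    for k in range(1, n + 1):
        lr = lr0 * k ** (-a)                          # decaying step size
        x_k, y_k = X[k - 1], y[k - 1]
        theta = theta - lr * (x_k @ theta - y_k) * x_k
        w = k ** gamma                                # polynomial weight
        weighted_sum += w * theta
        total_weight += w
    return weighted_sum / total_weight                # weighted averaged SGD
```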
Abstract: The broader ambition of this article is to popularize an approach for the fair distribution of the quantity of a system's output to its subsystems, while allowing for underlying complex subsystem-level interactions. In particular, we present a data-driven approach to vehicle price modeling and its component price estimation by leveraging a combination of concepts from machine learning and game theory. We show an alternative to common teardown methodologies and surveying approaches for component and vehicle price estimation at the manufacturer's suggested retail price (MSRP) level that has the advantage of bypassing the uncertainties involved in 1) the gathering of teardown data, 2) the need to perform expensive and biased surveying, and 3) the need to perform retail price equivalent (RPE) or indirect cost multiplier (ICM) adjustments to mark up direct manufacturing costs to MSRP. This novel exercise not only provides accurate pricing of the technologies at the customer level, but also reveals the large gaps, known a priori, in pricing strategies between manufacturers, vehicle sizes, classes, market segments, and other factors. There is also clear synergism, or interaction, between the price of certain technologies and other specifications present in the same vehicle. These (unsurprising) results are an indication that old methods of manufacturer-level component costing, aggregation, and the application of a flat and rigid RPE or ICM adjustment factor should be carefully examined. The findings are based on an extensive database, developed by Argonne National Laboratory, that includes more than 64,000 vehicles covering MY1990 to MY2020 over hundreds of vehicle specifications.
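The abstract cites "machine learning and game theory" without specifying the attribution mechanism. One natural game-theoretic choice for fairly dividing a predicted vehicle price among its components, while accounting for interactions, is the Shapley value; the sketch below is a hypothetical Monte-Carlo Shapley attribution over a fitted price model and is an assumption on our part, not necessarily the authors' method. `price_model`, `x`, and `baseline` are placeholder names.

```python
import numpy as np

def shapley_price_attribution(price_model, x, baseline, n_perm=200, seed=0):
    """Monte-Carlo Shapley attribution of a vehicle's predicted MSRP to its
    component features.  Hypothetical illustration of the game-theoretic idea:
    `price_model` maps a feature vector to a predicted price, `x` is the
    vehicle of interest, `baseline` a reference feature vector."""
    rng = np.random.default_rng(seed)
    d = len(x)
    phi = np.zeros(d)
    for _ in range(n_perm):
        order = rng.permutation(d)       # random coalition-building order
        z = baseline.copy()
        prev = price_model(z)
        for j in order:
            z[j] = x[j]                  # add component j to the coalition
            cur = price_model(z)
            phi[j] += cur - prev         # marginal price contribution
            prev = cur
    return phi / n_perm                  # averages converge to Shapley values
```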
Abstract: The stochastic gradient descent (SGD) algorithm is widely used for parameter estimation, especially in the online setting. While this recursive algorithm is popular for its computational and memory efficiency, the problem of quantifying the variability and randomness of its solutions has rarely been studied. This paper aims to conduct statistical inference for SGD-based estimates in the online setting. In particular, we propose a fully online estimator for the covariance matrix of averaged SGD (ASGD) iterates. Based on the classic asymptotic normality results for ASGD, we construct asymptotically valid confidence intervals for model parameters. Upon receiving new observations, we can quickly update the covariance estimator and the confidence intervals. This approach fits the online setting even when the total number of observations is unknown, and it takes full advantage of SGD: efficiency in both computation and memory.
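A simplified sketch of the single-pass idea for a linear model: ASGD updates plus a fixed-size, non-overlapping batch-means estimate of the long-run covariance. The fixed batch size and the function and variable names are assumptions made for brevity; the paper's fully online estimator relies on a more refined recursive batching scheme.

```python
import numpy as np

def online_asgd_with_batch_means(stream, d, lr0=0.5, a=0.505, batch=200):
    """One pass over `stream` (an iterable of (x_k, y_k) pairs with x_k a
    length-d array): run SGD, keep the ASGD average, and form a crude
    batch-means estimate of the long-run covariance of the iterates."""
    theta, theta_bar = np.zeros(d), np.zeros(d)
    batch_means, current, in_batch, n = [], np.zeros(d), 0, 0
    for x_k, y_k in stream:
        n += 1
        lr = lr0 * n ** (-a)                          # decaying step size
        theta = theta - lr * (x_k @ theta - y_k) * x_k
        theta_bar += (theta - theta_bar) / n          # ASGD average
        current += (theta - current) / (in_batch + 1) # running batch mean
        in_batch += 1
        if in_batch == batch:                         # close the batch
            batch_means.append(current.copy())
            current, in_batch = np.zeros(d), 0
    B = np.array(batch_means) - theta_bar
    cov_hat = batch * B.T @ B / len(batch_means)      # batch-means covariance
    return theta_bar, cov_hat
```

Confidence intervals then follow from the ASGD normality result, using the diagonal of `cov_hat` scaled by the sample size; both the average and the covariance estimate are updated recursively as new observations arrive.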