Abstract: We study the problem of a decision maker who must provide the best possible treatment recommendation based on an experiment. The desirability of the outcome distribution resulting from the policy recommendation is measured through a functional capturing the distributional characteristic that the decision maker is interested in optimizing. This could be, e.g., its inherent inequality, welfare, level of poverty or its distance to a desired outcome distribution. If the functional of interest is not quasi-convex or if there are constraints, the optimal recommendation may be a mixture of treatments. This vastly expands the set of recommendations that must be considered. We characterize the difficulty of the problem by obtaining maximal expected regret lower bounds. Furthermore, we propose two regret-optimal policies. The first policy is static and thus applicable irrespective of whether subjects arrive sequentially during the experimental phase. The second policy can exploit the sequential arrival of subjects by successively eliminating inferior treatments and thus spends the sampling effort where it is most needed.
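To fix ideas, the sketch below illustrates a generic successive-elimination experiment of the kind alluded to above: treatments are dropped once the empirical value of a distributional functional (here an illustrative inequality-penalized mean) falls too far below the current leader. The functional, the sampling scheme, the confidence radius and all names (`functional`, `successive_elimination`, `draw`) are hypothetical illustrations, not the paper's policy.

```python
# Hypothetical successive-elimination sketch (NOT the paper's policy):
# arms are eliminated once the empirical value of a distributional
# functional trails the best surviving arm by more than a confidence margin.
import numpy as np

rng = np.random.default_rng(0)

def functional(sample):
    # Illustrative target: mean penalized by a Gini-type inequality term.
    x = np.sort(sample)
    n = len(x)
    gini = (2 * np.arange(1, n + 1) - n - 1) @ x / (n * x.sum())
    return x.mean() * (1 - gini)

def successive_elimination(draw, n_arms, rounds, batch=50, delta=0.05):
    active = list(range(n_arms))
    samples = {a: [] for a in range(n_arms)}
    for t in range(1, rounds + 1):
        for a in active:                       # sample only surviving arms
            samples[a].extend(draw(a, batch))
        vals = {a: functional(np.array(samples[a])) for a in active}
        n = batch * t
        radius = np.sqrt(np.log(2 * n_arms * rounds / delta) / (2 * n))
        best = max(vals.values())
        active = [a for a in active if vals[a] >= best - 2 * radius]
    return max(active, key=lambda a: vals[a])

# Toy experiment: higher-shape gamma arms have higher mean and lower inequality.
draw = lambda a, m: rng.gamma(shape=2 + a, scale=1.0, size=m)
print(successive_elimination(draw, n_arms=3, rounds=20))
```

The point of the sketch is only the mechanism: sampling effort concentrates on the arms that have not yet been ruled out, mirroring the second policy described above.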
Abstract: We consider a multi-armed bandit problem with covariates. Given a realization of the covariate vector, instead of targeting the treatment with highest conditional expectation, the decision maker targets the treatment which maximizes a general functional of the conditional potential outcome distribution, e.g., a conditional quantile, trimmed mean, or a socio-economic functional such as an inequality, welfare or poverty measure. We develop expected regret lower bounds for this problem, and construct a near minimax optimal assignment policy.
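As a toy illustration of targeting a functional of the conditional outcome distribution, the sketch below assigns a covariate value to the arm with the highest k-nearest-neighbour estimate of a conditional quantile. The estimator, the neighbourhood size `k`, and the helper names are assumptions for illustration; this is not the near minimax optimal policy constructed in the paper.

```python
# Illustration only: assign a covariate value to the arm with the highest
# estimated conditional quantile, using a simple k-nearest-neighbour rule.
import numpy as np

rng = np.random.default_rng(1)

def knn_conditional_quantile(x0, X, Y, tau=0.5, k=50):
    idx = np.argsort(np.abs(X - x0))[:k]       # k nearest covariate values
    return np.quantile(Y[idx], tau)

def assign(x0, data, tau=0.5):
    # data[a] = (X_a, Y_a): covariates and outcomes observed under arm a
    scores = [knn_conditional_quantile(x0, X, Y, tau) for X, Y in data]
    return int(np.argmax(scores))

# Toy data: arm 1 is better for large covariate values in the median sense.
n = 2000
X0, X1 = rng.uniform(size=n), rng.uniform(size=n)
Y0 = 1.0 + rng.normal(size=n)
Y1 = 2 * X1 + rng.laplace(size=n)
print(assign(0.9, [(X0, Y0), (X1, Y1)], tau=0.5))   # likely prints 1
```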
Abstract: In treatment allocation problems the individuals to be treated often arrive sequentially. We study a problem in which the policy maker is not only interested in the expected cumulative welfare but is also concerned about the uncertainty/risk of the treatment outcomes. At the outset, the total number of treatment assignments to be made may even be unknown. We propose a sequential treatment policy which attains the minimax optimal regret. We also demonstrate that the expected number of suboptimal treatments grows only slowly in the total number of treatments. Finally, we study a setting where outcomes are only observed with delay.
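A rough illustration of risk-aware sequential assignment follows: a UCB-type index is penalized by each arm's empirical variance, so that arms with uncertain outcomes are assigned less often. This is a generic sketch under an assumed mean-variance trade-off, not the minimax optimal policy of the paper, and it ignores the unknown-horizon and delayed-outcome settings.

```python
# Generic risk-aware UCB sketch (illustrative only, not the paper's policy).
import numpy as np

rng = np.random.default_rng(2)

def risk_aware_ucb(draw, n_arms, horizon, risk_weight=0.5):
    counts = np.zeros(n_arms)
    sums = np.zeros(n_arms)
    sumsq = np.zeros(n_arms)
    for t in range(horizon):
        if t < n_arms:
            a = t                                    # play each arm once
        else:
            means = sums / counts
            variances = sumsq / counts - means ** 2  # empirical risk proxy
            bonus = np.sqrt(2 * np.log(t + 1) / counts)
            a = int(np.argmax(means - risk_weight * variances + bonus))
        y = draw(a)
        counts[a] += 1; sums[a] += y; sumsq[a] += y ** 2
    return sums.sum()                                # cumulative outcome

# Toy example: arm 0 has a slightly lower mean but much lower variance.
draw = lambda a: rng.normal([1.0, 1.1][a], [0.2, 2.0][a])
print(risk_aware_ucb(draw, n_arms=2, horizon=5000))
```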
Abstract: This paper establishes non-asymptotic oracle inequalities for the prediction error and estimation accuracy of the LASSO in stationary vector autoregressive models. These inequalities are used to establish consistency of the LASSO even when the number of parameters is of a much larger order of magnitude than the sample size. We also give conditions under which no relevant variables are excluded. Next, non-asymptotic probabilities are given for the Adaptive LASSO to select the correct sparsity pattern. We then give conditions under which the Adaptive LASSO reveals the correct sparsity pattern asymptotically. We establish that the estimates of the non-zero coefficients are asymptotically equivalent to the oracle-assisted least squares estimator. This is used to show that the rate of convergence of the estimates of the non-zero coefficients is identical to that of least squares including only the relevant covariates.
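For concreteness, the following sketch fits a sparse VAR(1) equation by equation with the LASSO. It assumes scikit-learn is available and uses an arbitrary fixed penalty level; the simulated design and the name `A_hat` are illustrative choices, not the tuning analyzed in the paper.

```python
# Sketch: equation-by-equation LASSO estimation of a sparse VAR(1).
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)

# Simulate a T x k stationary VAR(1) with a sparse coefficient matrix A.
T, k = 500, 20
A = np.zeros((k, k)); A[np.arange(k), np.arange(k)] = 0.4   # sparse: diagonal
y = np.zeros((T, k))
for t in range(1, T):
    y[t] = y[t - 1] @ A.T + rng.normal(scale=0.5, size=k)

X, Y = y[:-1], y[1:]                    # lagged regressors and responses
A_hat = np.vstack([
    Lasso(alpha=0.05, fit_intercept=False).fit(X, Y[:, i]).coef_
    for i in range(k)
])
print("nonzeros per equation:", (np.abs(A_hat) > 1e-8).sum(axis=1))
```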
Abstract: This paper considers penalized empirical loss minimization of convex loss functions with unknown non-linear target functions. Using the elastic net penalty, we establish a finite-sample oracle inequality which bounds the loss of our estimator from above with high probability. If the unknown target is linear, this inequality also provides an upper bound on the estimation error of the estimated parameter vector. These results are new and generalize the existing econometrics and statistics literature. Next, we use the non-asymptotic results to show that the excess loss of our estimator is asymptotically of the same order as that of the oracle. If the target is linear, we give sufficient conditions for consistency of the estimated parameter vector. We then briefly discuss how a thresholded version of our estimator can be used to perform consistent variable selection. Finally, we give two examples of loss functions covered by our framework, show how penalized nonparametric series estimation is contained as a special case, and provide a finite-sample upper bound on the mean square error of the elastic net series estimator.
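The snippet below sketches an elastic net series estimator of the kind mentioned in the final sentence: a cosine basis expansion of a scalar covariate is fit with scikit-learn's ElasticNet. The basis, the number of terms `K`, the penalty levels `alpha` and `l1_ratio`, and the target function are illustrative assumptions, not the choices studied in the paper.

```python
# Sketch of an elastic-net series estimator for an unknown nonlinear target.
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(4)

# Data from a nonlinear regression target f(x) = sin(2*pi*x).
n = 1000
x = rng.uniform(size=n)
y = np.sin(2 * np.pi * x) + 0.3 * rng.normal(size=n)

# Series (cosine) basis expansion of the scalar covariate.
K = 10
B = np.column_stack([np.cos(np.pi * j * x) for j in range(1, K + 1)])

# Elastic net combines an l1 penalty (sparsity) with an l2 penalty (stability).
fit = ElasticNet(alpha=0.01, l1_ratio=0.5, max_iter=50_000).fit(B, y)

print("selected terms:", int((np.abs(fit.coef_) > 1e-8).sum()),
      "in-sample R^2:", round(fit.score(B, y), 2))
```

Using an orthogonal-type basis keeps the design well conditioned, so the l2 part of the penalty mainly stabilizes the fit while the l1 part discards basis terms that contribute little.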