Abstract:The method of synthetic controls is widely used for evaluating causal effects of policy changes in settings with observational data. Often, researchers aim to estimate the causal impact of policy interventions on a treated unit at an aggregate level while also possessing data at a finer granularity. In this article, we introduce the new disco command, which implements the Distributional Synthetic Controls method introduced in Gunsilius (2023). This command allows researchers to construct entire synthetic distributions for the treated unit based on an optimally weighted average of the distributions of the control units. Several aggregation schemes are provided to facilitate clear reporting of the distributional effects of the treatment. The package offers both quantile-based and CDF-based approaches, comprehensive inference procedures via bootstrap and permutation methods, and visualization capabilities. We empirically illustrate the use of the package by replicating the results in Van Dijcke et al. (2024).
Abstract:Thresholds in treatment assignments can produce discontinuities in outcomes, revealing causal insights. In many contexts, like geographic settings, these thresholds are unknown and multivariate. We propose a non-parametric method to estimate the resulting discontinuities by segmenting the regression surface into smooth and discontinuous parts. This estimator uses a convex relaxation of the Mumford-Shah functional, for which we establish identification and convergence. Using our method, we estimate that an internet shutdown in India resulted in a reduction of economic activity by over 50%, greatly surpassing previous estimates and shedding new light on the true cost of such shutdowns for digital economies globally.
Abstract:We develop a notion of projections between sets of probability measures using the geometric properties of the 2-Wasserstein space. It is designed for general multivariate probability measures, is computationally efficient to implement, and provides a unique solution in regular settings. The idea is to work on regular tangent cones of the Wasserstein space using generalized geodesics. Its structure and computational properties make the method applicable in a variety of settings, from causal inference to the analysis of object data. An application to estimating causal effects yields a generalization of the notion of synthetic controls to multivariate data with individual-level heterogeneity, as well as a way to estimate optimal weights jointly over all time periods.
Abstract:Matching on covariates is a well-established framework for estimating causal effects in observational studies. The principal challenge in these settings stems from the often high-dimensional structure of the problem. Many methods have been introduced to deal with this challenge, with different advantages and drawbacks in computational and statistical performance and interpretability. Moreover, the methodological focus has been on matching two samples in binary treatment scenarios, but a dedicated method that can optimally balance samples across multiple treatments has so far been unavailable. This article introduces a natural optimal matching method based on entropy-regularized multimarginal optimal transport that possesses many useful properties to address these challenges. It provides interpretable weights of matched individuals that converge at the parametric rate to the optimal weights in the population, can be efficiently implemented via the classical iterative proportional fitting procedure, and can even match several treatment arms simultaneously. It also possesses demonstrably excellent finite sample properties.