School of Information Technology and Mathematical Sciences, University of South Australia
Abstract:Instrumental variable (IV) is a powerful approach to inferring the causal effect of a treatment on an outcome of interest from observational data even when there exist latent confounders between the treatment and the outcome. However, existing IV methods require that an IV is selected and justified with domain knowledge. An invalid IV may lead to biased estimates. Hence, discovering a valid IV is critical to the applications of IV methods. In this paper, we study and design a data-driven algorithm to discover valid IVs from data under mild assumptions. We develop the theory based on partial ancestral graphs (PAGs) to support the search for a set of candidate Ancestral IVs (AIVs), and for each possible AIV, the identification of its conditioning set. Based on the theory, we propose a data-driven algorithm to discover a pair of IVs from data. The experiments on synthetic and real-world datasets show that the developed IV discovery algorithm estimates accurate estimates of causal effects in comparison with the state-of-the-art IV based causal effect estimators.
Abstract:Causal effect estimation from observational data is a crucial but challenging task. Currently, only a limited number of data-driven causal effect estimation methods are available. These methods either only provide a bound estimation of the causal effect of a treatment on the outcome, or have impractical assumptions on the data or low efficiency although providing a unique estimation of the causal effect. In this paper, we identify a practical problem setting and propose an approach to achieving unique causal effect estimation from data with hidden variables under this setting. For the approach, we develop the theorems to support the discovery of the proper covariate sets for confounding adjustment (adjustment sets). Based on the theorems, two algorithms are presented for finding the proper adjustment sets from data with hidden variables to obtain unbiased and unique causal effect estimation. Experiments with benchmark Bayesian networks and real-world datasets have demonstrated the efficiency and effectiveness of the proposed algorithms, indicating the practicability of the identified problem setting and the potential of the approach in real-world applications.