Abstract:Learning the unknown causal parameters of a linear structural causal model is a fundamental task in causal analysis. The task, known as the problem of identification, asks to estimate the parameters of the model from a combination of assumptions on the graphical structure of the model and observational data, represented as a non-causal covariance matrix. In this paper, we give a new sound and complete algorithm for generic identification which runs in polynomial space. By standard simulation results, this algorithm has exponential running time which vastly improves the state-of-the-art double exponential time method using a Gr\"obner basis approach. The paper also presents evidence that parameter identification is computationally hard in general. In particular, we prove, that the task asking whether, for a given feasible correlation matrix, there are exactly one or two or more parameter sets explaining the observed matrix, is hard for $\forall R$, the co-class of the existential theory of the reals. In particular, this problem is $coNP$-hard. To our best knowledge, this is the first hardness result for some notion of identifiability.
Abstract:The framework of Pearl's Causal Hierarchy (PCH) formalizes three types of reasoning: observational, interventional, and counterfactual, that reflect the progressive sophistication of human thought regarding causation. We investigate the computational complexity aspects of reasoning in this framework focusing mainly on satisfiability problems expressed in probabilistic and causal languages across the PCH. That is, given a system of formulas in the standard probabilistic and causal languages, does there exist a model satisfying the formulas? The resulting complexity changes depending on the level of the hierarchy as well as the operators allowed in the formulas (addition, multiplication, or marginalization). We focus on formulas involving marginalization that are widely used in probabilistic and causal inference, but whose complexity issues are still little explored. Our main contribution are the exact computational complexity results showing that linear languages (allowing addition and marginalization) yield NP^PP-, PSPACE-, and NEXP-complete satisfiability problems, depending on the level of the PCH. Moreover, we prove that the problem for the full language (allowing additionally multiplication) is complete for the class succ$\exists$R for languages on the highest, counterfactual level. Previous work has shown that the satisfiability problem is complete for succ$\exists$R on the lower levels leaving the counterfactual case open. Finally, we consider constrained models that are restricted to a small polynomial size. The constraint on the size reduces the complexity of the interventional and counterfactual languages to NEXP-complete.
Abstract:Identifying and controlling bias is a key problem in empirical sciences. Causal diagram theory provides graphical criteria for deciding whether and how causal effects can be identified from observed (nonexperimental) data by covariate adjustment. Here we prove equivalences between existing as well as new criteria for adjustment and we provide a new simplified but still equivalent notion of d-separation. These lead to efficient algorithms for two important tasks in causal diagram analysis: (1) listing minimal covariate adjustments (with polynomial delay); and (2) identifying the subdiagram involved in biasing paths (in linear time). Our results improve upon existing exponential-time solutions for these problems, enabling users to assess the effects of covariate adjustment on diagrams with tens to hundreds of variables interactively in real time.