Abstract:In this paper, we investigate the identifiability of average controlled direct effects and average natural direct effects in causal systems represented by summary causal graphs, which are abstractions of full causal graphs, often used in dynamic systems where cycles and omitted temporal information complicate causal inference. Unlike in the traditional linear setting, where direct effects are typically easier to identify and estimate, non-parametric direct effects, which are crucial for handling real-world complexities, particularly in epidemiological contexts where relationships between variables (e.g, genetic, environmental, and behavioral factors) are often non-linear, are much harder to define and identify. In particular, we give sufficient conditions for identifying average controlled micro direct effect and average natural micro direct effect from summary causal graphs in the presence of hidden confounding. Furthermore, we show that the conditions given for the average controlled micro direct effect become also necessary in the setting where there is no hidden confounding and where we are only interested in identifiability by adjustment.
Abstract:Understanding causal relationships in dynamic systems is essential for numerous scientific fields, including epidemiology, economics, and biology. While causal inference methods have been extensively studied, they often rely on fully specified causal graphs, which may not always be available or practical in complex dynamic systems. Partially specified causal graphs, such as summary causal graphs (SCGs), provide a simplified representation of causal relationships, omitting temporal information and focusing on high-level causal structures. This simplification introduces new challenges concerning the types of queries of interest: macro queries, which involve relationships between clusters represented as vertices in the graph, and micro queries, which pertain to relationships between variables that are not directly visible through the vertices of the graph. In this paper, we first clearly distinguish between macro conditional independencies and micro conditional independencies and between macro total effects and micro total effects. Then, we demonstrate the soundness and completeness of the d-separation to identify macro conditional independencies in SCGs. Furthermore, we establish that the do-calculus is sound and complete for identifying macro total effects in SCGs. Conversely, we also show through various examples that these results do not hold when considering micro conditional independencies and micro total effects.
Abstract:Information technology (IT) systems are vital for modern businesses, handling data storage, communication, and process automation. Monitoring these systems is crucial for their proper functioning and efficiency, as it allows collecting extensive observational time series data for analysis. The interest in causal discovery is growing in IT monitoring systems as knowing causal relations between different components of the IT system helps in reducing downtime, enhancing system performance and identifying root causes of anomalies and incidents. It also allows proactive prediction of future issues through historical data analysis. Despite its potential benefits, applying causal discovery algorithms on IT monitoring data poses challenges, due to the complexity of the data. For instance, IT monitoring data often contains misaligned time series, sleeping time series, timestamp errors and missing values. This paper presents case studies on applying causal discovery algorithms to different IT monitoring datasets, highlighting benefits and ongoing challenges.
Abstract:Dynamic structural causal models (SCMs) are a powerful framework for reasoning in dynamic systems about direct effects which measure how a change in one variable affects another variable while holding all other variables constant. The causal relations in a dynamic structural causal model can be qualitatively represented with a full-time causal graph. Assuming linearity and causal sufficiency and given the full-time causal graph, the direct causal effect is always identifiable and can be estimated from data by adjusting on any set of variables given by the so-called single-door criterion. However, in many application such a graph is not available for various reasons but nevertheless experts have access to an abstraction of the full-time causal graph which represents causal relations between time series while omitting temporal information. This paper presents a complete identifiability result which characterizes all cases for which the direct effect is graphically identifiable from summary causal graphs and gives two sound finite adjustment sets that can be used to estimate the direct effect whenever it is identifiable.