Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Tim Pychynski

CausalMan: A physics-based simulator for large-scale causality

Feb 18, 2025

Nicholas Tagliapietra, Juergen Luettin, Lavdim Halilaj, Moritz Willig, Tim Pychynski, Kristian Kersting

Abstract:A comprehensive understanding of causality is critical for navigating and operating within today's complex real-world systems. The absence of realistic causal models with known data generating processes complicates fair benchmarking. In this paper, we present the CausalMan simulator, modeled after a real-world production line. The simulator features a diverse range of linear and non-linear mechanisms and challenging-to-predict behaviors, such as discrete mode changes. We demonstrate the inadequacy of many state-of-the-art approaches and analyze the significant differences in their performance and tractability, both in terms of runtime and memory complexity. As a contribution, we will release the CausalMan large-scale simulator. We present two derived datasets, and perform an extensive evaluation of both.

Via

Access Paper or Ask Questions

$\texttt{causalAssembly}$: Generating Realistic Production Data for Benchmarking Causal Discovery

Jun 19, 2023

Konstantin Göbler, Tobias Windisch, Tim Pychynski, Steffen Sonntag, Martin Roth, Mathias Drton

$Figure 1 for $\texttt{causalAssembly}$: Generating Realistic Production Data for Benchmarking Causal Discovery$

$Figure 2 for $\texttt{causalAssembly}$: Generating Realistic Production Data for Benchmarking Causal Discovery$

$Figure 3 for $\texttt{causalAssembly}$: Generating Realistic Production Data for Benchmarking Causal Discovery$

$Figure 4 for $\texttt{causalAssembly}$: Generating Realistic Production Data for Benchmarking Causal Discovery$

Abstract:Algorithms for causal discovery have recently undergone rapid advances and increasingly draw on flexible nonparametric methods to process complex data. With these advances comes a need for adequate empirical validation of the causal relationships learned by different algorithms. However, for most real data sources true causal relations remain unknown. This issue is further compounded by privacy concerns surrounding the release of suitable high-quality data. To help address these challenges, we gather a complex dataset comprising measurements from an assembly line in a manufacturing context. This line consists of numerous physical processes for which we are able to provide ground truth causal relationships on the basis of a detailed study of the underlying physics. We use the assembly line data and associated ground truth information to build a system for generation of semisynthetic manufacturing data that supports benchmarking of causal discovery methods. To accomplish this, we employ distributional random forests in order to flexibly estimate and represent conditional distributions that may be combined into joint distributions that strictly adhere to a causal model over the observed variables. The estimated conditionals and tools for data generation are made available in our Python library $\texttt{causalAssembly}$. Using the library, we showcase how to benchmark several well-known causal discovery algorithms.

Via

Access Paper or Ask Questions