Abstract:The dishonest casino is a well-known hidden Markov model (HMM) used in educational settings to introduce HMMs and graphical models. Here, a sequence of die rolls is observed, with the casino switching between a fair and a loaded die. Typically, the goal is to use the observed rolls to infer the pattern of fair and loaded dice, leading to filtering, smoothing, and Viterbi algorithms. This paper, however, explores how much of the winnings is attributable to the casino's cheating, a counterfactual question beyond the scope of HMM primitives. To address this, we introduce a structural causal model (SCM) consistent with the HMM and show that the expected winnings attributable to cheating (EWAC) can be bounded using linear programs (LPs). Through numerical experiments, we compute these bounds and develop intuition using benchmark SCMs based on independence, comonotonic, and counter-monotonic copulas. We show that tighter bounds are obtained with a time-homogeneity condition on the SCM, while looser bounds allow for an almost explicit LP solution. Domain-specific knowledge like pathwise monotonicity or counterfactual stability can be incorporated via linear constraints. Our work contributes to bounding counterfactuals in causal inference and is the first to develop LP bounds in a dynamic HMM setting, benefiting educational contexts where counterfactual inference is taught.
Abstract:We provide an explicit model of the causal mechanism in a structural causal model (SCM) with the goal of estimating counterfactual quantities of interest (CQIs). We propose some standard dependence structures, i.e. copulas, as base cases for the causal mechanism. While these base cases can be used to construct more interesting copulas, there are uncountably many copulas in general and so we formulate optimization problems for bounding the CQIs. As our ultimate goal is counterfactual reasoning in dynamic models which may have latent-states, we show by way of example that filtering / smoothing / sampling methods for these models can be integrated with our modeling of the causal mechanism. Specifically, we consider the "cheating-at-the-casino" application of a hidden Markov model and use linear programming (LP) to construct lower and upper bounds on the casino's winnings due to cheating. These bounds are considerably tighter when we constrain the copulas in the LPs to be time-independent. We can characterize the entire space of SCMs obeying counterfactual stability (CS), and we use it to negatively answer the open question of Oberst and Sontag [18] regarding the uniqueness of the Gumbel-max mechanism for modeling CS. Our work has applications in epidemiology and legal reasoning, and more generally in counterfactual off-policy evaluation, a topic of increasing interest in the reinforcement learning community.