Abstract: In online platforms, the impact of a treatment on an observed outcome may change over time as 1) users learn about the intervention, and 2) the system's personalization, such as individualized recommendations, changes. We introduce a non-parametric causal model of user actions in a personalized system. We show that the Cookie-Cookie-Day (CCD) experiment, designed to measure the user learning effect, is biased when there is personalization. We derive new experimental designs that intervene in the personalization system to generate the variation necessary to separately identify the causal effects mediated through user learning and through personalization. Making parametric assumptions allows for the estimation of long-term causal effects based on medium-term experiments. In simulations, we show that our new designs successfully recover the dynamic causal effects of interest.
Abstract: There is increasing interest in using observed individual-level data to formulate personalized policy. Examples include heterogeneous pricing, individualized credit offers, and targeted social programs. This paper provides a general model of how personalized policy creates incentives for individuals to modify their behavior to obtain a better treatment. For a given planner objective, we show that standard estimators based on repeated risk minimization produce a suboptimal policy. We propose a dynamic experiment that estimates the optimal treatment allocation function when agents are strategic and has regret that decays at a linear rate. A key insight is that random variation in how treatment assignment depends on observed characteristics is required, and that randomized treatment assignment alone is not sufficient to identify the optimal policy. In simulations and in a small MTurk experiment, we show that this experimental method outperforms alternative methods that do not learn strategic effects.
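The strategic-response idea in the second abstract can be illustrated with a toy sketch. This is not the paper's model or estimator. It assumes a hypothetical threshold rule (treat when the reported characteristic is at least `theta`), agents who can inflate their report by at most `cost_budget`, and a planner payoff of `u - 0.5` per treated agent. All names and numbers are illustrative assumptions. Under these assumptions, the threshold that is best when agents do not react is no longer best once they do, and the difference only becomes visible by varying `theta` itself.

```python
import numpy as np

def treated(u, theta, cost_budget):
    # Assumption: an agent with true type u is treated if u already meets the
    # threshold, or if the agent can reach it by inflating the report by at
    # most cost_budget. With cost_budget = 0 agents are non-strategic.
    return u >= theta - cost_budget

def planner_value(u, theta, cost_budget):
    # Assumption: the planner gains (u - 0.5) for every treated agent, so it
    # wants to treat exactly the agents whose true type exceeds 0.5.
    return np.mean((u - 0.5) * treated(u, theta, cost_budget))

u = np.linspace(0.0, 1.0, 1001)       # deterministic grid of true types
thetas = np.linspace(0.0, 1.0, 101)   # candidate thresholds
c = 0.2                               # hypothetical manipulation budget

# Planner value as a function of the threshold, without and with manipulation.
naive_values = [planner_value(u, t, 0.0) for t in thetas]
strategic_values = [planner_value(u, t, c) for t in thetas]

theta_naive = thetas[int(np.argmax(naive_values))]
theta_strategic = thetas[int(np.argmax(strategic_values))]

# Loss from deploying the naive threshold when agents are in fact strategic.
loss = planner_value(u, theta_strategic, c) - planner_value(u, theta_naive, c)
print(theta_naive, theta_strategic, loss)
```

In this toy setting, randomizing who is treated at a fixed `theta` would not reveal how the treated population shifts when `theta` moves. Comparing outcomes across different values of `theta` does, which is the kind of variation in the assignment rule that the abstract describes as necessary.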