Abstract:Online experimentation platforms collect user feedback at low cost and large scale. Some systems even support real-time or near real-time data processing, and can update metrics and statistics continuously. Many commonly used metrics, such as clicks and page views, can be observed without much delay. However, many important signals can only be observed after several hours or days, with noise adding up over the duration of the episode. When episodical outcomes follow a complex sequence of user-product interactions, it is difficult to understand which interactions lead to the final outcome. There is no obvious attribution logic for us to associate a positive or negative outcome back to the actions and choices we made at different times. This attribution logic is critical to unlocking more targeted and efficient measurement at a finer granularity that could eventually lead to the full capability of reinforcement learning. In this paper, we borrow the idea of Causal Surrogacy to model a long-term outcome using leading indicators that are incrementally observed and apply it as the value function to track the progress towards the final outcome and attribute incrementally to various user-product interaction steps. Applying this approach to the guest booking metric at Airbnb resulted in significant variance reductions of 50% to 85%, while aligning well with the booking metric itself. Continuous attribution allows us to assign a utility score to each product page-view, and this score can be flexibly further aggregated to a variety of units of interest, such as searches and listings. We provide multiple real-world applications of attribution to illustrate its versatility.