Abstract: In display advertising, predicting the conversion rate, that is, the probability that a user takes a predefined action on an advertiser's website, such as purchasing goods, is fundamental in estimating the value of displaying the advertisement. However, there is a relatively long time delay between a click and its resultant conversion. Because of this delayed feedback, some positive instances in the training period are labeled as negative, since their conversions have not yet occurred when the training data are gathered. As a result, the conditional label distributions differ between the training data and the production environment. This situation is referred to as a feedback shift. We address this problem by using an importance weight approach typically used for covariate shift correction. We prove its consistency for the feedback shift. Results in both offline and online experiments show that our proposed method outperforms the existing method.
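The importance weight idea mentioned in the abstract above can be illustrated with a small sketch. The abstract does not give the exact form of the weights, so the function below simply assumes that per-instance weights for the positive and negative terms are already available; the names `weighted_log_loss`, `pos_weights`, and `neg_weights` are hypothetical and chosen only for illustration.

```python
import math


def weighted_log_loss(labels, preds, pos_weights, neg_weights, eps=1e-12):
    """Average log loss with a separate importance weight per instance.

    labels      : observed labels (1 = conversion seen, 0 = not seen yet)
    preds       : predicted conversion probabilities
    pos_weights : weight applied when the observed label is 1 (assumed given)
    neg_weights : weight applied when the observed label is 0 (assumed given)
    """
    total = 0.0
    for y, p, w1, w0 in zip(labels, preds, pos_weights, neg_weights):
        # Clip the prediction so the logarithm stays finite.
        p = min(max(p, eps), 1.0 - eps)
        if y == 1:
            total += -w1 * math.log(p)
        else:
            total += -w0 * math.log(1.0 - p)
    return total / len(labels)


# With all weights equal to 1 this reduces to the ordinary log loss.
plain = weighted_log_loss([1, 0], [0.5, 0.5], [1.0, 1.0], [1.0, 1.0])
```

Setting every weight to 1 recovers the uncorrected loss, which makes it easy to compare a corrected and an uncorrected model on the same data.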
Abstract: In display advertising, predicting the conversion rate, that is, the probability that a user takes a predefined action on an advertiser's website, is fundamental in estimating the value of showing a user an advertisement. Delayed feedback creates two difficulties for conversion rate prediction. First, some positive labels are not correctly observed in the training data, because some conversions do not occur right after the ads are clicked. Second, the delay mechanism is not uniform among instances; some positive feedback is observed much more frequently than other positive feedback. It is widely acknowledged that these problems cause a severe bias in the naive empirical average loss function for conversion rate prediction. To overcome these challenges, we propose two unbiased estimators, one for the conversion rate prediction and the other for the bias estimation. Subsequently, we propose an interactive learning algorithm named Dual Learning Algorithm for Delayed Feedback (DLA-DF), in which a conversion rate predictor and a bias estimator are learned alternately. The proposed algorithm is the first of its kind to address the two major challenges in a theoretically principled way. Lastly, we conduct a simulation experiment to demonstrate that the proposed method outperforms the existing baselines and to validate that the unbiased estimation approach is suitable for the delayed feedback problem.
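To make the notion of an unbiased loss estimator under delayed feedback concrete, the sketch below uses one common inverse-propensity-weighted form. This is an assumption for illustration only: the abstract does not state the estimators used by DLA-DF, and the names `ipw_loss`, `propensity`, and `expected_loss_for_true_positive` are hypothetical. Here `propensity` stands for the probability that a true conversion has already been observed when the data are collected.

```python
import math


def ipw_loss(observed, propensity, pred, eps=1e-12):
    """Inverse-propensity-weighted log loss for a single instance.

    observed   : 1 if a conversion was observed, else 0
    propensity : assumed probability that a true conversion is observed
    pred       : predicted conversion probability
    """
    # Clip the prediction so the logarithms stay finite.
    p = min(max(pred, eps), 1.0 - eps)
    w = observed / propensity
    # The weight on the negative term is (1 - w); it is negative when a
    # conversion is observed and propensity < 1.
    return -(w * math.log(p) + (1.0 - w) * math.log(1.0 - p))


def expected_loss_for_true_positive(propensity, pred):
    """Exact expectation of ipw_loss over the observation mechanism
    for an instance whose true label is positive."""
    seen = ipw_loss(1, propensity, pred)
    unseen = ipw_loss(0, propensity, pred)
    return propensity * seen + (1.0 - propensity) * unseen
```

For a true positive, averaging over whether the conversion has been observed yet gives back the ordinary positive-label log loss, `-log(pred)`, regardless of the propensity value. That is the sense in which this form is unbiased, provided the propensity is correct.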