Ramazan Tarik Turksoy

The Effects of Data Split Strategies on the Offline Experiments for CTR Prediction

Jun 26, 2024

Ramazan Tarik Turksoy, Beyza Turkmen

Abstract: Click-through rate (CTR) prediction is a crucial task in online advertising to recommend products that users are likely to be interested in. To identify the best-performing models, rigorous model evaluation is necessary. Offline experimentation plays a significant role in selecting models for live user-item interactions, despite the value of online experimentation like A/B testing, which has its own limitations and risks. Often, the correlation between offline performance metrics and actual online model performance is inadequate. One main reason for this discrepancy is the common practice of using random splits to create training, validation, and test datasets in CTR prediction. In contrast, real-world CTR prediction follows a temporal order. Therefore, the methodology used in offline evaluation, particularly the data splitting strategy, is crucial. This study aims to address the inconsistency between current offline evaluation methods and real-world use cases, by focusing on data splitting strategies. To examine the impact of different data split strategies on offline performance, we conduct extensive experiments using both random and temporal splits on a large open benchmark dataset, Criteo.
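The difference between the two split strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' experimental code: the `ts` field, the 80/10/10 fractions, and the function names are assumptions made for illustration only.

```python
import random

def random_split(rows, train_frac=0.8, valid_frac=0.1, seed=0):
    """Shuffle rows, then cut into train/validation/test, ignoring time order."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_valid = int(len(shuffled) * valid_frac)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])

def temporal_split(rows, train_frac=0.8, valid_frac=0.1, time_key="ts"):
    """Sort rows by time, so validation and test are strictly later than training."""
    ordered = sorted(rows, key=lambda r: r[time_key])
    n_train = int(len(ordered) * train_frac)
    n_valid = int(len(ordered) * valid_frac)
    return (ordered[:n_train],
            ordered[n_train:n_train + n_valid],
            ordered[n_train + n_valid:])

# Hypothetical impression log: "ts" is an assumed timestamp field.
rows = [{"ts": t, "click": int(t % 5 == 0)} for t in range(100)]

r_train, r_valid, r_test = random_split(rows)
t_train, t_valid, t_test = temporal_split(rows)
```

With the random split, the training set can contain impressions that occurred after some test impressions; with the temporal split, every training impression precedes every validation and test impression, which mirrors how a deployed model only ever predicts on future traffic.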

Polyhedral Conic Classifier for CTR Prediction

Jun 06, 2024

Beyza Turkmen, Ramazan Tarik Turksoy, Hasan Saribas, Hakan Cevikalp

Abstract: This paper introduces a novel approach for click-through rate (CTR) prediction within industrial recommender systems, addressing the inherent challenges of numerical imbalance and geometric asymmetry. These challenges stem from imbalanced datasets, where positive (click) instances occur less frequently than negatives (non-clicks), and geometrically asymmetric distributions, where positive samples exhibit visually coherent patterns while negatives demonstrate greater diversity. To address these challenges, we use a deep neural network classifier built on polyhedral conic functions. This classifier is similar in spirit to one-class classifiers, and it returns compact polyhedral acceptance regions that separate the positive class samples from the negative samples, which have diverse distributions. Extensive experiments have been conducted to test the proposed approach using state-of-the-art (SOTA) CTR prediction models on four public datasets, namely Criteo, Avazu, MovieLens and Frappe. The experimental evaluations highlight the superiority of our proposed approach over Binary Cross Entropy (BCE) Loss, which is widely used in CTR prediction tasks.
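The abstract does not state the functional form it uses. As a rough illustration of why a polyhedral conic function yields a compact polyhedral acceptance region, the sketch below assumes the commonly cited form g(x) = w·(x − c) + γ‖x − c‖₁ − b, with a sample accepted as positive when g(x) < 0. The parameter values and function names are hypothetical, and this is not the paper's network or loss.

```python
def pcf(x, w, center, gamma, b):
    """Assumed polyhedral conic function: w.(x - c) + gamma * ||x - c||_1 - b."""
    diff = [xi - ci for xi, ci in zip(x, center)]
    linear = sum(wi * di for wi, di in zip(w, diff))
    l1_norm = sum(abs(d) for d in diff)
    return linear + gamma * l1_norm - b

def accepts(x, w, center, gamma, b):
    """A point falls in the positive (acceptance) region when the PCF is negative."""
    return pcf(x, w, center, gamma, b) < 0

# Hypothetical parameters: with w = 0, the acceptance region is the
# L1 ball of radius b / gamma around the center (a diamond in 2-D).
params = dict(w=[0.0, 0.0], center=[0.0, 0.0], gamma=1.0, b=1.0)

inside = accepts([0.2, 0.3], **params)   # near the center
outside = accepts([2.0, 0.0], **params)  # far from the center
```

In this toy setting, points close to the center are accepted and everything else is rejected regardless of direction, which matches the one-class flavor described in the abstract: positives are enclosed in a bounded region while negatives may lie anywhere outside it.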
