Recommendation systems often employ continuous training, which creates a self-feedback loop in which the system becomes increasingly biased toward its own previous recommendations. Recent studies have attempted to mitigate this bias by collecting small amounts of unbiased data. While these studies have successfully developed less biased models, they overlook the crucial fact that the recommendations a model generates become the training data for subsequent training sessions. To address this issue, we propose a framework that learns an unbiased estimator from a small amount of uniformly collected data and focuses on generating better training data for subsequent training iterations. To accomplish this, we view recommendation as a contextual multi-armed bandit problem and emphasize exploring items about which the model has limited knowledge. We also introduce a new offline sequential training schema that simulates the continuous training scenarios found in real-world recommendation systems, offering a more appropriate framework for studying self-feedback bias. Extensive experiments under the proposed training schema demonstrate that our model outperforms state-of-the-art debiasing methods.
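The abstract does not include an implementation, so the following is only a minimal sketch of the two ideas it names: a sequential training loop in which each round's recommendations become the next round's logged data, and bandit-style exploration of under-observed items supplemented by a small uniformly collected slice. Everything here is assumed for illustration; the simulated click model (`true_pref`), the UCB-style bonus in `recommend`, and all constants are hypothetical stand-ins, not the paper's actual estimator or schema.

```python
import numpy as np

rng = np.random.default_rng(0)

n_users, n_items, n_rounds = 200, 50, 5
# Hypothetical ground-truth click probabilities used only to simulate feedback.
true_pref = rng.uniform(size=(n_users, n_items))

# Running interaction statistics the "model" maintains per (user, item) pair.
clicks = np.zeros((n_users, n_items))
shows = np.zeros((n_users, n_items))

def recommend(alpha=1.0):
    """UCB-style scoring: estimated CTR plus an exploration bonus that is
    largest for items the model has shown, and hence understands, the least."""
    ctr = clicks / np.maximum(shows, 1)
    bonus = alpha * np.sqrt(np.log(shows.sum() + 2) / np.maximum(shows, 1))
    return np.argmax(ctr + bonus, axis=1)  # one recommended item per user

for t in range(n_rounds):
    # A small uniformly collected slice (unbiased) plus the model's own
    # recommendations (biased) together form this round's interaction log.
    uniform_users = rng.choice(n_users, size=n_users // 10, replace=False)
    items = recommend()
    items[uniform_users] = rng.integers(n_items, size=len(uniform_users))

    # Simulated user feedback; the logged interactions become the data
    # available to the next training round, closing the feedback loop.
    feedback = rng.uniform(size=n_users) < true_pref[np.arange(n_users), items]
    shows[np.arange(n_users), items] += 1
    clicks[np.arange(n_users), items] += feedback

    print(f"round {t}: mean CTR = {feedback.mean():.3f}")
```

Without the exploration bonus and the uniform slice, such a loop retrains only on items it already favors, which is the self-feedback bias the paper targets.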