Title: Deep Pareto Reinforcement Learning for Multi-Objective Recommender System

Jul 04, 2024

Pan Li, Alexander Tuzhilin

Abstract: Optimizing multiple objectives simultaneously is an important task for recommendation platforms that seek to improve performance on several fronts. This task is particularly challenging because the relationships between objectives are heterogeneous across consumers and fluctuate dynamically with context. In particular, when objectives conflict with each other, the recommendation results form a Pareto frontier, where an improvement on any objective comes at the cost of a performance decrease on another. Unfortunately, existing multi-objective recommender systems do not systematically consider such relationships; instead, they balance the objectives in a static and uniform manner, resulting in performance that is significantly worse than Pareto-optimal. In this paper, we propose a Deep Pareto Reinforcement Learning (DeepPRL) approach, in which we (1) comprehensively model the complex relationships between multiple objectives in recommendations; (2) effectively capture personalized and contextual consumer preferences toward each objective and update the recommendations accordingly; and (3) optimize both the short-term and the long-term performance of multi-objective recommendations. As a result, our method achieves significant Pareto dominance over state-of-the-art baselines in extensive offline experiments conducted on three real-world datasets. Furthermore, we conduct a large-scale online controlled experiment on Alibaba's video streaming platform, where our method simultaneously improves the three conflicting objectives of Click-Through Rate, Video View, and Dwell Time by 2%, 5%, and 7%, respectively, over the latest production system, demonstrating its tangible economic impact in industrial applications.
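To make the notions of Pareto dominance and the Pareto frontier used in the abstract concrete, here is a minimal Python sketch. It is not the DeepPRL method and is not taken from the paper: the function names `dominates` and `pareto_front` and the candidate scores are hypothetical, chosen purely for illustration, and all objectives are assumed to be "higher is better".

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (higher is better)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (click-through rate, video views, dwell time) scores for four
# candidate recommendation policies; the numbers are made up for illustration.
candidates = [
    (0.10, 5.0, 30.0),  # dominated by the fourth candidate
    (0.12, 4.0, 28.0),  # best click-through rate, so not dominated
    (0.09, 4.5, 25.0),  # dominated by the first and fourth candidates
    (0.11, 5.5, 31.0),  # best on views and dwell time, so not dominated
]

front = pareto_front(candidates)
print(front)
```

The two surviving candidates trade click-through rate against the other two objectives, which is the kind of conflict the abstract describes: moving along the frontier improves one objective only by giving up another.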
