Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Andrea Marchini

A Hybrid Meta-Learning and Multi-Armed Bandit Approach for Context-Specific Multi-Objective Recommendation Optimization

Sep 13, 2024

Tiago Cunha, Andrea Marchini

Abstract:Recommender systems in online marketplaces face the challenge of balancing multiple objectives to satisfy various stakeholders, including customers, providers, and the platform itself. This paper introduces Juggler-MAB, a hybrid approach that combines meta-learning with Multi-Armed Bandits (MAB) to address the limitations of existing multi-stakeholder recommendation systems. Our method extends the Juggler framework, which uses meta-learning to predict optimal weights for utility and compensation adjustments, by incorporating a MAB component for real-time, context-specific refinements. We present a two-stage approach where Juggler provides initial weight predictions, followed by MAB-based adjustments that adapt to rapid changes in user behavior and market conditions. Our system leverages contextual features such as device type and brand to make fine-grained weight adjustments based on specific segments. To evaluate our approach, we developed a simulation framework using a dataset of 0.6 million searches from Expedia's lodging booking platform. Results show that Juggler-MAB outperforms the original Juggler model across all metrics, with NDCG improvements of 2.9%, a 13.7% reduction in regret, and a 9.8% improvement in best arm selection rate.

Via

Access Paper or Ask Questions

AdaptEx: A Self-Service Contextual Bandit Platform

Aug 08, 2023

William Black, Ercument Ilhan, Andrea Marchini, Vilda Markeviciute

Abstract:This paper presents AdaptEx, a self-service contextual bandit platform widely used at Expedia Group, that leverages multi-armed bandit algorithms to personalize user experiences at scale. AdaptEx considers the unique context of each visitor to select the optimal variants and learns quickly from every interaction they make. It offers a powerful solution to improve user experiences while minimizing the costs and time associated with traditional testing methods. The platform unlocks the ability to iterate towards optimal product solutions quickly, even in ever-changing content and continuous "cold start" situations gracefully.

Via

Access Paper or Ask Questions