Sechan Oh

A Practical Method for Solving Contextual Bandit Problems Using Decision Trees

Oct 19, 2018

Adam N. Elmachtoub, Ryan McNellis, Sechan Oh, Marek Petrik

Abstract: Many efficient algorithms with strong theoretical guarantees have been proposed for the contextual multi-armed bandit problem. However, applying these algorithms in practice can be difficult because they require domain expertise to build appropriate features and to tune their parameters. We propose a new method for the contextual bandit problem that is simple, practical, and can be applied with little or no domain expertise. Our algorithm relies on decision trees to model the context-reward relationship. Decision trees are non-parametric, interpretable, and work well without hand-crafted features. To guide the exploration-exploitation trade-off, we use a bootstrapping approach which abstracts Thompson sampling to non-Bayesian settings. We also discuss several computational heuristics and demonstrate the performance of our method on several datasets.

* Proceedings of the 33rd Conference on Uncertainty in Artificial Intelligence (UAI 2017)
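
The bootstrapping idea described in the abstract above can be illustrated with a rough sketch. This is not the authors' implementation: it assumes a depth-1 regression tree (a stump) as a stand-in for a full decision tree, one fresh bootstrap resample per arm per round, and a made-up two-arm toy environment. All names (`fit_stump`, `choose_arm`, `simulate`) are hypothetical.

```python
import random


def fit_stump(data):
    """Fit a depth-1 regression tree to (context, reward) pairs.

    Contexts are equal-length lists of floats. Returns a predict(context)
    function. A stump is a simplification; the paper uses decision trees.
    """
    if not data:
        return lambda x: 0.0
    overall = sum(r for _, r in data) / len(data)
    best = None  # (sse, feature index, threshold, left mean, right mean)
    for j in range(len(data[0][0])):
        for t in sorted({x[j] for x, _ in data}):
            left = [r for x, r in data if x[j] <= t]
            right = [r for x, r in data if x[j] > t]
            if not left or not right:
                continue  # a split must put data on both sides
            lm = sum(left) / len(left)
            rm = sum(right) / len(right)
            sse = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:
        return lambda x: overall  # no valid split: predict the mean
    _, j, t, lm, rm = best
    return lambda x: lm if x[j] <= t else rm


def choose_arm(history, context, rng):
    """Pick an arm by fitting one stump per arm on a bootstrap resample.

    Resampling with replacement injects randomness into the reward
    estimates, which plays the role of posterior sampling in Thompson
    sampling without requiring a Bayesian model.
    """
    scores = []
    for arm_data in history:
        if not arm_data:
            scores.append(float("inf"))  # assumption: try unplayed arms first
            continue
        resample = [rng.choice(arm_data) for _ in arm_data]
        scores.append(fit_stump(resample)(context))
    top = max(scores)
    return rng.choice([a for a, s in enumerate(scores) if s == top])


def simulate(rounds=200, seed=0):
    """Run the sketch on a toy problem (assumed, not from the paper)."""
    rng = random.Random(seed)
    history = [[], []]  # per-arm list of (context, reward)
    pulls = [0, 0]
    total = 0.0
    for _ in range(rounds):
        context = [rng.random()]
        arm = choose_arm(history, context, rng)
        # Toy rule: arm 0 pays off mostly when x < 0.5, arm 1 otherwise.
        p = 0.9 if (context[0] < 0.5) == (arm == 0) else 0.1
        reward = 1.0 if rng.random() < p else 0.0
        history[arm].append((context, reward))
        pulls[arm] += 1
        total += reward
    return history, pulls, total
```

Calling `simulate()` returns the per-arm history, the pull counts, and the total reward. Refitting a model for every arm on every round is expensive, which is presumably one reason the abstract mentions computational heuristics; none of those heuristics are reproduced here.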


Building an Interpretable Recommender via Loss-Preserving Transformation

Jun 19, 2016

Amit Dhurandhar, Sechan Oh, Marek Petrik

Abstract: We propose a method for building an interpretable recommender system for personalizing online content and promotions. Historical data available for the system consists of customer features, provided content (promotions), and user responses. Unlike in a standard multi-class classification setting, misclassification costs depend on both recommended actions and customers. Our method transforms such a data set into a new data set that can be used with standard interpretable multi-class classification algorithms. The transformation has the desirable property that minimizing the standard misclassification penalty in this new space is equivalent to minimizing the custom cost function.

* Presented at 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), New York, NY
