Cindy Trinh

ENS Paris Saclay

Towards Optimal Algorithms for Multi-Player Bandits without Collision Sensing Information

Mar 24, 2021

Wei Huang, Richard Combes, Cindy Trinh

Abstract: We propose a novel algorithm for multi-player multi-armed bandits without collision sensing information. Our algorithm circumvents two problems shared by all state-of-the-art algorithms: it does not need a lower bound on the minimal expected reward of an arm as an input, and its performance does not scale inversely proportionally to the minimal expected reward. We prove a theoretical regret upper bound to justify these claims. We complement our theoretical results with numerical experiments, showing that the proposed algorithm also outperforms the state of the art in practice.

* 23 pages

A High Performance, Low Complexity Algorithm for Multi-Player Bandits Without Collision Sensing Information

Feb 19, 2021

Cindy Trinh, Richard Combes

Abstract: Motivated by applications in cognitive radio networks, we consider the decentralized multi-player multi-armed bandit problem, without collision or sensing information. We propose Randomized Selfish KL-UCB, an algorithm with very low computational complexity, inspired by the Selfish KL-UCB algorithm, which has been abandoned as it provably performs sub-optimally in some cases. We subject Randomized Selfish KL-UCB to extensive numerical experiments showing that it far outperforms state-of-the-art algorithms in almost all environments, sometimes by several orders of magnitude, and without the additional knowledge required by state-of-the-art algorithms. We also emphasize the potential of this algorithm for the more realistic dynamic setting, and support our claims with further experiments. We believe that the low complexity and high performance of Randomized Selfish KL-UCB make it the most suitable for implementation in practical systems amongst known algorithms.

* 14 pages

MLPerf Mobile Inference Benchmark: Why Mobile AI Benchmarking Is Hard and What to Do About It

Dec 03, 2020

Vijay Janapa Reddi, David Kanter, Peter Mattson, Jared Duke, Thai Nguyen, Ramesh Chukka, Kenneth Shiring, Koan-Sin Tan, Mark Charlebois, William Chou (+14 more)

Abstract: MLPerf Mobile is the first industry-standard open-source mobile benchmark developed by industry members and academic researchers to allow performance/accuracy evaluation of mobile devices with different AI chips and software stacks. The benchmark draws from the expertise of leading mobile-SoC vendors, ML-framework providers, and model producers. In this paper, we motivate the drive to demystify mobile-AI performance and present MLPerf Mobile's design considerations, architecture, and implementation. The benchmark comprises a suite of models that operate under standard models, data sets, quality metrics, and run rules. For the first iteration, we developed an app to provide an "out-of-the-box" inference-performance benchmark for computer vision and natural-language processing on mobile devices. MLPerf Mobile can serve as a framework for integrating future models, for customizing quality-target thresholds to evaluate system performance, for comparing software frameworks, and for assessing heterogeneous-hardware capabilities for machine learning, all fairly and faithfully with fully reproducible results.

Solving Bernoulli Rank-One Bandits with Unimodal Thompson Sampling

Dec 06, 2019

Cindy Trinh, Emilie Kaufmann, Claire Vernade, Richard Combes

Abstract: Stochastic Rank-One Bandits (Katariya et al. (2017a,b)) are a simple framework for regret minimization problems over rank-one matrices of arms. The initially proposed algorithms are proved to have logarithmic regret, but do not match the existing lower bound for this problem. We close this gap by first proving that rank-one bandits are a particular instance of unimodal bandits, and then providing a new analysis of Unimodal Thompson Sampling (UTS), initially proposed by Paladino et al. (2017). We prove an asymptotically optimal regret bound on the frequentist regret of UTS and we support our claims with simulations showing the significant improvement of our method compared to the state of the art.
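
As background for the Thompson Sampling family mentioned above, here is a minimal sketch of plain Thompson Sampling for Bernoulli arms with uniform Beta(1, 1) priors. It is not UTS itself: the unimodal variant additionally restricts which arms are sampled, and that restriction is not reproduced here. The function name `thompson_sampling`, the arm means, and the seed are illustrative assumptions.

```python
import random


def thompson_sampling(means, horizon, seed=0):
    """Run Bernoulli Thompson Sampling and return the pull count of each arm.

    `means` holds the true success probabilities, used only to simulate rewards.
    """
    rng = random.Random(seed)
    n_arms = len(means)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms
    for _ in range(horizon):
        # Draw one sample per arm from its Beta posterior (Beta(1, 1) prior).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        arm = max(range(n_arms), key=samples.__getitem__)
        reward = 1 if rng.random() < means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    print(thompson_sampling([0.2, 0.8], horizon=2000))
```

With well-separated means, the pull counts should concentrate on the better arm as the horizon grows, which is the behavior that logarithmic regret bounds formalize.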
