Mahsa Asadi

Collaborative Algorithms for Online Personalized Mean Estimation

Aug 24, 2022

Mahsa Asadi, Aurélien Bellet, Odalric-Ambrym Maillard, Marc Tommasi

Abstract: We consider an online estimation problem involving a set of agents. Each agent has access to a (personal) process that generates samples from a real-valued distribution and seeks to estimate its mean. We study the case where some of the distributions have the same mean, and the agents are allowed to actively query information from other agents. The goal is to design an algorithm that enables each agent to improve its mean estimate by communicating with other agents. The means, as well as the number of distributions sharing the same mean, are unknown, which makes the task nontrivial. We introduce a novel collaborative strategy to solve this online personalized mean estimation problem. We analyze its time complexity and introduce variants that enjoy good performance in numerical experiments. We also extend our approach to the setting where clusters of agents with similar means seek to estimate the mean of their cluster.
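
To make the problem setup in this abstract concrete, here is a minimal Python sketch. It is not the algorithm from the paper: the round-robin querying, the sub-Gaussian confidence radius, and the interval-overlap test used to decide which peers to pool with are all assumptions chosen for illustration, as are the names `confidence_radius` and `simulate`. Each agent draws Gaussian samples, queries one other agent per step for its running count and sum, and averages over the peers whose confidence intervals overlap its own.

```python
import math
import random


def confidence_radius(n, sigma=1.0, delta=0.01):
    """Sub-Gaussian confidence radius for the mean of n samples (assumed form)."""
    return sigma * math.sqrt(2.0 * math.log(2.0 / delta) / n)


def simulate(true_means, steps=2000, sigma=1.0, delta=0.01, seed=0):
    """Return (local, collaborative) mean estimates, one pair per agent.

    Hypothetical scheme: requires at least two agents.
    """
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k
    sums = [0.0] * k
    # snapshot[a][b] = (count, sum) that agent a last received from agent b
    snapshot = [[(0, 0.0)] * k for _ in range(k)]

    for t in range(steps):
        # Every agent observes one new sample from its own distribution.
        for a in range(k):
            sums[a] += rng.gauss(true_means[a], sigma)
            counts[a] += 1
        # Every agent refreshes its own statistics and queries one peer.
        for a in range(k):
            snapshot[a][a] = (counts[a], sums[a])
            b = (a + 1 + t % (k - 1)) % k  # round-robin over the other agents
            snapshot[a][b] = (counts[b], sums[b])

    local = [sums[a] / counts[a] for a in range(k)]
    collaborative = []
    for a in range(k):
        n_a, s_a = snapshot[a][a]
        mean_a = s_a / n_a
        rad_a = confidence_radius(n_a, sigma, delta)
        n_tot, s_tot = 0, 0.0
        for n_b, s_b in snapshot[a]:
            if n_b == 0:
                continue  # this peer was never queried
            rad_b = confidence_radius(n_b, sigma, delta)
            # Pool only peers whose confidence interval overlaps agent a's.
            if abs(mean_a - s_b / n_b) <= rad_a + rad_b:
                n_tot += n_b
                s_tot += s_b
        collaborative.append(s_tot / n_tot)
    return local, collaborative


if __name__ == "__main__":
    local, collab = simulate([0.0, 0.0, 0.0, 5.0])
    for a, (l, c) in enumerate(zip(local, collab)):
        print(f"agent {a}: local={l:.4f} collaborative={c:.4f}")
```

In this toy run, the three agents with mean 0 can pool their samples, while the agent with mean 5 should end up relying on its own samples only, since its interval is far from the others.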

Model-Based Reinforcement Learning Exploiting State-Action Equivalence

Oct 09, 2019

Mahsa Asadi, Mohammad Sadegh Talebi, Hippolyte Bourel, Odalric-Ambrym Maillard

Abstract: Leveraging an equivalence property in the state-space of a Markov Decision Process (MDP) has been investigated in several studies. This paper studies equivalence structure in the reinforcement learning (RL) setup, where transition distributions are no longer assumed to be known. We present a notion of similarity between transition probabilities of various state-action pairs of an MDP, which naturally defines an equivalence structure in the state-action space. We present equivalence-aware confidence sets for the case where the learner knows the underlying structure in advance. These sets are provably smaller than their corresponding equivalence-oblivious counterparts. In the more challenging case of an unknown equivalence structure, we present an algorithm called ApproxEquivalence that seeks to find an (approximate) equivalence structure, and define confidence sets using the approximate equivalence. To illustrate the efficacy of the presented confidence sets, we present C-UCRL, a natural modification of UCRL2 for RL in undiscounted MDPs. In the case of a known equivalence structure, we show that C-UCRL improves over UCRL2 in terms of regret by a factor of $\sqrt{SA/C}$ in any communicating MDP with $S$ states, $A$ actions, and $C$ classes, which corresponds to a massive improvement when $C \ll SA$. To the best of our knowledge, this is the first work providing regret bounds for RL when an equivalence structure in the MDP is efficiently exploited. In the case of an unknown equivalence structure, we show through numerical experiments that C-UCRL combined with ApproxEquivalence outperforms UCRL2 in ergodic MDPs.

* ACML 2019. Recipient of the Best Student Paper Award
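
As a quick illustration of the $\sqrt{SA/C}$ factor quoted in the abstract above, the following Python snippet evaluates it for a few problem sizes. The function name `regret_improvement_factor` and the example sizes are hypothetical, and the snippet only computes the stated ratio; it does not reproduce the regret bounds themselves.

```python
import math


def regret_improvement_factor(num_states, num_actions, num_classes):
    """Evaluate sqrt(S * A / C), the improvement factor stated in the abstract."""
    pairs = num_states * num_actions
    if not 1 <= num_classes <= pairs:
        raise ValueError("number of classes must be between 1 and S * A")
    return math.sqrt(pairs / num_classes)


if __name__ == "__main__":
    # Illustrative sizes only (not taken from the paper's experiments).
    for s, a, c in [(100, 4, 400), (100, 4, 10), (100, 4, 1)]:
        factor = regret_improvement_factor(s, a, c)
        print(f"S={s}, A={a}, C={c}: factor = {factor:.2f}")
```

When every state-action pair is its own class ($C = SA$) the factor is 1, so there is no gain; with $S = 100$, $A = 4$, and $C = 10$ it is $\sqrt{40} \approx 6.32$.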
