Brieuc Pinon

Theoretical Barriers in Bellman-Based Reinforcement Learning

Feb 17, 2025

Brieuc Pinon, Raphaël Jungers, Jean-Charles Delvenne

Abstract: Reinforcement Learning algorithms designed for high-dimensional spaces often enforce the Bellman equation on a sampled subset of states, relying on generalization to propagate knowledge across the state space. In this paper, we identify and formalize a fundamental limitation of this common approach. Specifically, we construct counterexample problems with a simple structure that this approach fails to exploit. Our findings reveal that such algorithms can neglect critical information about the problems, leading to inefficiencies. Furthermore, we extend this negative result to another approach from the literature: Hindsight Experience Replay learning state-to-state reachability.
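As a rough illustration of what "enforcing the Bellman equation on a sampled subset of states" can mean, here is a minimal Python sketch. It is not the construction from the paper: the chain environment, the function names, and the choice of sampled states are all assumptions made for this example. It uses a tabular value function with no generalization at all, so the Bellman equation ends up satisfied on the sampled states while remaining violated elsewhere.

```python
def bellman_backup(values, state, n_states, gamma):
    """One Bellman optimality backup on a hypothetical deterministic chain."""
    if state == n_states - 1:  # terminal goal state keeps value 0
        return 0.0
    best = float("-inf")
    for step in (-1, +1):  # two actions: move left or move right
        nxt = min(max(state + step, 0), n_states - 1)
        reward = 1.0 if nxt == n_states - 1 else 0.0  # reward on reaching the goal
        best = max(best, reward + gamma * values[nxt])
    return best


def sampled_value_iteration(n_states, sampled_states, gamma=0.9, sweeps=200):
    """Apply Bellman backups only at the sampled states (tabular, no generalization)."""
    values = [0.0] * n_states
    for _ in range(sweeps):
        for s in sampled_states:
            values[s] = bellman_backup(values, s, n_states, gamma)
    return values


def max_bellman_residual(values, states, n_states, gamma=0.9):
    """Largest violation of the Bellman equation over the given states."""
    return max(
        abs(values[s] - bellman_backup(values, s, n_states, gamma)) for s in states
    )


n = 10
sampled = [0, 2, 4, 6, 8]  # assumed subset, fixed here for reproducibility
values = sampled_value_iteration(n, sampled)
residual_sampled = max_bellman_residual(values, sampled, n)
residual_all = max_bellman_residual(values, range(n), n)
print("residual on sampled states:", residual_sampled)
print("residual on all states:", residual_all)
```

In this toy setting the residual vanishes on the sampled states but not on the full state space, which is why practical methods rely on a function approximator to generalize between states.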

Efficiency Separation between RL Methods: Model-Free, Model-Based and Goal-Conditioned

Sep 28, 2023

Brieuc Pinon, Raphaël Jungers, Jean-Charles Delvenne

Abstract: We prove a fundamental limitation on the efficiency of a wide class of Reinforcement Learning (RL) algorithms. This limitation applies to model-free RL methods as well as a broad range of model-based methods, such as planning with tree search. Under an abstract definition of this class, we provide a family of RL problems for which the number of environment interactions these methods need to find an optimal behavior is lower-bounded by an exponential in the horizon. However, there exists a method, not tailored to this specific family of problems, which can efficiently solve the problems in the family. In contrast, our limitation does not apply to several types of methods proposed in the literature, for instance, goal-conditioned methods or other algorithms that construct an inverse dynamics model.

A model-based approach to meta-Reinforcement Learning: Transformers and tree search

Aug 24, 2022

Brieuc Pinon, Jean-Charles Delvenne, Raphaël Jungers

Abstract: Meta-learning is a line of research that develops the ability to leverage past experiences to efficiently solve new learning problems. Meta-Reinforcement Learning (meta-RL) methods demonstrate a capability to learn behaviors that efficiently acquire and exploit information in several meta-RL problems. In this context, the Alchemy benchmark has been proposed by Wang et al. [2021]. Alchemy features a rich structured latent space that is challenging for state-of-the-art model-free RL methods. These methods fail to learn to properly explore then exploit. We develop a model-based algorithm. We train a model whose principal block is a Transformer Encoder to fit the symbolic Alchemy environment dynamics. Then we define an online planner with the learned model using a tree search method. This algorithm significantly outperforms previously applied model-free RL methods on the symbolic Alchemy problem. Our results reveal the relevance of model-based approaches with online planning to perform exploration and exploitation successfully in meta-RL. Moreover, we show the efficiency of the Transformer architecture to learn complex dynamics that arise from latent spaces present in meta-RL problems.

PAC-learning gains of Turing machines over circuits and neural networks

Mar 23, 2021

Brieuc Pinon, Jean-Charles Delvenne, Raphaël Jungers

Abstract: A caveat to many applications of the current Deep Learning approach is the need for large-scale data. One improvement suggested by Kolmogorov Complexity results is to apply the minimum description length (MDL) principle with computationally universal models. We study the potential gains in sample efficiency that this approach can bring in principle. We use polynomial-time Turing machines to represent computationally universal models and Boolean circuits to represent Artificial Neural Networks (ANNs) acting on finite-precision digits. Our analysis unravels direct links between our question and Computational Complexity results. We provide lower and upper bounds on the potential gains in sample efficiency from applying MDL with Turing machines instead of ANNs. Our bounds depend on the bit-size of the input of the Boolean function to be learned. Furthermore, we highlight close relationships between classical open problems in Circuit Complexity and the tightness of these bounds.
