Title: Truncating Trajectories in Monte Carlo Reinforcement Learning

May 07, 2023

Riccardo Poiani, Alberto Maria Metelli, Marcello Restelli

Abstract: In Reinforcement Learning (RL), an agent acts in an unknown environment to maximize the expected cumulative discounted sum of an external reward signal, i.e., the expected return. In practice, in many tasks of interest, such as policy optimization, the agent usually spends its interaction budget by collecting episodes of fixed length within a simulator (i.e., Monte Carlo simulation). However, given the discounted nature of the RL objective, this data collection strategy might not be the best option. Indeed, the rewards collected in early simulation steps weigh exponentially more than later rewards. Taking a cue from this intuition, in this paper, we design an a-priori budget allocation strategy that leads to the collection of trajectories of different lengths, i.e., truncated. The proposed approach provably minimizes the width of the confidence intervals around the empirical estimates of the expected return of a policy. After discussing the theoretical properties of our method, we make use of our trajectory truncation mechanism to extend the Policy Optimization via Importance Sampling (POIS, Metelli et al., 2018) algorithm. Finally, we conduct a numerical comparison between our algorithm and POIS: the results are consistent with our theory and show that an appropriate truncation of the trajectories can succeed in improving performance.
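The intuition in the abstract can be illustrated with a small Python sketch. It is not the allocation strategy derived in the paper: the function names, the discount factor, and the hand-picked "mixed" schedule are all assumptions made for illustration. The sketch shows how the discount weight decays with the time step, and how the same interaction budget yields more samples at early steps when some trajectories are truncated.

```python
def discount_weights(gamma, horizon):
    """Weight gamma**t that the discounted return places on the reward at step t."""
    return [gamma ** t for t in range(horizon)]


def samples_per_step(lengths, horizon):
    """Number of trajectories that reach each step t, given their lengths."""
    return [sum(1 for n in lengths if n > t) for t in range(horizon)]


def truncated_return_estimate(trajectories, gamma):
    """Estimate the expected discounted return from reward sequences of
    different lengths (an illustrative estimator, assumed here).

    For each step t, average the reward over the trajectories that reach
    step t, then weight that average by gamma**t.
    """
    max_len = max(len(rewards) for rewards in trajectories)
    estimate = 0.0
    for t in range(max_len):
        rewards_at_t = [r[t] for r in trajectories if len(r) > t]
        estimate += gamma ** t * sum(rewards_at_t) / len(rewards_at_t)
    return estimate


if __name__ == "__main__":
    gamma, horizon = 0.9, 10

    # Two ways to spend the same budget of 40 environment steps.
    uniform = [10, 10, 10, 10]        # fixed-length episodes
    mixed = [10, 10, 5, 5, 5, 5]      # hypothetical hand-picked schedule

    print("discount weights:", [round(w, 3) for w in discount_weights(gamma, horizon)])
    print("uniform samples per step:", samples_per_step(uniform, horizon))
    print("mixed samples per step:  ", samples_per_step(mixed, horizon))
```

With the mixed schedule, the first five steps, which carry the largest discount weights, are each observed six times instead of four, at the cost of fewer samples for the later, lightly weighted steps.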
