Mohammadreza Nakhaei

Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning

Jun 12, 2024

Mohammadreza Nakhaei, Aidan Scannell, Joni Pajarinen

Abstract: Offline reinforcement learning (RL) allows learning sequential behavior from fixed datasets. Since offline datasets do not cover all possible situations, many methods collect additional data during online fine-tuning to improve performance. In general, these methods assume that the transition dynamics remain the same during both the offline and online phases of training. However, in many real-world applications, such as outdoor construction and navigation over rough terrain, it is common for the transition dynamics to vary between the offline and online phases. Moreover, the dynamics may vary during online fine-tuning. To address this problem of changing dynamics from offline to online RL, we propose a residual learning approach that infers dynamics changes to correct the outputs of the offline solution. In the online fine-tuning phase, we train a context encoder to learn a representation that is consistent within the current online learning environment while being able to predict dynamic transitions. Experiments in D4RL MuJoCo environments, modified to support dynamics changes upon environment resets, show that our approach can adapt to these dynamics changes and generalize to unseen perturbations in a sample-efficient way, whereas comparison methods cannot.

* 10 pages, 5 figures, 1 table. Accepted at L4DC 2024
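
As a rough illustration of the residual idea in the abstract above, the sketch below adds a context-conditioned correction to the output of a frozen offline policy. It is not the authors' implementation: `offline_policy`, `encode_context`, `residual_policy`, `act`, the linear residual head, and the use of averaged state differences as a context are all hypothetical stand-ins chosen to keep the example small.

```python
import math

def offline_policy(state):
    """Hypothetical frozen policy learned from the offline dataset."""
    return [math.tanh(state[0]), math.tanh(state[1])]

def encode_context(transitions):
    """Hypothetical context encoder: mean of (next_state - state)
    over recent online transitions (state, action, next_state)."""
    n = len(transitions)
    dim = len(transitions[0][0])
    return [
        sum(s_next[i] - s[i] for s, _a, s_next in transitions) / n
        for i in range(dim)
    ]

def residual_policy(state, context, weights):
    """Hypothetical linear residual head over [state, context]."""
    features = list(state) + list(context)
    return [sum(w * x for w, x in zip(row, features)) for row in weights]

def act(state, transitions, weights, low=-1.0, high=1.0):
    """Offline action plus a context-dependent correction, clipped to bounds."""
    context = encode_context(transitions)
    base = offline_policy(state)
    correction = residual_policy(state, context, weights)
    return [min(high, max(low, b + c)) for b, c in zip(base, correction)]
```

With all residual weights at zero, `act` returns the offline policy's action unchanged, so under this assumed design online fine-tuning starts from the offline solution and only learns the correction.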

iQRL -- Implicitly Quantized Representations for Sample-efficient Reinforcement Learning

Jun 04, 2024

Aidan Scannell, Kalle Kujanpää, Yi Zhao, Mohammadreza Nakhaei, Arno Solin, Joni Pajarinen

Abstract: Learning representations for reinforcement learning (RL) has shown much promise for continuous control. We propose an efficient representation learning method using only a self-supervised latent-state consistency loss. Our approach employs an encoder and a dynamics model to map observations to latent states and predict future latent states, respectively. We achieve high performance and prevent representation collapse by quantizing the latent representation such that the rank of the representation is empirically preserved. Our method, named iQRL: implicitly Quantized Reinforcement Learning, is straightforward, compatible with any model-free RL algorithm, and demonstrates excellent performance by outperforming other recently proposed representation learning methods in continuous control benchmarks from the DeepMind Control Suite.

* 9 pages, 11 figures
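
The encoder, dynamics model, and quantized latent-consistency objective described in the abstract above can be sketched roughly as follows. The abstract does not specify the quantization scheme or the exact loss, so this example assumes a tanh bound followed by rounding to a few evenly spaced levels, and a cosine-distance loss. The linear maps `W`, `A`, `B` and all function names are hypothetical.

```python
import math

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def quantize(z, levels=5):
    """Assumed scheme: squash each dimension into (-1, 1) with tanh,
    then round to one of `levels` evenly spaced values in [-1, 1]."""
    half = (levels - 1) / 2
    return [round(math.tanh(v) * half) / half for v in z]

def predict_next(z, action, A, B):
    """Hypothetical linear latent dynamics model: A z + B a."""
    return [p + q for p, q in zip(matvec(A, z), matvec(B, action))]

def consistency_loss(pred, target, eps=1e-8):
    """Assumed loss: 1 - cosine similarity between two latent vectors."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm_p = math.sqrt(sum(p * p for p in pred))
    norm_t = math.sqrt(sum(t * t for t in target))
    return 1.0 - dot / (norm_p * norm_t + eps)

def latent_consistency(obs, action, next_obs, W, A, B):
    """Encode, predict the next latent, and compare to the encoded next observation."""
    z = quantize(matvec(W, obs))
    z_pred = quantize(predict_next(z, action, A, B))
    z_target = quantize(matvec(W, next_obs))
    return consistency_loss(z_pred, z_target)
```

Because every latent dimension can only take one of `levels` values in this sketch, the encoder cannot drift toward arbitrarily small outputs, which is one intuition for why quantization might help against collapse. A trainable version would also need a gradient estimator for the rounding step, which is omitted here.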
