Soochan Lee

Distilling Reinforcement Learning Algorithms for In-Context Model-Based Planning

Feb 26, 2025

Jaehyeon Son, Soochan Lee, Gunhee Kim

Abstract: Recent studies have shown that Transformers can perform in-context reinforcement learning (RL) by imitating existing RL algorithms, enabling sample-efficient adaptation to unseen tasks without parameter updates. However, these models also inherit the suboptimal behaviors of the RL algorithms they imitate. This issue primarily arises due to the gradual update rule employed by those algorithms. Model-based planning offers a promising solution to this limitation by allowing the models to simulate potential outcomes before taking action, providing an additional mechanism to deviate from the suboptimal behavior. Rather than learning a separate dynamics model, we propose Distillation for In-Context Planning (DICP), an in-context model-based RL framework where Transformers simultaneously learn environment dynamics and improve policy in-context. We evaluate DICP across a range of discrete and continuous environments, including Darkroom variants and Meta-World. Our results show that DICP achieves state-of-the-art performance while requiring significantly fewer environment interactions than baselines, which include both model-free counterparts and existing meta-RL methods.

* ICLR 2025

Learning to Continually Learn with the Bayesian Principle

May 29, 2024

Soochan Lee, Hyeonseong Jeon, Jaehyeon Son, Gunhee Kim

Abstract: In the present era of deep learning, continual learning research is mainly focused on mitigating forgetting when training a neural network with stochastic gradient descent on a non-stationary stream of data. On the other hand, in the more classical literature of statistical machine learning, many models have sequential Bayesian update rules that yield the same learning outcome as batch training, i.e., they are completely immune to catastrophic forgetting. However, they are often too simple to model complex real-world data. In this work, we adopt the meta-learning paradigm to combine the strong representational power of neural networks and simple statistical models' robustness to forgetting. In our novel meta-continual learning framework, continual learning takes place only in statistical models via ideal sequential Bayesian update rules, while neural networks are meta-learned to bridge the raw data and the statistical models. Since the neural networks remain fixed during continual learning, they are protected from catastrophic forgetting. This approach not only achieves significantly improved performance but also exhibits excellent scalability. Since our approach is domain-agnostic and model-agnostic, it can be applied to a wide range of problems and easily integrated with existing model architectures.

* ICML 2024
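
The abstract's claim that a sequential Bayesian update can reproduce the batch result can be illustrated with a textbook conjugate model. The sketch below is not the paper's method: it assumes a Gaussian likelihood with known noise variance and a Gaussian prior over the unknown mean, and all function names and parameter values are hypothetical choices made for illustration.

```python
import math
import random


def sequential_posterior(xs, prior_mean=0.0, prior_var=1.0, noise_var=1.0):
    """Update a Gaussian posterior over an unknown mean one observation at a time."""
    mean, var = prior_mean, prior_var
    for x in xs:
        # Conjugate update: precisions add, and the mean is precision-weighted.
        post_precision = 1.0 / var + 1.0 / noise_var
        mean = (mean / var + x / noise_var) / post_precision
        var = 1.0 / post_precision
    return mean, var


def batch_posterior(xs, prior_mean=0.0, prior_var=1.0, noise_var=1.0):
    """Compute the same posterior in closed form from the whole dataset at once."""
    n = len(xs)
    post_precision = 1.0 / prior_var + n / noise_var
    mean = (prior_mean / prior_var + sum(xs) / noise_var) / post_precision
    return mean, 1.0 / post_precision


# A non-stationary stream (assumed for illustration): the data source shifts halfway.
rng = random.Random(0)
stream = [rng.gauss(-2.0, 1.0) for _ in range(50)]
stream += [rng.gauss(3.0, 1.0) for _ in range(50)]

seq_mean, seq_var = sequential_posterior(stream)
bat_mean, bat_var = batch_posterior(stream)

# The two posteriors agree up to floating-point error, regardless of stream order.
print(math.isclose(seq_mean, bat_mean, abs_tol=1e-9))
print(math.isclose(seq_var, bat_var, abs_tol=1e-9))
```

Because the posterior depends on the data only through sufficient statistics (here the count and the sum), the order in which observations arrive does not affect the result. This is the sense in which such a model does not forget earlier data.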

When Meta-Learning Meets Online and Continual Learning: A Survey

Nov 09, 2023

Jaehyeon Son, Soochan Lee, Gunhee Kim

Abstract: Over the past decade, deep neural networks have demonstrated significant success using the training scheme that involves mini-batch stochastic gradient descent on extensive datasets. Expanding upon this accomplishment, there has been a surge in research exploring the application of neural networks in other learning scenarios. One notable framework that has garnered significant attention is meta-learning. Often described as "learning to learn," meta-learning is a data-driven approach to optimize the learning algorithm. Other branches of interest are continual learning and online learning, both of which involve incrementally updating a model with streaming data. While these frameworks were initially developed independently, recent works have started investigating their combinations, proposing novel problem settings and learning algorithms. However, due to the elevated complexity and lack of unified terminology, discerning differences between the learning frameworks can be challenging even for experienced researchers. To facilitate a clear understanding, this paper provides a comprehensive survey that organizes various problem settings using consistent terminology and formal descriptions. By offering an overview of these learning paradigms, our work aims to foster further advancements in this promising area of research.

Recasting Continual Learning as Sequence Modeling

Oct 18, 2023

Soochan Lee, Jaehyeon Son, Gunhee Kim

Abstract: In this work, we aim to establish a strong connection between two significant bodies of machine learning research: continual learning and sequence modeling. That is, we propose to formulate continual learning as a sequence modeling problem, allowing advanced sequence models to be utilized for continual learning. Under this formulation, the continual learning process becomes the forward pass of a sequence model. By adopting the meta-continual learning (MCL) framework, we can train the sequence model at the meta-level, on multiple continual learning episodes. As a specific example of our new formulation, we demonstrate the application of Transformers and their efficient variants as MCL methods. Our experiments on seven benchmarks, covering both classification and regression, show that sequence models can be an attractive solution for general MCL.

* NeurIPS 2023

Recursion of Thought: A Divide-and-Conquer Approach to Multi-Context Reasoning with Language Models

Jun 12, 2023

Soochan Lee, Gunhee Kim

Abstract: Generating intermediate steps, or Chain of Thought (CoT), is an effective way to significantly improve language models' (LM) multi-step reasoning capability. However, the CoT lengths can grow rapidly with the problem complexity, easily exceeding the maximum context size. Instead of increasing the context limit, which has already been heavily investigated, we explore an orthogonal direction: making LMs divide a problem into multiple contexts. We propose a new inference framework, called Recursion of Thought (RoT), which introduces several special tokens that the models can output to trigger context-related operations. Extensive experiments with multiple architectures including GPT-3 show that RoT dramatically improves LMs' inference capability, enabling them to solve problems whose solutions consist of hundreds of thousands of tokens.

* ACL 2023 (short, findings)

A Neural Dirichlet Process Mixture Model for Task-Free Continual Learning

Jan 14, 2020

Soochan Lee, Junsoo Ha, Dongsu Zhang, Gunhee Kim

Abstract: Despite the growing interest in continual learning, most contemporary work has studied a rather restricted setting where tasks are clearly distinguishable and task boundaries are known during training. However, if our goal is to develop an algorithm that learns as humans do, this setting is far from realistic, and it is essential to develop a methodology that works in a task-free manner. Meanwhile, among several branches of continual learning, expansion-based methods have the advantage of eliminating catastrophic forgetting by allocating new resources to learn new data. In this work, we propose an expansion-based approach for task-free continual learning. Our model, named Continual Neural Dirichlet Process Mixture (CN-DPM), consists of a set of neural network experts, each in charge of a subset of the data. CN-DPM expands the number of experts in a principled way under the Bayesian nonparametric framework. With extensive experiments, we show that our model successfully performs task-free continual learning for both discriminative and generative tasks such as image classification and image generation.

* Accepted as a conference paper at ICLR 2020

Harmonizing Maximum Likelihood with GANs for Multimodal Conditional Generation

Feb 25, 2019

Soochan Lee, Junsoo Ha, Gunhee Kim

Abstract: Recent advances in conditional image generation tasks, such as image-to-image translation and image inpainting, are largely attributed to the success of conditional GAN models, which are often optimized by the joint use of the GAN loss with the reconstruction loss. However, we reveal that this training recipe shared by almost all existing methods causes one critical side effect: lack of diversity in output samples. To achieve both training stability and multimodal output generation, we propose novel training schemes with a new set of losses named moment reconstruction losses that simply replace the reconstruction loss. We show that our approach is applicable to any conditional generation task by performing thorough experiments on image-to-image translation, super-resolution and image inpainting using the Cityscapes and CelebA datasets. Quantitative evaluations also confirm that our methods achieve a great diversity in outputs while retaining or even improving the visual fidelity of generated samples.

* Accepted as a conference paper at ICLR 2019