Abstract: Understanding cognitive processes in multi-agent interactions is a primary goal of cognitive science. It can also guide artificial intelligence (AI) research toward social decision-making in multi-agent systems, which involves uncertainty arising from character heterogeneity. In this paper, we introduce an episodic future thinking (EFT) mechanism for a reinforcement learning (RL) agent, inspired by cognitive processes observed in animals. To enable future thinking, we first develop a multi-character policy that captures diverse characters with an ensemble of heterogeneous policies. Here, the character of an agent is defined as a distinct weighting of reward components, representing a particular behavioral preference. The future thinking agent collects observation-action trajectories of target agents and uses the pre-trained multi-character policy to infer their characters. Once a character is inferred, the agent predicts the upcoming actions of the target agents and simulates the resulting future scenario. This capability allows the agent to adaptively select the optimal action in multi-agent interactions, taking the predicted future scenario into account. To evaluate the proposed mechanism, we consider a multi-agent autonomous driving scenario with diverse driving traits, as well as multiple particle environments. Simulation results demonstrate that the EFT mechanism with accurate character inference yields higher rewards than existing multi-agent solutions. We also confirm that this reward improvement holds across societies with different levels of character diversity.
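To make the mechanism concrete, below is a minimal Python sketch of the EFT loop under simplifying assumptions: the multi-character policy is a small ensemble of softmax policies over a discrete action set, and the dynamics and reward used in the rollout are stand-ins. All names (infer_character, simulate, and so on) are illustrative, not the paper's actual implementation.

```python
# Hypothetical sketch of the EFT loop: infer a target agent's character from
# its observation-action trajectory, then evaluate candidate ego actions by
# simulating short future scenarios. Dynamics/reward are stand-ins.
import numpy as np

N_CHARACTERS, N_ACTIONS, HORIZON = 4, 5, 3
rng = np.random.default_rng(0)

# Ensemble of pre-trained character-conditioned policies: each maps an
# observation to action probabilities (here, a random linear-softmax stand-in).
weights = rng.normal(size=(N_CHARACTERS, 8, N_ACTIONS))

def policy(character, obs):
    logits = obs @ weights[character]
    p = np.exp(logits - logits.max())
    return p / p.sum()

def infer_character(trajectory):
    """Pick the character whose policy best explains the observed (obs, action) pairs."""
    log_liks = [sum(np.log(policy(c, o)[a]) for o, a in trajectory)
                for c in range(N_CHARACTERS)]
    return int(np.argmax(log_liks))

def simulate(obs, ego_action, character):
    """Roll out a short future scenario in which the target follows its inferred policy."""
    ret = 0.0
    for _ in range(HORIZON):
        target_action = int(np.argmax(policy(character, obs)))
        # Stand-in dynamics and reward; a real agent would use a learned model.
        obs = np.tanh(obs + 0.1 * (ego_action - target_action))
        ret += -abs(ego_action - target_action)
    return ret

# EFT action selection: infer the target's character, then pick the ego
# action whose simulated future scenario yields the highest return.
trajectory = [(rng.normal(size=8), rng.integers(N_ACTIONS)) for _ in range(10)]
character = infer_character(trajectory)
obs = rng.normal(size=8)
best_action = max(range(N_ACTIONS), key=lambda a: simulate(obs.copy(), a, character))
```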
Abstract: Offline reinforcement learning has emerged as a promising technology, made practical by the use of large pre-collected datasets. Despite these practical benefits, most algorithm development in offline reinforcement learning still relies on game tasks with synthetic datasets. To address this limitation, this paper provides autonomous driving datasets and benchmarks for offline reinforcement learning research. We provide 19 datasets, including real-world human-driver data, and seven popular offline reinforcement learning algorithms across three realistic driving scenarios. We also provide a unified decision-making process model that operates effectively across the different scenarios, serving as a reference framework for algorithm design. Our work lays the groundwork for further community collaboration on the practical aspects of existing reinforcement learning methods. Datasets and code are available at https://sites.google.com/view/ad4rl.
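As one reference point for the kind of pipeline such a benchmark targets, below is a minimal behavior-cloning baseline over a D4RL-style transition dictionary. The dataset layout ("observations", "actions") and the dimensions are assumptions for illustration, not the AD4RL loading API.

```python
# Hypothetical behavior-cloning baseline on a fixed (offline) driving dataset.
import numpy as np
import torch
import torch.nn as nn

rng = np.random.default_rng(0)
dataset = {  # stand-in for a pre-collected driving dataset
    "observations": rng.normal(size=(1000, 32)).astype(np.float32),
    "actions": rng.uniform(-1, 1, size=(1000, 2)).astype(np.float32),
}

policy = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2), nn.Tanh())
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)

obs = torch.from_numpy(dataset["observations"])
act = torch.from_numpy(dataset["actions"])
for epoch in range(10):
    idx = torch.randint(0, len(obs), (256,))            # mini-batch from the fixed dataset
    loss = ((policy(obs[idx]) - act[idx]) ** 2).mean()  # imitate the logged actions
    opt.zero_grad()
    loss.backward()
    opt.step()
```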
Abstract: Speaker recognition is an active research area with notable applications in biometric security and authentication systems. Many well-performing models exist in the speaker recognition domain. However, most advanced models rely on deep learning, which requires GPU support for real-time speech recognition and is therefore unsuitable for low-end devices. In this paper, we propose a lightweight text-independent speaker recognition model based on a random forest classifier, and we introduce new features applicable to both speaker verification and speaker identification tasks. The proposed model uses timbral properties of human speech as features, which are classified using a random forest. Timbre refers to the basic properties of sound that allow listeners to discriminate between sounds. The prototype uses the seven most actively studied timbral properties, boominess, brightness, depth, hardness, roughness, sharpness, and warmth, as features of our speaker recognition model. Experiments on speaker verification and speaker identification tasks show the strengths and limitations of the proposed model. In speaker identification, the model achieves a maximum accuracy of 78%; in speaker verification, it maintains an accuracy of 80% with an equal error rate (EER) of 0.24.
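Below is a minimal sketch of the classification stage described above, using scikit-learn's RandomForestClassifier over the seven timbral features. The feature values here are synthetic stand-ins; in the paper they are extracted from speech, and that extraction step is not shown.

```python
# Sketch: random forest speaker identification over seven timbral features,
# with verification framed as thresholding the forest's class probability.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

FEATURES = ["boominess", "brightness", "depth", "hardness",
            "roughness", "sharpness", "warmth"]
rng = np.random.default_rng(0)

n_speakers, clips_per_speaker = 10, 40
# Synthetic stand-in: each speaker gets a characteristic timbral profile plus noise.
profiles = rng.normal(size=(n_speakers, len(FEATURES)))
X = np.vstack([p + 0.3 * rng.normal(size=(clips_per_speaker, len(FEATURES)))
               for p in profiles])
y = np.repeat(np.arange(n_speakers), clips_per_speaker)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("identification accuracy:", clf.score(X_te, y_te))

# Verification: accept a claimed identity if its class probability clears a
# threshold; sweeping the threshold trades off false accepts/rejects (EER).
claimed = 3
p_claimed = clf.predict_proba(X_te)[:, claimed]
accept = p_claimed > 0.5
```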