Ju-Seung Byun

ARES: Alternating Reinforcement Learning and Supervised Fine-Tuning for Enhanced Multi-Modal Chain-of-Thought Reasoning Through Diverse AI Feedback

Jun 25, 2024

Symmetric Reinforcement Learning Loss for Robust Learning on Diverse Tasks and Model Scales

May 29, 2024

Reinforcement Learning for Fine-tuning Text-to-speech Diffusion Models

May 23, 2024

Normality-Guided Distributional Reinforcement Learning for Continuous Control

Aug 28, 2022

Training Transition Policies via Distribution Matching for Complex Tasks

Oct 08, 2021

Proximal Policy Gradient: PPO with Policy Gradient

Oct 20, 2020