Yuda Song

Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models

Dec 03, 2024
Via arXiv

Hybrid Reinforcement Learning from Offline Observation Alone

Jun 11, 2024
Via arXiv

Understanding Preference Fine-Tuning Through the Lens of Coverage

Jun 03, 2024
Via arXiv

Rich-Observation Reinforcement Learning with Continuous Latent Dynamics

May 29, 2024
Via arXiv

SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions

Mar 25, 2024
Via arXiv

Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees

Nov 14, 2023
Via arXiv

The Virtues of Laziness in Model-based RL: A Unified Objective and Algorithms

Mar 01, 2023
Via arXiv

ClassPruning: Speed Up Image Restoration Networks by Dynamic N:M Pruning

Nov 10, 2022
Via arXiv

Representation Learning for General-sum Low-rank Markov Games

Oct 30, 2022
Via arXiv

Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient

Oct 13, 2022
Via arXiv