Huayu Chen

A Survey of Reinforcement Learning for Large Reasoning Models

Sep 10, 2025
Via arXiv

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

May 28, 2025
Via arXiv

Bridging Supervised Learning and Reinforcement Learning in Math Reasoning

May 23, 2025
Via arXiv

Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning

Mar 18, 2025
Via arXiv

Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator

Mar 03, 2025
Via arXiv

Exploratory Diffusion Policy for Unsupervised Reinforcement Learning

Feb 11, 2025
Via arXiv

Process Reinforcement through Implicit Rewards

Feb 03, 2025
Via arXiv

Visual Generation Without Guidance

Jan 26, 2025
Via arXiv

Free Process Rewards without Process Labels

Dec 02, 2024
Via arXiv

Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment

Oct 12, 2024
Via arXiv