Yige Yuan

From Outcomes to Processes: Guiding PRM Learning from ORM for Inference-Time Alignment

Jun 14, 2025

Incentivizing Strong Reasoning from Weak Supervision

May 28, 2025

Incentivizing Reasoning from Weak Supervision

May 26, 2025

Inference-time Alignment in Continuous Space

May 26, 2025

InfoNCE is a Free Lunch for Semantically guided Graph Contrastive Learning

May 07, 2025

On a Connection Between Imitation Learning and RLHF

Mar 07, 2025

MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing

Feb 28, 2025

SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters

Feb 04, 2025

Cal-DPO: Calibrated Direct Preference Optimization for Language Model Alignment

Dec 19, 2024

Fact-Level Confidence Calibration and Self-Correction

Nov 20, 2024