Peter Chen

Reward-free Alignment for Conflicting Objectives

Feb 02, 2026
Via arXiv

GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators

Dec 23, 2025
Via arXiv

Exploration vs Exploitation: Rethinking RLVR through Clipping, Entropy, and Spurious Reward

Dec 21, 2025
Via arXiv

Spectral Policy Optimization: Coloring your Incorrect Reasoning in GRPO

May 16, 2025
Via arXiv

ComPO: Preference Alignment via Comparison Oracles

May 08, 2025
Via arXiv

BIG-Bench Extra Hard

Feb 26, 2025
Via arXiv

Geometric Framework for 3D Cell Segmentation Correction

Feb 03, 2025
Via arXiv

Displacement-Sparse Neural Optimal Transport

Feb 03, 2025
Via arXiv

Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities

May 29, 2024
Via arXiv

LLMs cannot find reasoning errors, but can correct them!

Nov 14, 2023
Via arXiv