Zhijian Zhou

The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement Learning with Verifiable Reward

Sep 09, 2025
Via arXiv

SDPO: Importance-Sampled Direct Preference Optimization for Stable Diffusion Training

May 28, 2025
Via arXiv

ChemHAS: Hierarchical Agent Stacking for Enhancing Chemistry Tools

May 27, 2025
Via arXiv

Revisit Non-parametric Two-sample Testing as a Semi-supervised Learning Problem

Nov 30, 2024
Via arXiv