Zhuokai Zhao

Boosting LLM Reasoning via Spontaneous Self-Correction

Jun 07, 2025, via arXiv

DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data

May 21, 2025, via arXiv

S'MoRE: Structural Mixture of Residual Experts for LLM Fine-tuning

Apr 08, 2025, via arXiv

Transfer between Modalities with MetaQueries

Apr 08, 2025, via arXiv

CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning

Mar 25, 2025, via arXiv

HumanMM: Global Human Motion Recovery from Multi-shot Videos

Mar 10, 2025, via arXiv

Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment

Jan 16, 2025, via arXiv

From Uncertainty to Trust: Enhancing Reliability in Vision-Language Models with Uncertainty-Guided Dropout Decoding

Dec 09, 2024, via arXiv

Beyond Training: Dynamic Token Merging for Zero-Shot Video Understanding

Nov 21, 2024, via arXiv

Preference Optimization with Multi-Sample Comparisons

Oct 16, 2024, via arXiv