Lizhu Zhang

EBPO: Empirical Bayes Shrinkage for Stabilizing Group-Relative Policy Optimization

Feb 05, 2026
Via arXiv

Token-Level LLM Collaboration via FusionRoute

Jan 08, 2026
Via arXiv

Efficient Sequential Recommendation for Long Term User Interest Via Personalization

Jan 07, 2026
Via arXiv

Thought Communication in Multiagent Collaboration

Oct 23, 2025
Via arXiv

Exploring System 1 and 2 communication for latent reasoning in LLMs

Oct 01, 2025
Via arXiv

GEM: Empowering LLM for both Embedding Generation and Language Understanding

Jun 04, 2025
Via arXiv

S'MoRE: Structural Mixture of Residual Experts for LLM Fine-tuning

Apr 08, 2025
Via arXiv

CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning

Mar 25, 2025
Via arXiv

Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment

Jan 16, 2025
Via arXiv