Hepeng Wang

Consolidation or Adaptation? PRISM: Disentangling SFT and RL Data via Gradient Concentration

Jan 12, 2026
Via arXiv

MAESTRO: Meta-learning Adaptive Estimation of Scalarization Trade-offs for Reward Optimization

Jan 12, 2026
Via arXiv

Self-Evolving GPT: A Lifelong Autonomous Experiential Learner

Jul 12, 2024
Via arXiv