Bibo Cai

Consolidation or Adaptation? PRISM: Disentangling SFT and RL Data via Gradient Concentration

Jan 12, 2026
Via arXiv

MAESTRO: Meta-learning Adaptive Estimation of Scalarization Trade-offs for Reward Optimization

Jan 12, 2026
Via arXiv

Precision over Diversity: High-Precision Reward Generalizes to Robust Instruction Following

Jan 08, 2026
Via arXiv

ExpeTrans: LLMs Are Experiential Transfer Learners

May 29, 2025
Via arXiv

Self-Route: Automatic Mode Switching via Capability Estimation for Efficient Reasoning

May 27, 2025
Via arXiv

Towards Generalizable and Faithful Logic Reasoning over Natural Language via Resolution Refutation

Apr 03, 2024
Via arXiv