David Zhu

Rethinking Data Synthesis: A Teacher Model Training Recipe with Interpretation

Oct 27, 2024
Via arXiv

Optimal Reward Labeling: Bridging Offline Preference and Reward-Based Reinforcement Learning

Jun 14, 2024
Via arXiv