Picture for Sambit Sahu

Sambit Sahu

Continual Pre-training of MoEs: How robust is your router?

Add code
Mar 06, 2025
Viaarxiv icon

RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization

Add code
Oct 05, 2024
Figure 1 for RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization
Figure 2 for RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization
Figure 3 for RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization
Figure 4 for RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization
Viaarxiv icon

Preference Tuning with Human Feedback on Language, Speech, and Vision Tasks: A Survey

Add code
Sep 17, 2024
Viaarxiv icon