Arash Ahmadian

Aya Expanse: Combining Research Breakthroughs for a New Multilingual Frontier

Dec 05, 2024
Via arXiv

If You Can't Use Them, Recycle Them: Optimizing Merging at Scale Mitigates Performance Tradeoffs

Dec 05, 2024
Via arXiv

Mix Data or Merge Models? Optimizing for Diverse Multi-Task Learning

Oct 14, 2024
Via arXiv

RLHF Can Speak Many Languages: Unlocking Multilingual Preference Optimization for LLMs

Jul 02, 2024
Via arXiv

Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion

Jun 27, 2024
Via arXiv

Averaging log-likelihoods in direct alignment

Jun 27, 2024
Via arXiv

The Multilingual Alignment Prism: Aligning Global and Local Preferences to Reduce Harm

Jun 26, 2024
Via arXiv

Self-Improving Robust Preference Optimization

Jun 03, 2024
Via arXiv

Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs

Feb 26, 2024
Via arXiv

Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning

Sep 11, 2023
Via arXiv