Acyr Locatelli

Aya Expanse: Combining Research Breakthroughs for a New Multilingual Frontier

Dec 05, 2024

Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models

Nov 19, 2024

Understanding Likelihood Over-optimisation in Direct Alignment Algorithms

Oct 15, 2024

Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts

Aug 28, 2024

To Code, or Not To Code? Exploring Impact of Code in Pre-training

Aug 20, 2024

BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts

Aug 15, 2024

Aya 23: Open Weight Releases to Further Multilingual Progress

May 23, 2024

SnapKV: LLM Knows What You are Looking for Before Generation

Apr 22, 2024

Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning

Sep 11, 2023

Exploring Low Rank Training of Deep Neural Networks

Sep 27, 2022