Florian Strub

TSP, IP Paris, SAMOVAR

Aya Expanse: Combining Research Breakthroughs for a New Multilingual Frontier

Dec 05, 2024
Via arXiv

Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion

Jun 27, 2024
Via arXiv

Averaging log-likelihoods in direct alignment

Jun 27, 2024
Via arXiv

Countering Reward Over-optimization in LLM with Demonstration-Guided Reinforcement Learning

Apr 30, 2024
Via arXiv

Language Evolution with Deep Learning

Mar 18, 2024
Via arXiv

Language Model Alignment with Elastic Reset

Dec 06, 2023
Via arXiv

The Edge of Orthogonality: A Simple View of What Makes BYOL Tick

Feb 09, 2023
Via arXiv

SemPPL: Predicting pseudo-labels for better contrastive representations

Jan 12, 2023
Via arXiv

Over-communicate no more: Situated RL agents learn concise communication protocols

Nov 02, 2022
Via arXiv

Emergent Communication: Generalization and Overfitting in Lewis Games

Sep 30, 2022
Via arXiv