Jean-Baptiste Gaya

Sid

The Llama 3 Herd of Models

Jul 31, 2024
Via arXiv

WorldSense: A Synthetic Benchmark for Grounded Reasoning in Large Language Models

Nov 27, 2023
Via arXiv

Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards

Jun 07, 2023
Via arXiv

Building a Subspace of Policies for Scalable Continual Learning

Nov 18, 2022
Via arXiv

SaLinA: Sequential Learning of Agents

Oct 15, 2021
Via arXiv

Learning a subspace of policies for online adaptation in Reinforcement Learning

Oct 11, 2021
Via arXiv