Etienne Boursier

INRIA Saclay

Optimal Design for Reward Modeling in RLHF

Oct 23, 2024
Via arXiv

Simplicity bias and optimization threshold in two-layer ReLU networks

Oct 03, 2024
Via arXiv

Learning to Mitigate Externalities: the Coase Theorem with Hindsight Rationality

Jul 03, 2024
Via arXiv

Incentivized Learning in Principal-Agent Bandit Games

Mar 06, 2024
Via arXiv

Early alignment in two-layer networks training is a two-edged sword

Jan 19, 2024
Via arXiv

Approximate information maximization for bandit games

Oct 30, 2023
Via arXiv

Constant or logarithmic regret in asynchronous multiplayer bandits

May 31, 2023
Via arXiv

Penalising the biases in norm regularisation enforces sparsity

Mar 02, 2023
Via arXiv

Model agnostic methods meta-learn despite misspecifications

Mar 02, 2023
Via arXiv

A survey on multi-player bandits

Nov 29, 2022
Via arXiv