Matthieu Geist

INRIA Lorraine - LORIA

Understanding Likelihood Over-optimisation in Direct Alignment Algorithms

Oct 15, 2024
Via arXiv

Solving robust MDPs as a sequence of static RL problems

Oct 08, 2024
Via arXiv

Imitating Language via Scalable Inverse Reinforcement Learning

Sep 02, 2024
Via arXiv

Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion

Jun 27, 2024
Via arXiv

Averaging log-likelihoods in direct alignment

Jun 27, 2024
Via arXiv

Time-Constrained Robust MDPs

Jun 12, 2024
Via arXiv

RRLS : Robust Reinforcement Learning Suite

Jun 12, 2024
Via arXiv

Bootstrapping Expectiles in Reinforcement Learning

Jun 06, 2024
Via arXiv

Self-Improving Robust Preference Optimization

Jun 03, 2024
Via arXiv

Leveraging Procedural Generation for Learning Autonomous Peg-in-Hole Assembly in Space

May 02, 2024
Via arXiv