Olivier Pietquin

NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics

Nov 11, 2024
Via arXiv

Biodenoising: animal vocalization denoising without access to clean data

Oct 04, 2024
Via arXiv

Averaging log-likelihoods in direct alignment

Jun 27, 2024
Via arXiv

Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion

Jun 27, 2024
Via arXiv

Countering Reward Over-optimization in LLM with Demonstration-Guided Reinforcement Learning

Apr 30, 2024
Via arXiv

Language Evolution with Deep Learning

Mar 18, 2024
Via arXiv

Population-aware Online Mirror Descent for Mean-Field Games by Deep Reinforcement Learning

Mar 06, 2024
Via arXiv

Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs

Feb 26, 2024
Via arXiv

MusicRL: Aligning Music Generation to Human Preferences

Feb 06, 2024
Via arXiv

Learning Discrete-Time Major-Minor Mean Field Games

Dec 17, 2023
Via arXiv