Geoffrey Cideron

Diversity-Rewarded CFG Distillation

Oct 08, 2024

Conditioned Language Policy: A General Framework for Steerable Multi-Objective Finetuning

Jul 22, 2024

BOND: Aligning LLMs with Best-of-N Distillation

Jul 19, 2024

MusicRL: Aligning Music Generation to Human Preferences

Feb 06, 2024

WARM: On the Benefits of Weight Averaged Reward Models

Jan 22, 2024

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023

Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback

May 31, 2023

Get Back Here: Robust Imitation by Return-to-Distribution Planning

May 02, 2023

vec2text with Round-Trip Translations

Sep 14, 2022

QD-RL: Efficient Mixing of Quality and Diversity in Reinforcement Learning

Jun 15, 2020