Geoffrey Cideron

Diversity-Rewarded CFG Distillation

Oct 08, 2024

Conditioned Language Policy: A General Framework for Steerable Multi-Objective Finetuning

Jul 22, 2024

BOND: Aligning LLMs with Best-of-N Distillation

Jul 19, 2024

MusicRL: Aligning Music Generation to Human Preferences

Feb 06, 2024

WARM: On the Benefits of Weight Averaged Reward Models

Jan 22, 2024

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023

Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback

May 31, 2023

Get Back Here: Robust Imitation by Return-to-Distribution Planning

May 02, 2023

vec2text with Round-Trip Translations

Sep 14, 2022

QD-RL: Efficient Mixing of Quality and Diversity in Reinforcement Learning

Jun 15, 2020