Picture for Thomas Mesnard

Thomas Mesnard

PaliGemma 2: A Family of Versatile VLMs for Transfer

Add code
Dec 04, 2024
Viaarxiv icon

Gemma 2: Improving Open Language Models at a Practical Size

Add code
Aug 02, 2024
Figure 1 for Gemma 2: Improving Open Language Models at a Practical Size
Figure 2 for Gemma 2: Improving Open Language Models at a Practical Size
Figure 3 for Gemma 2: Improving Open Language Models at a Practical Size
Figure 4 for Gemma 2: Improving Open Language Models at a Practical Size
Viaarxiv icon

RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

Add code
Apr 11, 2024
Figure 1 for RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
Figure 2 for RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
Figure 3 for RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
Figure 4 for RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
Viaarxiv icon

Gemma: Open Models Based on Gemini Research and Technology

Add code
Mar 13, 2024
Figure 1 for Gemma: Open Models Based on Gemini Research and Technology
Figure 2 for Gemma: Open Models Based on Gemini Research and Technology
Figure 3 for Gemma: Open Models Based on Gemini Research and Technology
Figure 4 for Gemma: Open Models Based on Gemini Research and Technology
Viaarxiv icon

Direct Language Model Alignment from Online AI Feedback

Add code
Feb 07, 2024
Viaarxiv icon

Nash Learning from Human Feedback

Add code
Dec 06, 2023
Viaarxiv icon

A Survey of Temporal Credit Assignment in Deep Reinforcement Learning

Add code
Dec 02, 2023
Viaarxiv icon

RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback

Add code
Sep 01, 2023
Figure 1 for RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
Figure 2 for RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
Figure 3 for RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
Figure 4 for RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
Viaarxiv icon

Curiosity in hindsight

Add code
Nov 18, 2022
Viaarxiv icon

Geometric Entropic Exploration

Add code
Jan 07, 2021
Figure 1 for Geometric Entropic Exploration
Figure 2 for Geometric Entropic Exploration
Figure 3 for Geometric Entropic Exploration
Figure 4 for Geometric Entropic Exploration
Viaarxiv icon