Nikola Momchev

Imitating Language via Scalable Inverse Reinforcement Learning

Sep 02, 2024
Via arXiv

Gemma 2: Improving Open Language Models at a Practical Size

Aug 02, 2024
Via arXiv

BOND: Aligning LLMs with Best-of-N Distillation

Jul 19, 2024
Via arXiv

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023
Via arXiv

Nash Learning from Human Feedback

Dec 06, 2023
Via arXiv

Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback

May 31, 2023
Via arXiv

RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning

Nov 04, 2021
Via arXiv

Hyperparameter Selection for Imitation Learning

May 25, 2021
Via arXiv

*-CFQ: Analyzing the Scalability of Machine Learning on a Compositional Task

Dec 15, 2020
Via arXiv

Measuring Compositional Generalization: A Comprehensive Method on Realistic Data

Dec 20, 2019
Via arXiv