Nathan Lambert

TÜLU 3: Pushing Frontiers in Open Language Model Post-Training

Nov 22, 2024

M-RewardBench: Evaluating Reward Models in Multilingual Settings

Oct 20, 2024

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Sep 25, 2024

OLMoE: Open Mixture-of-Experts Language Models

Sep 03, 2024

Self-Directed Synthetic Dialogues and Revisions Technical Report

Jul 25, 2024

WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs

Jun 26, 2024

Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback

Jun 13, 2024

D2PO: Discriminator-Guided DPO with Response Evaluation Models

May 02, 2024

Social Choice for AI Alignment: Dealing with Diverse Human Feedback

Apr 16, 2024

RewardBench: Evaluating Reward Models for Language Modeling

Mar 20, 2024