Eric Mitchell

Test-Time Alignment via Hypothesis Reweighting

Dec 11, 2024

Calibrating Language Models with Adaptive Temperature Scaling

Sep 29, 2024

Online Adaptation of Language Models with a Memory of Amortized Contexts

Mar 07, 2024

A Critical Evaluation of AI Feedback for Aligning Large Language Models

Feb 19, 2024

RLVF: Learning from Verbal Feedback without Overgeneralization

Feb 16, 2024

Fine-tuning Language Models for Factuality

Nov 14, 2023

An Emulator for Fine-Tuning Large Language Models using Small Language Models

Oct 19, 2023

Identifying and Mitigating the Security Risks of Generative AI

Aug 28, 2023

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

May 29, 2023

Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback

May 24, 2023