Erik Jones

Best-of-N Jailbreaking

Dec 04, 2024
Via arXiv

Adversaries Can Misuse Combinations of Safe Models

Jun 20, 2024
Via arXiv

Feedback Loops With Language Models Drive In-Context Reward Hacking

Feb 09, 2024
Via arXiv

Orca 2: Teaching Small Language Models How to Reason

Nov 21, 2023
Via arXiv

Teaching Language Models to Hallucinate Less with Synthetic Tasks

Oct 10, 2023
Via arXiv

Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models

Sep 26, 2023
Via arXiv

Mass-Producing Failures of Multimodal Systems with Language Models

Jun 21, 2023
Via arXiv

Automatically Auditing Large Language Models via Discrete Optimization

Mar 08, 2023
Via arXiv

Capturing Failures of Large Language Models via Human Cognitive Biases

Feb 24, 2022
Via arXiv

Selective Classification Can Magnify Disparities Across Groups

Oct 27, 2020
Via arXiv