Brendan Murphy

Targeted Manipulation and Deception Emerge when Optimizing LLMs for User Feedback

Nov 04, 2024
Via arXiv

Scaling Laws for Data Poisoning in LLMs

Aug 06, 2024
Via arXiv