
Matthew Jagielski

On Evaluating the Durability of Safeguards for Open-Weight LLMs

Dec 10, 2024

Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice

Dec 09, 2024

The Last Iterate Advantage: Empirical Auditing and Principled Heuristic Analysis of Differentially Private SGD

Oct 10, 2024

UnUnlearning: Unlearning is not sufficient for content regulation in advanced generative AI

Jun 27, 2024

Beyond the Mean: Differentially Private Prototypes for Private Transfer Learning

Jun 12, 2024

Phantom: General Trigger Attacks on Retrieval Augmented Language Generation

May 30, 2024

Noise Masking Attacks and Defenses for Pretrained Speech Models

Apr 02, 2024

Scalable Extraction of Training Data from (Production) Language Models

Nov 28, 2023

Privacy Side Channels in Machine Learning Systems

Sep 11, 2023

Are aligned neural networks adversarially aligned?

Jun 26, 2023