Niloofar Mireshghallah

Information-Guided Identification of Training Data Imprint in (Proprietary) Large Language Models

Mar 15, 2025

Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training

Feb 21, 2025

Synthetic Data Can Mislead Evaluations: Membership Inference as Machine Text Detection

Jan 20, 2025

Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice

Dec 09, 2024

AI as Humanity's Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text

Oct 05, 2024

HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions

Sep 26, 2024

Trust No Bot: Discovering Personal Disclosures in Human-LLM Conversations in the Wild

Jul 16, 2024

CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation

Jul 09, 2024

WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models

Jun 26, 2024

Breaking News: Case Studies of Generative AI's Use in Journalism

Jun 19, 2024