Milad Nasr

Evaluating the Robustness of a Production Malware Detection System to Transferable Adversarial Attacks
Oct 02, 2025

Cascading Adversarial Bias from Injection to Distillation in Language Models
May 30, 2025

Strong Membership Inference Attacks on Massive Datasets and (Moderately) Large Language Models
May 24, 2025

Lessons from Defending Gemini Against Indirect Prompt Injections
May 20, 2025

LLMs unlock new paths to monetizing exploits
May 16, 2025

Privacy Auditing of Large Language Models
Mar 09, 2025

AutoAdvExBench: Benchmarking autonomous exploitation of adversarial example defenses
Mar 03, 2025

Exploring and Mitigating Adversarial Manipulation of Voting-Based Leaderboards
Jan 13, 2025

On Evaluating the Durability of Safeguards for Open-Weight LLMs
Dec 10, 2024

SoK: Watermarking for AI-Generated Content
Nov 27, 2024