Nicholas Carlini

Measuring Non-Adversarial Reproduction of Training Data in Large Language Models
Nov 15, 2024

Stealing User Prompts from Mixture of Experts
Oct 30, 2024

Remote Timing Attacks on Efficient Language Model Inference
Oct 22, 2024

Persistent Pre-Training Poisoning of LLMs
Oct 17, 2024

Cutting through buggy adversarial example defenses: fixing 1 line of code breaks Sabre
May 06, 2024

Forcing Diffuse Distributions out of Language Models
Apr 16, 2024

Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models
Apr 01, 2024

Diffusion Denoising as a Certified Defense against Clean-label Poisoning
Mar 18, 2024

Query-Based Adversarial Prompt Generation
Feb 19, 2024

Initialization Matters for Adversarial Transfer Learning
Dec 10, 2023