Zachary Coalson

Fail-Closed Alignment for Large Language Models

Feb 19, 2026

Discovering Universal Activation Directions for PII Leakage in Language Models

Feb 19, 2026

PrisonBreak: Jailbreaking Large Language Models with Fewer Than Twenty-Five Targeted Bit-flips

Dec 10, 2024

Hard Work Does Not Always Pay Off: Poisoning Attacks on Neural Architecture Search

May 09, 2024

BERT Lost Patience Won't Be Robust to Adversarial Slowdown

Oct 31, 2023