Kellin Pelrine

Large language models can effectively convince people to believe conspiracies

Jan 08, 2026
Via arXiv

Emergent Persuasion: Will LLMs Persuade Without Being Prompted?

Dec 20, 2025
Via arXiv

Jailbreak-Tuning: Models Efficiently Learn Jailbreak Susceptibility

Jul 15, 2025
Via arXiv

Accidental Misalignment: Fine-Tuning Language Models Induces Unexpected Vulnerability

May 22, 2025
Via arXiv

The Structural Safety Generalization Problem

Apr 13, 2025
Via arXiv

From Intuition to Understanding: Using AI Peers to Overcome Physics Misconceptions

Apr 01, 2025
Via arXiv

Epistemic Integrity in Large Language Models

Nov 10, 2024
Via arXiv

A Guide to Misinformation Detection Datasets

Nov 07, 2024
Via arXiv

A Simulation System Towards Solving Societal-Scale Manipulation

Oct 17, 2024
Via arXiv

Emerging Vulnerabilities in Frontier Models: Multi-Turn Jailbreak Attacks

Aug 29, 2024
Via arXiv