Xudong Pan

WebTrap Park: An Automated Platform for Systematic Security Evaluation of Web Agents

Jan 13, 2026

When Bots Take the Bait: Exposing and Mitigating the Emerging Social Engineering Attack in Web Automation Agent

Jan 12, 2026

Evaluation Faking: Unveiling Observer Effects in Safety Evaluation of Frontier AI Systems

May 23, 2025

ReasoningShield: Content Safety Detection over Reasoning Traces of Large Reasoning Models

May 22, 2025

Think Twice Before You Act: Enhancing Agent Behavioral Safety with Thought Correction

May 19, 2025

OpenDeception: Benchmarking and Investigating AI Deceptive Behaviors via Open-ended Interaction Simulation

Apr 18, 2025

StruPhantom: Evolutionary Injection Attacks on Black-Box Tabular Agents Powered by Large Language Models

Apr 14, 2025

Frontier AI systems have surpassed the self-replicating red line

Dec 09, 2024

No-Skim: Towards Efficiency Robustness Evaluation on Skimming-based Language Models

Dec 18, 2023

BELT: Old-School Backdoor Attacks can Evade the State-of-the-Art Defense with Backdoor Exclusivity Lifting

Dec 08, 2023