Xinyue Shen

The Art of (Mis)alignment: How Fine-Tuning Methods Effectively Misalign and Realign LLMs in Post-Training

Apr 09, 2026

OrgAgent: Organize Your Multi-Agent System like a Company

Apr 01, 2026

Real Money, Fake Models: Deceptive Model Claims in Shadow APIs

Mar 05, 2026

Benchmark of Benchmarks: Unpacking Influence and Code Repository Quality in LLM Safety Benchmarks

Mar 03, 2026

Spatiotemporal Calibration for Laser Vision Sensor in Hand-eye System Based on Straight-line Constraint

Sep 16, 2025

HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns

Jan 28, 2025

Are We in the AI-Generated Text World Already? Quantifying and Monitoring AIGT on Social Media

Dec 24, 2024

Voice Jailbreak Attacks Against GPT-4o

May 29, 2024

UnsafeBench: Benchmarking Image Safety Classifiers on Real-World and AI-Generated Images

May 06, 2024

Comprehensive Assessment of Jailbreak Attacks Against LLMs

Feb 08, 2024