Picture for Shiyao Cui

Shiyao Cui

SynCred-Bench: Benchmarking Synthetic Credibility in AI-Generated Visual Misinformation

Add code
Jun 02, 2026
Viaarxiv icon

RUBAS: Rubric-Based Reinforcement Learning for Agent Safety

Add code
Jun 02, 2026
Viaarxiv icon

LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety

Add code
Apr 13, 2026
Viaarxiv icon

SMSP: A Plug-and-Play Strategy of Multi-Scale Perception for MLLMs to Perceive Visual Illusions

Add code
Mar 24, 2026
Viaarxiv icon

Survive at All Costs: Exploring LLM's Risky Behaviors under Survival Pressure

Add code
Mar 05, 2026
Viaarxiv icon

The Missing Half: Unveiling Training-time Implicit Safety Risks Beyond Deployment

Add code
Feb 04, 2026
Viaarxiv icon

The Side Effects of Being Smart: Safety Risks in MLLMs' Multi-Image Reasoning

Add code
Jan 20, 2026
Viaarxiv icon

JPS: Jailbreak Multimodal Large Language Models with Collaborative Visual Perturbation and Textual Steering

Add code
Aug 07, 2025
Viaarxiv icon

Exploring Multimodal Challenges in Toxic Chinese Detection: Taxonomy, Benchmark, and Findings

Add code
May 30, 2025
Viaarxiv icon

Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen!

Add code
May 21, 2025
Viaarxiv icon