Xuandong Zhao

Assessing Judging Bias in Large Reasoning Models: An Empirical Study
Apr 14, 2025

Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs
Apr 07, 2025

MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models
Mar 19, 2025

Improving LLM Safety Alignment with Dual-Objective Optimization
Mar 05, 2025

Reward Shaping to Mitigate Reward Hacking in RLHF
Feb 26, 2025

DIS-CO: Discovering Copyrighted Content in VLMs Training Data
Feb 25, 2025

Scalable Best-of-N Selection for Large Language Models via Self-Certainty
Feb 25, 2025

The Hidden Risks of Large Reasoning Models: A Safety Assessment of R1
Feb 18, 2025

PrivAgent: Agentic-based Red-teaming for LLM Privacy Leakage
Dec 07, 2024

A Practical Examination of AI-Generated Text Detectors for Large Language Models
Dec 06, 2024