Xiangzheng Zhang

Stress Testing Generalization: How Minor Modifications Undermine Large Language Model Performance

Feb 18, 2025
Via arXiv

Expand VSR Benchmark for VLLM to Expertize in Spatial Rules

Dec 24, 2024
Via arXiv