Xiaoqi Jian

Stress Testing Generalization: How Minor Modifications Undermine Large Language Model Performance

Feb 18, 2025
Via arXiv