Youzhi Wang

VarBench: Robust Language Model Benchmarking Through Dynamic Variable Perturbation

Jun 26, 2024
Via arXiv