Claudia Tang

VarBench: Robust Language Model Benchmarking Through Dynamic Variable Perturbation

Jun 26, 2024
Via arXiv