Dongbai Li

The Emperor's New Clothes in Benchmarking? A Rigorous Examination of Mitigation Strategies for LLM Benchmark Data Contamination

Mar 20, 2025
Via arXiv

Sample Weight Averaging for Stable Prediction

Feb 11, 2025
Via arXiv