Meriem Boubdir

Elo Uncovered: Robustness and Best Practices in Language Model Evaluation

Nov 29, 2023
Via arXiv

Which Prompts Make The Difference? Data Prioritization For Efficient Human LLM Evaluation

Oct 22, 2023
Via arXiv