Title: Reliable, Reproducible, and Really Fast Leaderboards with Evalica

Dec 15, 2024

Dmitry Ustalov

Abstract: The rapid advancement of natural language processing (NLP) technologies, such as instruction-tuned large language models (LLMs), urges the development of modern evaluation protocols with human and machine feedback. We introduce Evalica, an open-source toolkit that facilitates the creation of reliable and reproducible model leaderboards. This paper presents its design, evaluates its performance, and demonstrates its usability through its Web interface, command-line interface, and Python API.

* Accepted at the COLING 2025 system demonstration track
