Title: Stronger Random Baselines for In-Context Learning

Apr 19, 2024

Gregory Yauney, David Mimno

Figures 1-4 for Stronger Random Baselines for In-Context Learning

Abstract: Evaluating the in-context learning classification performance of language models poses challenges due to small dataset sizes, extensive prompt selection using the validation set, and intentionally difficult tasks that lead to near-random performance. The standard random baseline -- the expected accuracy of guessing labels uniformly at random -- is stable when the evaluation set is used only once or when the dataset is large. We account for the common practice of validation set reuse and existing small datasets with a stronger random baseline: the expected maximum accuracy across multiple random classifiers. When choosing the best prompt demonstrations across six quantized language models applied to 16 BIG-bench Lite tasks, more than 20% of the few-shot results that exceed the standard baseline do not exceed this stronger random baseline. When held-out test sets are available, this stronger baseline is also a better predictor of held-out performance than the standard baseline, avoiding unnecessary test set evaluations. This maximum random baseline provides an easily calculated drop-in replacement for the standard baseline.
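The abstract defines the stronger baseline as the expected maximum accuracy across multiple random classifiers. The following is a minimal sketch of one way to compute such a quantity, not the authors' implementation. It assumes that each random classifier guesses independently, so that its number of correct answers on n examples follows a Binomial(n, p) distribution with p the chance of a correct uniform guess, and that the t classifiers are independent of each other. The function names `binomial_cdf` and `max_random_baseline` are hypothetical.

```python
from math import comb


def binomial_cdf(n, p):
    """Return a list F where F[k] = P(X <= k) for X ~ Binomial(n, p)."""
    cdf = []
    total = 0.0
    for k in range(n + 1):
        total += comb(n, k) * p**k * (1 - p) ** (n - k)
        cdf.append(min(total, 1.0))  # guard against floating-point overshoot
    return cdf


def max_random_baseline(n, p, t):
    """Expected maximum accuracy over t independent random classifiers.

    Assumption (not taken from the source): each classifier's number of
    correct answers is Binomial(n, p), independently of the others.
    Uses P(max <= k) = F(k)^t, so P(max = k) = F(k)^t - F(k-1)^t.
    """
    cdf = binomial_cdf(n, p)
    expected_max_correct = 0.0
    prev = 0.0
    for k in range(n + 1):
        cur = cdf[k] ** t
        expected_max_correct += k * (cur - prev)
        prev = cur
    return expected_max_correct / n


if __name__ == "__main__":
    # Hypothetical setting: 100 binary-label examples, 10 prompts tried.
    print(max_random_baseline(n=100, p=0.5, t=10))
```

With t = 1 this reduces to the standard random baseline p, and the value grows as t increases or n shrinks, which matches the abstract's point about validation set reuse and small datasets. The direct binomial sum is intended for small n; very large datasets would need a numerically safer CDF.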
