Boxiang Yang

Predictable Emergent Abilities of LLMs: Proxy Tasks Are All You Need

Dec 10, 2024

Bo-Wen Zhang, Yan Yan, Boxiang Yang, Yifei Xue, Guang Liu

Abstract: While scaling laws optimize training configurations for large language models (LLMs) through experiments on smaller or early-stage models, they fail to predict emergent abilities due to the absence of such capabilities in these models. To address this, we propose a method that predicts emergent abilities by leveraging proxy tasks. We begin by establishing relevance metrics between the target task and candidate tasks based on performance differences across multiple models. These candidate tasks are then validated for robustness with small model ensembles, leading to the selection of the most appropriate proxy tasks. The predicted performance on the target task is then derived by integrating the evaluation results of these proxies. In a case study on tool utilization capabilities, our method demonstrated a strong correlation between predicted and actual performance, confirming its effectiveness.
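
The pipeline described in the abstract (score candidate tasks by relevance to the target, keep the best ones as proxies, then combine proxy results into a prediction) can be illustrated with a minimal sketch. The abstract does not specify the relevance metric or the integration rule, so everything here is an assumption: Pearson correlation stands in for the relevance metric, a relevance-weighted mean stands in for the integration step, and the function names `select_proxies` and `predict_target` are hypothetical. The robustness validation with small model ensembles is not modeled.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length score lists (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def select_proxies(target_scores, candidate_scores, k=2):
    """Rank candidate tasks by correlation with the target across reference models.

    ASSUMPTION: correlation is used as the relevance metric; the paper's
    actual metric is not given in the abstract.
    Returns {task_name: relevance} for at most k positively related tasks.
    """
    relevance = {
        name: pearson(scores, target_scores)
        for name, scores in candidate_scores.items()
    }
    ranked = sorted(relevance, key=relevance.get, reverse=True)
    return {name: relevance[name] for name in ranked[:k] if relevance[name] > 0}


def predict_target(proxy_relevance, model_scores):
    """Relevance-weighted mean of a model's proxy-task scores.

    ASSUMPTION: a weighted mean stands in for the paper's integration step.
    """
    total = sum(proxy_relevance.values())
    if total == 0:
        raise ValueError("no proxy tasks with positive relevance")
    return sum(w * model_scores[name] for name, w in proxy_relevance.items()) / total


# Illustrative (made-up) scores for four reference models on each task.
target = [0.1, 0.2, 0.4, 0.8]
candidates = {
    "task_a": [0.2, 0.3, 0.5, 0.9],
    "task_b": [0.9, 0.7, 0.5, 0.1],
    "task_c": [0.5, 0.5, 0.5, 0.5],
}
proxies = select_proxies(target, candidates, k=2)
prediction = predict_target(proxies, {"task_a": 0.6, "task_b": 0.3, "task_c": 0.5})
```

In this toy data only `task_a` moves with the target across models, so it is the only proxy retained; a task whose scores do not vary, or vary in the opposite direction, is dropped.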
