Title: Dynamics of Instruction Tuning: Each Ability of Large Language Models Has Its Own Growth Pace

Oct 30, 2023

Chiyu Song, Zhanchao Zhou, Jianhao Yan, Yuejiao Fei, Zhenzhong Lan, Yue Zhang

Abstract: Instruction tuning is a burgeoning method to elicit the general intelligence of Large Language Models (LLMs). However, the creation of instruction data is still largely heuristic, leading to significant variation in quality and distribution across existing datasets. Experimental conclusions drawn from these datasets are also inconsistent: some studies emphasize the importance of scaling the number of instructions, while others argue that a limited number of samples suffices. To better understand data construction guidelines, we shift our focus from overall model performance to the growth of each underlying ability, such as creative writing, code generation, and logical reasoning. We systematically investigate the effects of data volume, parameter size, and data construction methods on the development of various abilities, using hundreds of model checkpoints (7B to 33B) fully instruction-tuned on a new collection of over 40k human-curated instruction data. The proposed dataset is stringently quality-controlled and categorized into ten distinct LLM abilities. Our study reveals three primary findings: (i) Although data volume and parameter scale directly affect models' overall performance, some abilities are more responsive to increases in them and can be trained effectively with limited data, while others are highly resistant to these changes. (ii) Human-curated data is substantially more efficient than synthetic data from GPT-4 and continues to improve model performance as volume increases, which synthetic data does not achieve. (iii) Instruction data brings powerful cross-ability generalization, with evaluation results on out-of-domain data mirroring the first two observations. Furthermore, we demonstrate how these findings can guide more efficient data construction, leading to practical performance improvements on public benchmarks.
