Title: Self-Improvement in Language Models: The Sharpening Mechanism

Dec 02, 2024

Audrey Huang, Adam Block, Dylan J. Foster, Dhruv Rohatgi, Cyril Zhang, Max Simchowitz, Jordan T. Ash, Akshay Krishnamurthy


Abstract: Recent work in language modeling has raised the possibility of self-improvement, where a language model evaluates and refines its own generations to achieve higher performance without external feedback. It is impossible for this self-improvement to create information that is not already in the model, so why should we expect that this will lead to improved capabilities? We offer a new perspective on the capabilities of self-improvement through a lens we refer to as sharpening. Motivated by the observation that language models are often better at verifying response quality than they are at generating correct responses, we formalize self-improvement as using the model itself as a verifier during post-training in order to "sharpen" the model to one placing large mass on high-quality sequences, thereby amortizing the expensive inference-time computation of generating good sequences. We begin by introducing a new statistical framework for sharpening in which the learner aims to sharpen a pre-trained base policy via sample access, and establish fundamental limits. Then we analyze two natural families of self-improvement algorithms based on SFT and RLHF.
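To make the idea of sharpening concrete, here is a minimal toy sketch. It is an illustration under stated assumptions, not the authors' algorithm. It assumes a base policy over a handful of candidate responses, and it assumes the model's self-assigned score for a response is simply the base probability of that response (the abstract does not specify the verifier). The hypothetical helper `best_of_n_distribution` computes the exact distribution of the response picked by drawing `n` samples and keeping the highest-scoring one. Fine-tuning on such picks would be one way to amortize that inference-time cost.

```python
def best_of_n_distribution(base_probs, n):
    """Exact distribution of the best-of-n pick for a toy categorical policy.

    Assumption (not from the abstract): the self-score of a response is its
    own base probability, and the probabilities are distinct, so no ties.
    """
    # Visit responses from lowest to highest score.
    order = sorted(range(len(base_probs)), key=lambda i: base_probs[i])
    sharpened = [0.0] * len(base_probs)
    cumulative = 0.0
    for i in order:
        previous = cumulative
        cumulative += base_probs[i]
        # P(pick is i) = P(all n draws score <= i) - P(all n draws score < i)
        sharpened[i] = cumulative ** n - previous ** n
    return sharpened


if __name__ == "__main__":
    base = [0.4, 0.3, 0.2, 0.1]  # hypothetical base policy over four responses
    for n in (1, 4, 16):
        print(n, [round(p, 3) for p in best_of_n_distribution(base, n)])
```

With `n = 1` the base policy is returned unchanged. As `n` grows, mass concentrates on the response the model itself scores highest, which is the sense in which no new information is created while the output distribution still changes.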
