Title: A Gradient Analysis Framework for Rewarding Good and Penalizing Bad Examples in Language Models

Aug 29, 2024

Yi-Lin Tuan, William Yang Wang

Abstract: Beyond maximum likelihood estimation (MLE), the standard objective of a language model (LM) that optimizes the probabilities of good examples, many studies have explored methods that also penalize bad examples to enhance the quality of the output distribution, including unlikelihood training, exponential maximizing average treatment effect (ExMATE), and direct preference optimization (DPO). To systematically compare these methods and further provide a unified recipe for LM optimization, in this paper, we present a unique angle of gradient analysis of loss functions that simultaneously reward good examples and penalize bad ones in LMs. Through both mathematical results and experiments on CausalDialogue and Anthropic HH-RLHF datasets, we identify distinct functional characteristics among these methods. We find that ExMATE serves as a superior surrogate for MLE, and that combining DPO with ExMATE instead of MLE further enhances both the statistical (5-7%) and generative (+18% win rate) performance.
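To make the contrast between "rewarding good" and "penalizing bad" concrete, here is a minimal sketch of three of the objectives named in the abstract, written over sequence-level log-probabilities. It is an illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, unlikelihood training is simplified from its usual token-level form to a single sequence-level term, and ExMATE is omitted because its exact form is not given in this text.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mle_loss(logp_good: float) -> float:
    """MLE: only rewards the good example, -log p(good)."""
    return -logp_good


def unlikelihood_loss(logp_good: float, logp_bad: float) -> float:
    """Sequence-level simplification (an assumption of this sketch):
    -log p(good) - log(1 - p(bad)). Requires logp_bad < 0."""
    p_bad = math.exp(logp_bad)
    return -logp_good - math.log1p(-p_bad)


def dpo_loss(
    logp_good: float,
    logp_bad: float,
    ref_logp_good: float,
    ref_logp_bad: float,
    beta: float = 0.1,
) -> float:
    """DPO: -log sigmoid(beta * (policy-vs-reference log-ratio margin)).
    Uses -log sigmoid(m) = softplus(-m)."""
    margin = beta * ((logp_good - ref_logp_good) - (logp_bad - ref_logp_bad))
    return softplus(-margin)
```

MLE depends only on the good example, while the other two also change when the bad example's probability changes. How each loss distributes gradient between the good and bad examples is the kind of difference the paper's gradient analysis compares.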
