Title: Mitigating Frequency Bias and Anisotropy in Language Model Pre-Training with Syntactic Smoothing

Oct 15, 2024

Richard Diehl Martinez, Zebulon Goriely, Andrew Caines, Paula Buttery, Lisa Beinborn

Abstract: Language models rely strongly on frequency information because they maximize the likelihood of tokens during pre-training. As a consequence, they tend not to generalize well to tokens that are seldom seen during training. Moreover, maximum likelihood training has been found to give rise to anisotropy: representations of tokens in a model tend to cluster tightly in a high-dimensional cone, rather than spreading out over their representational capacity. Our work introduces a method for quantifying the frequency bias of a language model by assessing sentence-level perplexity with respect to token-level frequency. We then present a method for reducing the frequency bias of a language model by inducing a syntactic prior over token representations during pre-training. Our Syntactic Smoothing method adjusts the maximum likelihood objective function to distribute the learning signal to syntactically similar tokens. This approach results in better performance on infrequent English tokens and a decrease in anisotropy. We empirically show that the degree of anisotropy in a model correlates with its frequency bias.
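The abstract's idea of distributing the learning signal to syntactically similar tokens can be illustrated with a minimal sketch. This is not the authors' implementation: the similarity matrix, the `alpha` parameter, and the function names `smoothed_targets` and `cross_entropy` are all hypothetical. The sketch assumes the one-hot target is replaced by a distribution that keeps `1 - alpha` of the mass on the true token and spreads `alpha` over the other tokens in proportion to a given similarity score.

```python
import math


def smoothed_targets(target_id, similarity, alpha=0.2):
    """Return a target distribution that shares mass with similar tokens.

    similarity: square list of lists of non-negative scores (hypothetical
                stand-in for a syntactic similarity between token types).
    alpha:      fraction of probability mass moved off the true token
                (hypothetical hyperparameter).
    """
    row = similarity[target_id]
    # Only tokens other than the target receive the redistributed mass.
    others = [s if j != target_id else 0.0 for j, s in enumerate(row)]
    total = sum(others)
    if total == 0.0:
        # No similar tokens: fall back to the ordinary one-hot target.
        return [1.0 if j == target_id else 0.0 for j in range(len(row))]
    dist = [alpha * s / total for s in others]
    dist[target_id] = 1.0 - alpha
    return dist


def cross_entropy(target_dist, logits):
    """Cross-entropy between a target distribution and softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(t * (x - log_z) for t, x in zip(target_dist, logits))


# Toy vocabulary of three tokens; token 0 is similar to token 1 only.
similarity = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.2],
    [0.0, 0.2, 1.0],
]
targets = smoothed_targets(0, similarity, alpha=0.2)
loss = cross_entropy(targets, [2.0, 1.0, 0.0])
```

With a soft target of this kind, a gradient step on a frequent token also nudges the logits of the tokens it resembles, which is one way rare tokens could receive more learning signal than under a plain one-hot objective.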
