Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Emma Caldwell

Gaussian Stochastic Weight Averaging for Bayesian Low-Rank Adaptation of Large Language Models

May 06, 2024

Emre Onal, Klemens Flöge, Emma Caldwell, Arsen Sheverdin, Vincent Fortuin

Figure 1 for Gaussian Stochastic Weight Averaging for Bayesian Low-Rank Adaptation of Large Language Models

Figure 2 for Gaussian Stochastic Weight Averaging for Bayesian Low-Rank Adaptation of Large Language Models

Figure 3 for Gaussian Stochastic Weight Averaging for Bayesian Low-Rank Adaptation of Large Language Models

Abstract:Fine-tuned Large Language Models (LLMs) often suffer from overconfidence and poor calibration, particularly when fine-tuned on small datasets. To address these challenges, we propose a simple combination of Low-Rank Adaptation (LoRA) with Gaussian Stochastic Weight Averaging (SWAG), facilitating approximate Bayesian inference in LLMs. Through extensive testing across several Natural Language Processing (NLP) benchmarks, we demonstrate that our straightforward and computationally efficient approach improves model generalization and calibration. We further show that our method exhibits greater robustness against distribution shift, as reflected in its performance on out-of-distribution tasks.

* 14 pages, 1 figure, 2 tables

Via

Access Paper or Ask Questions