Title: Adaptive Stochastic Weight Averaging

Jun 27, 2024

Caglar Demir, Arnab Sharma, Axel-Cyrille Ngonga Ngomo

Abstract: Ensemble models often improve generalization performance in challenging tasks. Yet traditional techniques based on prediction averaging incur three well-known disadvantages: the computational overhead of training multiple models, increased latency, and increased memory requirements at test time. To address these issues, the Stochastic Weight Averaging (SWA) technique maintains a running average of model parameters from a specific epoch onward. Despite its potential benefits, maintaining a running average of parameters can hinder generalization as the underlying running model begins to overfit. Conversely, an inadequately chosen starting point can render SWA more susceptible to underfitting than the underlying running model. In this work, we propose the Adaptive Stochastic Weight Averaging (ASWA) technique, which updates a running average of model parameters only when generalization performance on the validation dataset improves. Hence, ASWA can be seen as a combination of SWA with the early stopping technique, where the former accepts all updates on a parameter ensemble model and the latter rejects any update on an underlying running model. We conducted extensive experiments ranging from image classification to multi-hop reasoning over knowledge graphs. Our experiments over 11 benchmark datasets with 7 baseline models suggest that ASWA leads to statistically better generalization across models and datasets.
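
The difference between SWA and the validation-gated update described in the abstract can be illustrated with a minimal Python sketch. This is a simplified reading of the abstract only, not the authors' algorithm: the paper's exact acceptance rule may differ, parameters are represented as plain lists of floats, and the names `running_average`, `swa`, `aswa_sketch`, and `val_score` are hypothetical, introduced here for illustration.

```python
from typing import Callable, List, Optional, Sequence

Params = List[float]


def running_average(avg: Params, new: Sequence[float], n: int) -> Params:
    """Fold `new` into an average that already covers `n` parameter vectors."""
    return [(a * n + w) / (n + 1) for a, w in zip(avg, new)]


def swa(snapshots: Sequence[Sequence[float]]) -> Params:
    """Plain SWA: every per-epoch snapshot is accepted into the average."""
    avg = list(snapshots[0])
    for n, w in enumerate(snapshots[1:], start=1):
        avg = running_average(avg, w, n)
    return avg


def aswa_sketch(
    snapshots: Sequence[Sequence[float]],
    val_score: Callable[[Params], float],
) -> Optional[Params]:
    """Assumed gating rule: accept a snapshot only if the resulting average
    scores strictly better on validation than the best average so far."""
    avg: Optional[Params] = None
    n = 0
    best = float("-inf")
    for w in snapshots:
        candidate = list(w) if avg is None else running_average(avg, w, n)
        score = val_score(candidate)
        if score > best:  # validation improved: keep the update
            avg, n, best = candidate, n + 1, score
        # otherwise the update is rejected and the average stays unchanged
    return avg


# Toy setup (hypothetical): one parameter, validation score peaks at 1.0,
# and the running model drifts away from it over three "epochs".
toy_snapshots = [[0.0], [2.0], [4.0]]
toy_score = lambda p: -abs(p[0] - 1.0)

print(swa(toy_snapshots))                     # [2.0]
print(aswa_sketch(toy_snapshots, toy_score))  # [1.0]
```

In this toy run the third snapshot would pull the average away from the validation optimum, so the gated variant rejects it while plain SWA absorbs it.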
