Title: Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging

Jun 29, 2023

Max Zimmer, Christoph Spiegel, Sebastian Pokutta

Abstract: Neural networks can be significantly compressed by pruning, leading to sparse models that require considerably less storage and fewer floating-point operations while maintaining predictive performance. Model soups (Wortsman et al., 2022) improve generalization and out-of-distribution performance by averaging the parameters of multiple models into a single one without increased inference time. However, identifying models in the same loss basin to leverage both sparsity and parameter averaging is challenging, as averaging arbitrary sparse models reduces the overall sparsity due to differing sparse connectivities. In this work, we address these challenges by demonstrating that exploring a single retraining phase of Iterative Magnitude Pruning (IMP) with varying hyperparameter configurations, such as batch ordering or weight decay, produces models that are suitable for averaging and share the same sparse connectivity by design. Averaging these models significantly enhances generalization performance compared to their individual components. Building on this idea, we introduce Sparse Model Soups (SMS), a novel method for merging sparse models by initiating each prune-retrain cycle with the averaged model of the previous phase. SMS maintains sparsity, exploits the benefits of sparse networks, is modular and fully parallelizable, and substantially improves IMP's performance. Additionally, we demonstrate that SMS can be adapted to enhance the performance of state-of-the-art pruning-during-training approaches.
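The prune, retrain, and average cycle described in the abstract can be sketched roughly as follows. This is a toy NumPy illustration and not the authors' implementation: the helper names (`magnitude_mask`, `retrain`, `sms_phase`, `sparsity_of`) are hypothetical, and `retrain` is only a stand-in that adds seed-dependent noise in place of real training runs with different batch orderings or weight decay. What it is meant to show is that candidates retrained from one pruned model share a mask, so their average stays sparse, whereas averaging independently pruned models does not.

```python
import numpy as np

def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that zeroes the smallest-magnitude fraction of weights."""
    k = int(round(sparsity * weights.size))  # number of weights to prune
    mask = np.ones(weights.size)
    order = np.argsort(np.abs(weights), axis=None)  # ascending by magnitude
    mask[order[:k]] = 0.0
    return mask.reshape(weights.shape)

def retrain(weights, mask, seed):
    """Hypothetical stand-in for one retraining run (assumption: noise, not SGD).

    The mask is re-applied so pruned weights stay at zero.
    """
    rng = np.random.default_rng(seed)
    return (weights + 0.1 * rng.standard_normal(weights.shape)) * mask

def sms_phase(weights, sparsity, seeds):
    """One phase: prune once, retrain several candidates, average them."""
    mask = magnitude_mask(weights, sparsity)
    candidates = [retrain(weights * mask, mask, s) for s in seeds]
    return np.mean(candidates, axis=0), mask

def sparsity_of(weights):
    """Fraction of entries that are exactly zero."""
    return float(np.mean(weights == 0))

rng = np.random.default_rng(0)
soup = rng.standard_normal((8, 8))

# Each phase starts from the averaged model of the previous phase.
for target in (0.5, 0.75):
    soup, mask = sms_phase(soup, target, seeds=[1, 2, 3])

# Contrast: two independently pruned models have different masks,
# so their average is supported on the union of both masks.
a = rng.standard_normal((8, 8))
a = a * magnitude_mask(a, 0.5)
b = rng.standard_normal((8, 8))
b = b * magnitude_mask(b, 0.5)
naive = (a + b) / 2

print("SMS soup sparsity:  ", sparsity_of(soup))
print("naive soup sparsity:", sparsity_of(naive))
```

Because every candidate within a phase is multiplied by the same mask, the retraining runs are independent of one another and could in principle be executed in parallel, which is the property the abstract refers to as fully parallelizable.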

* 9 pages, 5 pages references, 7 pages appendix
