Xingyu Qu

Rethink Model Re-Basin and the Linear Mode Connectivity

Feb 05, 2024

Xingyu Qu, Samuel Horvath

Abstract: Recent studies suggest that with sufficiently wide models, most SGD solutions can, up to permutation, converge into the same basin. This phenomenon, known as the model re-basin regime, has significant implications for model averaging. However, current re-basin strategies are limited in effectiveness due to a lack of comprehensive understanding of underlying mechanisms. Addressing this gap, our work revisits standard practices and uncovers the frequent inadequacies of existing matching algorithms, which we show can be mitigated through proper re-normalization. By introducing a more direct analytical approach, we expose the interaction between matching algorithms and re-normalization processes. This perspective not only clarifies and refines previous findings but also facilitates novel insights. For instance, it connects the linear mode connectivity to pruning, motivating a lightweight yet effective post-pruning plug-in that can be directly merged with any existing pruning techniques. Our implementation is available at https://github.com/XingyuQu/rethink-re-basin.

* 40 pages
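
The phrase "up to permutation" in the abstract above can be illustrated with a toy sketch. This is not the paper's matching algorithm or re-normalization: the two-layer ReLU network, the helper names (`mlp`, `permute_hidden`, `midpoint`), and the fixed permutation are all assumptions made for illustration. It shows that reordering hidden units leaves the network's output unchanged, and that once the permutation is undone the midpoint of the two weight sets coincides with the original model, whereas the midpoint of the unaligned weights is generally a different function.

```python
import random

def relu(z):
    return z if z > 0.0 else 0.0

def mlp(x, W1, b1, w2):
    """Toy two-layer ReLU network with scalar output: w2 . relu(W1 x + b1)."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(w2, hidden))

def permute_hidden(W1, b1, w2, perm):
    """Reorder hidden units: rows of W1 and entries of b1, w2 move together."""
    return ([W1[i] for i in perm], [b1[i] for i in perm], [w2[i] for i in perm])

def midpoint(p, q):
    """Element-wise average of two parameter tuples (W1, b1, w2)."""
    W1 = [[(a + b) / 2 for a, b in zip(ra, rb)] for ra, rb in zip(p[0], q[0])]
    b1 = [(a + b) / 2 for a, b in zip(p[1], q[1])]
    w2 = [(a + b) / 2 for a, b in zip(p[2], q[2])]
    return W1, b1, w2

rng = random.Random(0)
d_in, d_hidden = 3, 5
model_a = (
    [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_hidden)],
    [rng.gauss(0, 1) for _ in range(d_hidden)],
    [rng.gauss(0, 1) for _ in range(d_hidden)],
)

# Hypothetical second "solution": the same network with shuffled hidden units.
perm = [4, 2, 0, 3, 1]
model_b = permute_hidden(*model_a, perm)

x = [0.5, -1.0, 2.0]
out_a = mlp(x, *model_a)
out_b = mlp(x, *model_b)  # same function, different point in weight space

# Averaging the unaligned weights mixes unrelated hidden units.
out_mid_naive = mlp(x, *midpoint(model_a, model_b))

# Here the permutation is known, so alignment is just its inverse.
inverse = [perm.index(i) for i in range(d_hidden)]
model_b_aligned = permute_hidden(*model_b, inverse)
out_mid_aligned = mlp(x, *midpoint(model_a, model_b_aligned))
```

In practice the permutation relating two independently trained networks is unknown and has to be estimated by a matching algorithm, which is the step the abstract scrutinizes.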

GAGA: Deciphering Age-path of Generalized Self-paced Regularizer

Sep 23, 2022

Xingyu Qu, Diyang Li, Xiaohan Zhao, Bin Gu

Abstract: Self-paced learning (SPL) is an important machine learning paradigm that mimics the cognitive process of humans and animals. The SPL regime involves a self-paced regularizer and a gradually increasing age parameter. The age parameter plays a key role in SPL, but determining where to optimally terminate the process is still non-trivial. A natural idea is to compute the solution path w.r.t. the age parameter (i.e., the age-path). However, current age-path algorithms are either limited to the simplest regularizer or lack solid theoretical understanding and computational efficiency. To address this challenge, we propose a novel Generalized Age-path Algorithm (GAGA) for SPL with various self-paced regularizers, based on ordinary differential equations (ODEs) and sets control, which can learn the entire solution spectrum w.r.t. a range of age parameters. To the best of our knowledge, GAGA is the first exact path-following algorithm tackling the age-path for general self-paced regularizers. Finally, the algorithmic steps for classic SVM and Lasso are described in detail. We demonstrate the performance of GAGA on real-world datasets and find considerable speedup of our algorithm over competing baselines.

* 33 pages. Published as a conference paper at NeurIPS 2022
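
The role of the age parameter described in the abstract above can be sketched with the commonly used hard (binary) self-paced regularizer, under which a sample receives weight 1 when its loss is below the age and 0 otherwise. This is an assumption chosen for illustration and is not taken from the abstract; the function name `hard_spl_weights`, the fixed loss values, and the age grid are hypothetical. The sketch is not GAGA: it re-solves for the weights at a few hand-picked ages, which is the grid-style sweep that an exact path-following algorithm is meant to replace.

```python
def hard_spl_weights(losses, age):
    """Sample weights under the hard self-paced regularizer.

    For fixed losses, minimizing sum_i v_i * loss_i - age * sum_i v_i
    over v_i in [0, 1] gives v_i = 1 when loss_i < age, else 0.
    """
    return [1.0 if loss < age else 0.0 for loss in losses]

# Hypothetical per-sample losses from some fixed model (illustrative values).
losses = [0.1, 0.4, 0.9, 1.5, 3.0]

# Gradually increasing age: harder (higher-loss) samples are admitted later.
ages = [0.2, 0.5, 1.0, 2.0, 4.0]
selected_counts = [int(sum(hard_spl_weights(losses, age))) for age in ages]
```

In a full SPL loop the model would be refit on the selected samples after each change of age, so the losses themselves would change; they are held fixed here only to isolate the effect of the age parameter.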
