Title: Stochastic Anderson Mixing for Nonconvex Stochastic Optimization

Oct 04, 2021

Fuchao Wei, Chenglong Bao, Yang Liu

Abstract: Anderson mixing (AM) is an acceleration method for fixed-point iterations. Despite its success and wide usage in scientific computing, the convergence theory of AM remains unclear, and its applications to machine learning problems are not well explored. In this paper, by introducing damped projection and adaptive regularization to classical AM, we propose a Stochastic Anderson Mixing (SAM) scheme to solve nonconvex stochastic optimization problems. Under mild assumptions, we establish the convergence theory of SAM, including almost sure convergence to stationary points and the worst-case iteration complexity. Moreover, the complexity bound can be improved when the output is an iterate chosen at random. To further accelerate convergence, we incorporate a variance reduction technique into the proposed SAM. We also propose a preconditioned mixing strategy for SAM, which can empirically achieve faster convergence or better generalization ability. Finally, we apply the SAM method to train various neural networks, including a vanilla CNN, ResNets, WideResNet, ResNeXt, DenseNet and an RNN. Experimental results on image classification and language modeling demonstrate the advantages of our method.
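
For readers unfamiliar with the classical method that the abstract builds on, the following is a minimal sketch of deterministic Anderson mixing applied to a fixed-point problem x = g(x). It is not the SAM scheme from the paper: it omits the damped projection, adaptive regularization, variance reduction and preconditioning described above. The function name `anderson_mixing`, its parameters (`m` for history size, `beta` for the mixing parameter) and the small linear test map are assumptions chosen for illustration only.

```python
import numpy as np


def anderson_mixing(g, x0, m=5, beta=1.0, tol=1e-10, max_iter=100):
    """Classical Anderson mixing for x = g(x) (illustrative sketch, not SAM).

    Keeps up to m differences of iterates and residuals and combines them
    through a least-squares problem on the residual history.
    """
    x = np.asarray(x0, dtype=float)
    f = g(x) - x                      # residual of the fixed-point map
    dX, dF = [], []                   # histories of iterate / residual differences
    for k in range(max_iter):
        if np.linalg.norm(f) < tol:
            return x, k
        if dF:
            X = np.column_stack(dX)
            F = np.column_stack(dF)
            # gamma minimizes ||f - F @ gamma||_2 over the stored history
            gamma, *_ = np.linalg.lstsq(F, f, rcond=None)
            x_new = x + beta * f - (X + beta * F) @ gamma
        else:
            # no history yet: plain damped fixed-point step
            x_new = x + beta * f
        f_new = g(x_new) - x_new
        dX.append(x_new - x)
        dF.append(f_new - f)
        if len(dX) > m:               # drop the oldest pair beyond the window
            dX.pop(0)
            dF.pop(0)
        x, f = x_new, f_new
    return x, max_iter


# Hypothetical test problem: a contractive linear map g(x) = A x + b.
A = np.array([[0.5, 0.2], [0.1, 0.3]])
b = np.array([1.0, 2.0])
x_star, iters = anderson_mixing(lambda x: A @ x + b, x0=np.zeros(2))
print(x_star, iters)
```

On this linear example the fixed point can be checked against a direct solve of (I - A) x = b. A stochastic variant would replace the exact residual with a noisy gradient estimate, which is where the stabilizing modifications named in the abstract come in.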

* Accepted by the 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

View paper on OpenReview
