In this paper, we propose a novel stochastic gradient estimator---ProbAbilistic Gradient Estimator (PAGE)---for nonconvex optimization. PAGE is easy to implement as it is designed via a small adjustment to vanilla SGD: in each iteration, PAGE uses the vanilla minibatch SGD update with probability $p$, and reuses the previous gradient with a small adjustment (at a much lower computational cost) with probability $1-p$. We give a simple formula for the optimal choice of $p$. We prove tight lower bounds for nonconvex problems, which are of independent interest. Moreover, we prove matching upper bounds in both the finite-sum and online regimes, which establish that PAGE is an optimal method. In addition, we show that for nonconvex functions satisfying the Polyak-\L{}ojasiewicz (PL) condition, PAGE automatically switches to a faster linear convergence rate. Finally, we conduct several deep learning experiments (e.g., LeNet, VGG, ResNet) on real datasets in PyTorch, and the results demonstrate that PAGE converges much faster than SGD in training and also achieves higher test accuracy, validating our theoretical results and confirming the practical superiority of PAGE.
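To make the probabilistic update concrete, the following is a minimal Python sketch of the PAGE recursion described above. It is illustrative only: the oracle \texttt{grads(x, idx)}, the batch sizes \texttt{b\_large} and \texttt{b\_small}, and the step size \texttt{eta} are placeholder names for this sketch, not the authors' implementation.

\begin{verbatim}
import numpy as np

def page(grads, n, x0, eta, p, b_large, b_small, num_iters, seed=0):
    """Illustrative PAGE sketch.

    grads(x, idx): averaged gradient of the sampled components f_i,
    i in idx, evaluated at the point x (assumed oracle, not real API).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    idx = rng.choice(n, size=b_large, replace=False)
    g = grads(x, idx)                      # initial large-minibatch gradient
    for _ in range(num_iters):
        x_new = x - eta * g                # SGD-style step with current estimate
        if rng.random() < p:
            # with probability p: vanilla minibatch SGD gradient at the new point
            idx = rng.choice(n, size=b_large, replace=False)
            g = grads(x_new, idx)
        else:
            # with probability 1-p: reuse the previous gradient, corrected by a
            # cheap gradient difference on a small minibatch (same samples
            # evaluated at x_new and x)
            idx = rng.choice(n, size=b_small, replace=False)
            g = g + grads(x_new, idx) - grads(x, idx)
        x = x_new
    return x
\end{verbatim}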