Title: Convergence Guarantees for Stochastic Subgradient Methods in Nonsmooth Nonconvex Optimization

Jul 19, 2023

Nachuan Xiao, Xiaoyin Hu, Kim-Chuan Toh

Abstract: In this paper, we investigate the convergence properties of the stochastic gradient descent (SGD) method and its variants, especially in training neural networks built from nonsmooth activation functions. We develop a novel framework that assigns different timescales to stepsizes for updating the momentum terms and variables, respectively. Under mild conditions, we prove the global convergence of our proposed framework in both single-timescale and two-timescale cases. We show that our proposed framework encompasses a wide range of well-known SGD-type methods, including heavy-ball SGD, SignSGD, Lion, normalized SGD and clipped SGD. Furthermore, when the objective function adopts a finite-sum formulation, we prove the convergence properties for these SGD-type methods based on our proposed framework. In particular, we prove that these SGD-type methods find the Clarke stationary points of the objective function with randomly chosen stepsizes and initial points under mild assumptions. Preliminary numerical experiments demonstrate the high efficiency of our analyzed SGD-type methods.

* 30 pages
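
To make the idea in the abstract more concrete, the following is a minimal sketch of a momentum-based stochastic subgradient loop in which the momentum term and the variable are updated with separate stepsizes. It is not the paper's scheme: the update form, the names `momentum_sgd`, `DIRECTION_MAPS`, `theta`, and `eta`, the polynomially decaying stepsizes, and the test objective f(x) = ||x||_1 are all assumptions chosen for illustration. Lion is left out because its update does not reduce to a simple map of the momentum in this form.

```python
import numpy as np


def subgradient_l1(x):
    """Return one Clarke subgradient of f(x) = ||x||_1.

    np.sign gives 0 at a kink, which lies in the subdifferential [-1, 1].
    """
    return np.sign(x)


# Hypothetical direction maps applied to the momentum term (illustrative only).
DIRECTION_MAPS = {
    "heavy_ball": lambda m: m,
    "sign": np.sign,
    "normalized": lambda m: m / (np.linalg.norm(m) + 1e-12),
    # Rescale m so that its Euclidean norm is at most 1.
    "clipped": lambda m: m * min(1.0, 1.0 / (np.linalg.norm(m) + 1e-12)),
}


def momentum_sgd(x0, variant="heavy_ball", steps=2000, eta0=0.5, theta0=0.5,
                 eta_pow=0.75, theta_pow=0.75, noise=0.1, seed=0):
    """Run a momentum-based stochastic subgradient loop on f(x) = ||x||_1.

    eta is the stepsize for the variable, theta the stepsize for the momentum.
    Equal decay exponents keep the two on the same timescale; choosing
    different exponents puts them on different timescales (an assumption
    about how one might emulate the single- and two-timescale settings).
    """
    rng = np.random.default_rng(seed)
    phi = DIRECTION_MAPS[variant]
    x = np.asarray(x0, dtype=float).copy()
    m = np.zeros_like(x)
    for k in range(steps):
        eta = eta0 / (k + 1) ** eta_pow
        theta = theta0 / (k + 1) ** theta_pow
        # Noisy subgradient: exact subgradient plus Gaussian noise.
        g = subgradient_l1(x) + noise * rng.standard_normal(x.shape)
        m = m + theta * (g - m)   # momentum update
        x = x - eta * phi(m)      # variable update
    return x


if __name__ == "__main__":
    x0 = np.array([2.0, -1.5, 1.0])
    for name in DIRECTION_MAPS:
        x = momentum_sgd(x0, variant=name)
        print(name, np.abs(x).sum())
```

Swapping the entry in `DIRECTION_MAPS` is the only change needed to move between the four variants here, which loosely mirrors the abstract's point that several SGD-type methods can be viewed through one template.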
