Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:Gradient Normalization with(out) Clipping Ensures Convergence of Nonconvex SGD under Heavy-Tailed Noise with Improved Results

Oct 21, 2024

Tao Sun, Xinwang Liu, Kun Yuan

Figure 1 for Gradient Normalization with(out) Clipping Ensures Convergence of Nonconvex SGD under Heavy-Tailed Noise with Improved Results

Figure 2 for Gradient Normalization with(out) Clipping Ensures Convergence of Nonconvex SGD under Heavy-Tailed Noise with Improved Results

Share this with someone who'll enjoy it:

Abstract:This paper investigates Gradient Normalization Stochastic Gradient Descent without Clipping (NSGDC) and its variance reduction variant (NSGDC-VR) for nonconvex optimization under heavy-tailed noise. We present significant improvements in the theoretical results for both algorithms, including the removal of logarithmic factors from the convergence rates and the recovery of the convergence rate to match the deterministic case when the noise variance {\sigma} is zero. Additionally, we demonstrate that gradient normalization alone, assuming individual Lipschitz smoothness, is sufficient to ensure convergence of SGD under heavy-tailed noise, eliminating the need for gradient clipping. Furthermore, we introduce accelerated nonconvex algorithms that utilize second-order Lipschitz smoothness to achieve enhanced convergence rates in the presence of heavy-tailed noise. Our findings offer a deeper understanding of how gradient normalization and variance reduction techniques can be optimized for robust performance in challenging optimization scenarios.

View paper on

Share this with someone who'll enjoy it:

Title:Gradient Normalization with(out) Clipping Ensures Convergence of Nonconvex SGD under Heavy-Tailed Noise with Improved Results

Paper and Code