Abstract: Nesterov's accelerated gradient method (NAG) marks a pivotal advancement in gradient-based optimization, achieving faster convergence than the vanilla gradient descent method for convex functions. However, its algorithmic complexity when applied to strongly convex functions remains unknown, as noted in the comprehensive review by Chambolle and Pock [2016]. This issue was addressed by Li et al. [2024b], except at the critical step size, and the monotonic case was further explored by Fu and Shi [2024]. In this paper, we introduce a family of controllable momentum coefficients for forward-backward accelerated methods, focusing on the critical step size $s=1/L$. Unlike traditional linear forms, the proposed momentum coefficients follow an $\alpha$-th power structure, where the parameter $r$ is adaptively tuned to $\alpha$. Using a Lyapunov function specifically designed for $\alpha$, we establish a controllable $O\left(1/k^{2\alpha}\right)$ convergence rate for the NAG-$\alpha$ method, provided that $r > 2\alpha$. At the critical step size, NAG-$\alpha$ achieves an inverse polynomial convergence rate of arbitrary degree by adjusting $r$ according to $\alpha > 0$. We further simplify the Lyapunov function by expressing it in terms of the iterative sequences $x_k$ and $y_k$, eliminating the need for phase-space representations. This simplification enables us to extend the controllable $O\left(1/k^{2\alpha}\right)$ rate to the monotonic variant, M-NAG-$\alpha$, thereby enhancing optimization efficiency. Finally, by leveraging the fundamental inequality for composite functions, we extend the controllable $O\left(1/k^{2\alpha}\right)$ rate to proximal algorithms, including the fast iterative shrinkage-thresholding algorithm (FISTA-$\alpha$) and its monotonic counterpart (M-FISTA-$\alpha$).
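To make the forward-backward iteration at the critical step size concrete, the following is a minimal Python sketch of a NAG-style method with a controllable momentum coefficient. The specific $\alpha$-th power form $(k/(k+r))^{\alpha}$ used below, as well as the function name `nag_alpha`, are illustrative placeholders and are not the exact coefficients defined in the abstract above.

```python
import numpy as np

def nag_alpha(grad_f, x0, L, alpha=1.0, r=4.0, num_iters=500):
    """Illustrative NAG-style iteration with an alpha-th power momentum
    coefficient at the critical step size s = 1/L.

    The momentum form (k / (k + r)) ** alpha below is a placeholder for the
    controllable coefficient; the paper's exact definition may differ.
    """
    s = 1.0 / L                                 # critical step size
    x_prev = x0.copy()
    x = x0.copy()
    for k in range(1, num_iters + 1):
        momentum = (k / (k + r)) ** alpha       # hypothetical alpha-th power coefficient
        y = x + momentum * (x - x_prev)         # extrapolation (momentum) step
        x_prev = x
        x = y - s * grad_f(y)                   # gradient (forward) step
    return x

# Example: minimize f(x) = 0.5 * ||A x - b||^2, whose gradient is L-Lipschitz,
# with alpha = 1.5 and r = 5 chosen so that r > 2 * alpha.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
b = rng.standard_normal(50)
L = np.linalg.norm(A, 2) ** 2
x_star = nag_alpha(lambda x: A.T @ (A @ x - b), np.zeros(20), L, alpha=1.5, r=5.0)
```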
Abstract: In the realm of gradient-based optimization, Nesterov's accelerated gradient method (NAG) is a landmark advancement, achieving an accelerated convergence rate that outperforms the vanilla gradient descent method for convex functions. However, for strongly convex functions, whether NAG converges linearly remains an open question, as noted in the comprehensive review by Chambolle and Pock [2016]. This issue was addressed by Li et al. [2024a], except at the critical step size, using a high-resolution differential equation framework. Furthermore, Beck [2017, Section 10.7.4] introduced a monotonically convergent variant of NAG, referred to as M-NAG. Despite these developments, the Lyapunov analysis presented by Li et al. [2024a] cannot be directly extended to M-NAG. In this paper, we propose a modification to the iterative relation by introducing a gradient term, leading to a new gradient-based iterative relation. This adjustment allows for the construction of a novel Lyapunov function that excludes kinetic energy. The linear convergence derived from this Lyapunov function is independent of both the parameters of the strongly convex functions and the step size, yielding a more general and robust result. Notably, we observe that the gradient iterative relation derived from M-NAG is equivalent to that from NAG when the position-velocity relation is applied. However, the Lyapunov analysis does not rely on the position-velocity relation, which allows us to extend the linear convergence to M-NAG. Finally, by utilizing two proximal inequalities, which serve as the proximal counterparts of the strongly convex inequalities, we extend the linear convergence to both the fast iterative shrinkage-thresholding algorithm (FISTA) and its monotonic counterpart (M-FISTA).
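For reference, below is a minimal Python sketch of the standard FISTA iteration and a monotone variant in the spirit of M-FISTA (Beck [2017, Section 10.7.4]). The $\ell_1$-regularized least-squares objective and the soft-thresholding proximal map are illustrative choices for the composite setting; they are not part of the analysis above, which applies to general composite objectives.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal map of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def fista(A, b, lam, num_iters=500, monotone=False):
    """FISTA for min_x 0.5*||Ax - b||^2 + lam*||x||_1.

    With monotone=True, the iterate is only replaced when the objective
    decreases, giving an M-FISTA-style monotone variant."""
    L = np.linalg.norm(A, 2) ** 2                 # Lipschitz constant of the smooth part
    s = 1.0 / L                                   # step size
    F = lambda x: 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))
    x = np.zeros(A.shape[1])
    y = x.copy()
    t = 1.0
    for _ in range(num_iters):
        # forward (gradient) step on the smooth part, backward (proximal) step on the l1 part
        z = soft_threshold(y - s * (A.T @ (A @ y - b)), s * lam)
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t ** 2))
        if monotone:
            x_next = z if F(z) <= F(x) else x     # keep whichever iterate has the smaller objective
            y = x_next + (t / t_next) * (z - x_next) + ((t - 1.0) / t_next) * (x_next - x)
        else:
            x_next = z
            y = x_next + ((t - 1.0) / t_next) * (x_next - x)
        x, t = x_next, t_next
    return x
```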