Abstract: This paper develops novel shuffling gradient-based methods for tackling two classes of minimax problems: nonconvex-linear and nonconvex-strongly concave settings. The first algorithm addresses the nonconvex-linear minimax model and achieves the state-of-the-art oracle complexity typically observed in nonconvex optimization. It also employs a new shuffling estimator for the "hyper-gradient", departing from standard shuffling techniques in optimization. The second method consists of two variants: semi-shuffling and full-shuffling schemes. These variants tackle the nonconvex-strongly concave minimax setting. We establish their oracle complexity bounds under standard assumptions, which, to the best of our knowledge, are the best known for this specific setting. Numerical examples demonstrate the performance of our algorithms and compare them with two other methods. Our results show that the new methods achieve performance comparable to SGD, supporting the potential of incorporating shuffling strategies into minimax algorithms.
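To make the shuffling idea concrete, here is a minimal sketch of one random-reshuffling epoch of alternating stochastic descent-ascent over a finite sum of $n$ components; the per-sample oracles `grad_x` and `grad_y` are hypothetical interfaces, and the paper's hyper-gradient estimator and step-size schedules are more involved than this.

```python
import numpy as np

def shuffling_minimax_epoch(x, y, grad_x, grad_y, n, eta_x, eta_y, rng):
    """One random-reshuffling epoch of alternating descent-ascent on a
    finite-sum minimax objective. grad_x(x, y, i) and grad_y(x, y, i)
    return the i-th component's partial gradients (hypothetical API)."""
    for i in rng.permutation(n):         # reshuffle the n samples once per epoch
        x = x - eta_x * grad_x(x, y, i)  # descent step on the primal variable
        y = y + eta_y * grad_y(x, y, i)  # ascent step on the dual variable
    return x, y
```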
Abstract: This paper presents a comprehensive analysis of the well-known extragradient (EG) method for solving both equations and inclusions. First, we unify and generalize EG for [non]linear equations to a wider class of algorithms, encompassing various existing schemes and potentially new variants. We then analyze both sublinear ``best-iterate'' and ``last-iterate'' convergence rates for the entire class of algorithms and derive new convergence results for two well-known instances. Second, we extend our EG framework to ``monotone'' inclusions, introducing a new class of algorithms and its corresponding convergence results. Third, we unify and generalize Tseng's forward-backward-forward splitting (FBFS) method to a broader class of algorithms for solving [non]linear inclusions when a weak-Minty solution exists, and establish its ``best-iterate'' convergence rate. Fourth, to complete the picture, we investigate sublinear rates of two other common variants of EG using the EG analysis framework developed here: the reflected forward-backward splitting and the golden ratio methods. Finally, we conduct extensive numerical experiments to validate our theoretical findings. Our results demonstrate that several new variants of our proposed algorithms outperform existing schemes in the majority of examples.
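For reference, the classical two-call EG iteration that this class generalizes can be sketched as follows for an unconstrained operator equation $F(x) = 0$; this is the textbook scheme, not the unified algorithm of the paper.

```python
import numpy as np

def extragradient(F, x0, gamma, iters):
    """Classical extragradient iteration: an extrapolation step followed
    by the actual update, each using one evaluation of the operator F."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        y = x - gamma * F(x)   # extrapolation (first operator call)
        x = x - gamma * F(y)   # update (second operator call)
    return x
```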
Abstract: We propose a novel class of Nesterov's stochastic accelerated forward-reflected-based methods with variance reduction to solve root-finding problems under $\frac{1}{L}$-co-coerciveness. Our algorithm is single-loop and leverages a new family of unbiased variance-reduced estimators specifically designed for root-finding problems. It achieves both $\mathcal{O}(L^2/k^2)$ and $o(1/k^2)$ last-iterate convergence rates in terms of the expected squared operator norm, where $k$ denotes the iteration counter. We instantiate our framework for two prominent estimators: SVRG and SAGA. With an appropriate choice of parameters, both variants attain an oracle complexity of $\mathcal{O}(n + Ln^{2/3}\epsilon^{-1})$ to reach an $\epsilon$-solution, where $n$ represents the number of summands in the finite-sum operator. Furthermore, under $\mu$-strong quasi-monotonicity, our method achieves a linear convergence rate and an oracle complexity of $\mathcal{O}(n + \kappa n^{2/3}\log(\epsilon^{-1}))$, where $\kappa := \frac{L}{\mu}$. We extend our approach to solve a class of finite-sum monotone inclusions, demonstrating that our schemes retain the same theoretical guarantees as in the equation setting. Finally, numerical experiments validate our algorithms and demonstrate their promising performance compared to state-of-the-art methods.
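As a point of reference, a generic SVRG-type unbiased estimator for a finite-sum operator $G = \frac{1}{n}\sum_{i=1}^n G_i$ can be sketched as below; the paper's estimator family is tailored to its accelerated scheme, so this shows only the standard construction.

```python
def svrg_estimator(G_i, x, w, G_full_w, i):
    """Generic SVRG-style estimator of G(x) for G = (1/n) * sum_i G_i:
    the full operator value G_full_w = G(w) at a snapshot point w is
    corrected by the change in one sampled component i. It is unbiased
    since E_i[G_i(x) - G_i(w)] = G(x) - G(w)."""
    return G_i(x, i) - G_i(w, i) + G_full_w
```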
Abstract: The Stochastic Gradient Descent method (SGD) and its stochastic variants have become the methods of choice for solving finite-sum optimization problems arising in machine learning and data science, thanks to their ability to handle large-scale applications and big datasets. Over the last decades, researchers have made substantial efforts to study the theoretical performance of SGD and its shuffling variants. However, only limited work has investigated its shuffling momentum variants, including shuffling heavy-ball momentum schemes for nonconvex problems and Nesterov's momentum for convex settings. In this work, we extend the analysis of the shuffling momentum gradient method developed in [Tran et al. (2021)] to both finite-sum convex and strongly convex optimization problems. We provide the first analysis of shuffling momentum-based methods for the strongly convex setting, attaining a convergence rate of $O(1/nT^2)$, where $n$ is the number of samples and $T$ is the number of training epochs. Our analysis is state-of-the-art, matching the best rates of existing shuffling stochastic gradient algorithms in the literature.
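One epoch of a shuffling gradient method with heavy-ball momentum can be sketched as follows; this is a minimal illustration in the spirit of the analyzed scheme, with `grad` a hypothetical per-sample gradient oracle and without the paper's learning-rate schedule.

```python
import numpy as np

def shuffling_momentum_epoch(x, v, grad, n, lr, beta, rng):
    """One epoch of shuffling SGD with heavy-ball momentum: draw a single
    random permutation, then sweep through the n samples incrementally.
    grad(x, i) returns the gradient of the i-th component at x."""
    for i in rng.permutation(n):
        v = beta * v + grad(x, i)   # heavy-ball momentum buffer
        x = x - lr * v              # parameter update
    return x, v
```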
Abstract: The extragradient method (EG), introduced by G. M. Korpelevich in 1976, is a well-known method to approximate solutions of saddle-point problems and their extensions, such as variational inequalities and monotone inclusions. Over the years, numerous variants of EG have been proposed and studied in the literature. Recently, these methods have gained popularity due to new applications in machine learning and robust optimization. In this work, we survey the latest developments of the EG method and its variants for approximating solutions of nonlinear equations and inclusions, with a focus on the monotonicity and co-hypomonotonicity settings. We provide a unified convergence analysis for different classes of algorithms, with an emphasis on sublinear best-iterate and last-iterate convergence rates. We also discuss recent accelerated variants of EG based on both the Halpern fixed-point iteration and Nesterov's acceleration techniques. Our approach uses simple arguments and basic mathematical tools to keep the proofs as elementary as possible, while maintaining enough generality to cover a broad range of problems.
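For orientation, the classical EG update surveyed here reads, for a variational inequality with an $L$-Lipschitz monotone operator $F$ over a closed convex set $\mathcal{X}$ (a standard textbook statement):
\[
\begin{aligned}
y_k &= \mathrm{proj}_{\mathcal{X}}\big(x_k - \gamma F(x_k)\big), \\
x_{k+1} &= \mathrm{proj}_{\mathcal{X}}\big(x_k - \gamma F(y_k)\big),
\end{aligned}
\qquad 0 < \gamma < \tfrac{1}{L}.
\]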
Abstract: In this paper, we develop two ``Nesterov's accelerated'' variants of the well-known extragradient method to approximate a solution of a co-hypomonotone inclusion given by the sum of two operators, where one is Lipschitz continuous and the other is possibly multivalued. The first scheme can be viewed as an accelerated variant of Tseng's forward-backward-forward splitting method, while the second is a variant of the reflected forward-backward splitting method, which requires only one evaluation of the Lipschitz operator and one resolvent of the multivalued operator per iteration. Under a proper choice of the algorithmic parameters and appropriate conditions on the co-hypomonotonicity parameter, we theoretically prove that both algorithms achieve $\mathcal{O}(1/k)$ convergence rates on the norm of the residual, where $k$ is the iteration counter. Our results can be viewed as alternatives to a recent class of Halpern-type schemes for root-finding problems.
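The base (non-accelerated) reflected forward-backward step underlying the second scheme can be sketched as follows for the inclusion $0 \in F(x) + T(x)$; the Nesterov-type acceleration and parameter conditions of the paper are omitted, and `resolvent` stands for $J_{\gamma T} = (I + \gamma T)^{-1}$.

```python
def reflected_fbs_step(x, x_prev, F, resolvent, gamma):
    """One reflected forward-backward step for 0 in F(x) + T(x): a single
    evaluation of the Lipschitz operator F at the reflected point
    2*x - x_prev, then one resolvent of the multivalued operator T."""
    x_next = resolvent(x - gamma * F(2 * x - x_prev), gamma)
    return x_next, x   # new iterate and the new "previous" point
```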
Abstract: In this paper, we develop an accelerated randomized block-coordinate algorithm to approximate a solution of a co-coercive equation. Such an equation plays a central role in optimization and related fields and covers many mathematical models as special cases, including convex optimization, convex-concave minimax, and variational inequality problems. Our algorithm relies on a recent interpretation of the Halpern fixed-point iteration through Nesterov's acceleration in [48]. We establish that the new algorithm achieves an $\mathcal{O}(1/k^2)$ convergence rate on $\mathbb{E}[\Vert Gx^k\Vert^2]$ at the last iterate, where $G$ is the underlying co-coercive operator, $\mathbb{E}[\cdot]$ is the expectation, and $k$ is the iteration counter. This rate is significantly faster than the $\mathcal{O}(1/k)$ rates of standard forward or gradient-based methods from the literature. We also prove $o(1/k^2)$ rates on both $\mathbb{E}[\Vert Gx^k\Vert^2]$ and $\mathbb{E}[\Vert x^{k+1} - x^{k}\Vert^2]$. Next, we apply our method to derive two accelerated randomized block-coordinate variants of the forward-backward splitting and Douglas-Rachford splitting schemes, respectively, for solving a monotone inclusion involving the sum of two operators. As a byproduct, these variants also enjoy faster convergence rates than their non-accelerated counterparts. Finally, we apply our scheme to a finite-sum monotone inclusion that has various applications in machine learning and statistical learning, including federated learning. As a result, we obtain a novel federated learning-type algorithm with fast and provable convergence rates.
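A bare-bones sketch of the underlying idea, with Halpern anchoring weight $\beta_k = 1/(k+2)$ and one randomly sampled coordinate block updated per iteration, might look as follows; the paper's accelerated scheme uses a refined Nesterov-type reformulation and different parameters, so this is illustrative only.

```python
import numpy as np

def rbc_halpern(G, x0, n_blocks, eta, iters, rng):
    """Illustrative randomized block-coordinate Halpern-style iteration
    for a co-coercive equation G(x) = 0: anchor toward the starting
    point x0 and update one random coordinate block per iteration."""
    x0 = np.asarray(x0, dtype=float)
    x = x0.copy()
    blocks = np.array_split(np.arange(x.size), n_blocks)
    for k in range(iters):
        beta = 1.0 / (k + 2)                 # Halpern anchoring weight
        b = blocks[rng.integers(n_blocks)]   # sample one coordinate block
        g = G(x)                             # in practice only block b of G is needed
        x[b] = beta * x0[b] + (1 - beta) * (x[b] - eta * g[b])
    return x
```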
Abstract: In this book chapter, we briefly describe the main components that constitute the gradient descent method and its accelerated and stochastic variants. We aim to explain these components from a mathematical point of view, covering both theoretical and practical aspects, but at an elementary level. We focus on basic variants of the gradient descent method and then extend our view to recent variants, especially variance-reduced stochastic gradient descent (SGD) schemes. Our approach relies on revealing the structures present in the problem and the assumptions imposed on the objective function. Our convergence analysis unifies several known results and relies on a general but elementary recursive expression. We illustrate this analysis on several common schemes.
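The basic scheme the chapter starts from is plain gradient descent, $x_{k+1} = x_k - \eta \nabla f(x_k)$; a minimal sketch with a toy quadratic example:

```python
import numpy as np

def gradient_descent(grad, x0, lr, iters):
    """Plain gradient descent: repeat x <- x - lr * grad(x)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x - lr * grad(x)
    return x

# Toy example: f(x) = 0.5 * ||x||^2 has gradient x and minimizer x* = 0.
x_final = gradient_descent(lambda x: x, np.ones(3), lr=0.5, iters=50)
```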
Abstract: In this paper, we develop a new class of accelerated algorithms to solve some classes of maximally monotone equations as well as monotone inclusions. Instead of using Nesterov's acceleration approach, our methods rely on a so-called Halpern-type fixed-point iteration from [32], recently exploited by a number of researchers, including [24, 70]. First, we derive a new variant of the anchored extragradient scheme in [70] based on Popov's past extragradient method to solve a maximally monotone equation $G(x) = 0$. We show that our method achieves the same $\mathcal{O}(1/k)$ convergence rate (up to a constant factor) as the anchored extragradient algorithm on the operator norm $\Vert G(x_k)\Vert$, but requires only one evaluation of $G$ per iteration, where $k$ is the iteration counter. Next, we develop two splitting algorithms to approximate a zero point of the sum of two maximally monotone operators. The first algorithm originates from the anchored extragradient method combined with a splitting technique, while the second one is its Popov variant, which reduces the per-iteration complexity. Both algorithms appear to be new and can be viewed as accelerated variants of the Douglas-Rachford (DR) splitting method. They both achieve $\mathcal{O}(1/k)$ rates on the norm $\Vert G_{\gamma}(x_k)\Vert$ of the forward-backward residual operator $G_{\gamma}(\cdot)$ associated with the problem. We also propose a new accelerated Douglas-Rachford splitting scheme for this problem, which achieves an $\mathcal{O}(1/k)$ convergence rate on $\Vert G_{\gamma}(x_k)\Vert$ under only maximal monotonicity assumptions. Finally, we specialize our first algorithm to solve convex-concave minimax problems and apply our accelerated DR scheme to derive a new variant of the alternating direction method of multipliers (ADMM).
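A rough sketch of the one-call idea: combine Popov's past-extragradient extrapolation (which reuses the previous operator value) with a Halpern anchor toward the initial point $x^0$. The anchoring weights and step size below ($\beta_k = 1/(k+2)$, fixed $\gamma$) are placeholders, not the paper's exact parameters.

```python
import numpy as np

def anchored_popov(G, x0, gamma, iters):
    """Illustrative anchored Popov iteration for G(x) = 0: only ONE
    evaluation of G per iteration, since the extrapolation step reuses
    the operator value computed in the previous iteration."""
    x0 = np.asarray(x0, dtype=float)
    x = x0.copy()
    g_prev = G(x)                                        # initial operator value
    for k in range(iters):
        beta = 1.0 / (k + 2)                             # Halpern anchoring weight
        y = beta * x0 + (1 - beta) * x - gamma * g_prev  # extrapolation reuses g_prev
        g_prev = G(y)                                    # the single evaluation this step
        x = beta * x0 + (1 - beta) * x - gamma * g_prev  # anchored update
    return x
```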
Abstract: In this paper, we develop two new algorithms, called \textbf{FedDR} and \textbf{asyncFedDR}, for solving a fundamental nonconvex optimization problem in federated learning. Our algorithms rely on a novel combination of a nonconvex Douglas-Rachford splitting method, randomized block-coordinate strategies, and asynchronous implementation. Unlike recent methods in the literature, e.g., FedSplit and FedPD, our algorithms update only a subset of users at each communication round, possibly in an asynchronous mode, making them more practical. These new algorithms also achieve communication efficiency and, more importantly, can handle statistical and system heterogeneity, the two main challenges in federated learning. Our convergence analysis shows that the new algorithms match the communication complexity lower bound up to a constant factor under standard assumptions. Our numerical experiments illustrate the advantages of the proposed methods over existing ones on both synthetic and real datasets.
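A hedged sketch of one FedDR-style synchronous round: each active user performs a Douglas-Rachford-type update (a correction toward the server point, a local proximal step, and a reflection), and the server averages the reflected points. Variable names, the sampling rule, and `prox_f` (a proximal oracle for user $i$'s local objective) are illustrative, not the exact update of the paper.

```python
import numpy as np

def feddr_round(x_bar, y, x, x_hat, prox_f, alpha, eta, frac, rng):
    """One illustrative FedDR-style round. Arrays y, x, x_hat have shape
    (n_users, d); only a random subset of users is active this round."""
    n_users = y.shape[0]
    active = rng.choice(n_users, size=max(1, int(frac * n_users)), replace=False)
    for i in active:
        y[i] = y[i] + alpha * (x_bar - x[i])  # DR correction toward the server point
        x[i] = prox_f(i, y[i], eta)           # local proximal (training) step
        x_hat[i] = 2.0 * x[i] - y[i]          # reflected point sent to the server
    x_bar = x_hat.mean(axis=0)                # server aggregation
    return x_bar, y, x, x_hat
```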