Dmitry Kovalev

Optimal Projection-Free Adaptive SGD for Matrix Optimization

Apr 02, 2026

Muon is Provably Faster with Momentum Variance Reduction

Dec 18, 2025

Non-Euclidean SGD for Structured Optimization: Unified Analysis and Improved Rates

Nov 14, 2025

Understanding Gradient Orthogonalization for Deep Learning via Non-Euclidean Trust-Region Optimization

Mar 16, 2025

On Linear Convergence in Smooth Convex-Concave Bilinearly-Coupled Saddle-Point Optimization: Lower Bounds and Optimal Algorithms

Nov 21, 2024

An Optimal Algorithm for Strongly Convex Min-min Optimization

Dec 29, 2022

Smooth Monotone Stochastic Variational Inequalities and Saddle Point Problems -- Survey

Aug 29, 2022

Communication Acceleration of Local Gradient Methods via an Accelerated Primal-Dual Algorithm with Inexact Prox

Jul 08, 2022

On Scaled Methods for Saddle Point Problems

Jun 16, 2022

Optimal Gradient Sliding and its Application to Distributed Optimization Under Similarity

May 30, 2022