Panayotis Mertikopoulos

The impact of uncertainty on regularized learning in games

Jun 16, 2025

Pierre-Louis Cauvin, Davide Legacci, Panayotis Mertikopoulos

Abstract: In this paper, we investigate how randomness and uncertainty influence learning in games. Specifically, we examine a perturbed variant of the dynamics of "follow-the-regularized-leader" (FTRL), where the players' payoff observations and strategy updates are continually impacted by random shocks. Our findings reveal that, in a fairly precise sense, "uncertainty favors extremes": in any game, regardless of the noise level, every player's trajectory of play reaches an arbitrarily small neighborhood of a pure strategy in finite time (which we estimate). Moreover, even if the player does not ultimately settle at this strategy, they return arbitrarily close to some (possibly different) pure strategy infinitely often. This prompts the question of which sets of pure strategies emerge as robust predictions of learning under uncertainty. We show that (a) the only possible limits of the FTRL dynamics under uncertainty are pure Nash equilibria; and (b) a span of pure strategies is stable and attracting if and only if it is closed under better replies. Finally, we turn to games where the deterministic dynamics are recurrent - such as zero-sum games with interior equilibria - and we show that randomness disrupts this behavior, causing the stochastic dynamics to drift toward the boundary on average.

* 50 pages, 6 figures
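
The perturbed dynamics described in this abstract can be explored numerically. The following is only a rough sketch, not the paper's model: it assumes an entropic regularizer (so FTRL becomes exponential weights), Matching Pennies as the game, and an Euler-Maruyama discretization with Gaussian shocks added directly to the players' scores. The names `simulate`, `sigma`, and `dt` are illustrative choices.

```python
import math
import random

# Matching Pennies: player 1 receives x1^T A x2, player 2 receives the negative.
A = [[1.0, -1.0], [-1.0, 1.0]]


def softmax(scores):
    """Logit map from scores to a mixed strategy (exponential weights)."""
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def simulate(steps=20000, dt=0.01, sigma=1.0, seed=0):
    """Euler-Maruyama sketch of score dynamics dY = v(X) dt + sigma dW.

    Returns both final strategies and, for each player, the largest
    probability ever placed on a single action (closeness to a pure strategy).
    """
    rng = random.Random(seed)
    y1, y2 = [0.0, 0.0], [0.0, 0.0]  # start at the uniform strategy
    peak1, peak2 = 0.0, 0.0
    for _ in range(steps):
        x1, x2 = softmax(y1), softmax(y2)
        peak1, peak2 = max(peak1, max(x1)), max(peak2, max(x2))
        # Payoff vectors: v1 = A x2 for player 1, v2 = -A^T x1 for player 2.
        v1 = [sum(A[i][j] * x2[j] for j in range(2)) for i in range(2)]
        v2 = [-sum(A[i][j] * x1[i] for i in range(2)) for j in range(2)]
        for k in range(2):
            y1[k] += v1[k] * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            y2[k] += v2[k] * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return softmax(y1), softmax(y2), peak1, peak2
```

Calling `simulate()` with different values of `sigma` and inspecting the two peak values gives a quick, informal way to see how close each player's trajectory comes to a pure strategy over a given horizon.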

The global convergence time of stochastic gradient descent in non-convex landscapes: Sharp estimates via large deviations

Mar 20, 2025

Waïss Azizian, Franck Iutzeler, Jérôme Malick, Panayotis Mertikopoulos

Abstract: In this paper, we examine the time it takes for stochastic gradient descent (SGD) to reach the global minimum of a general, non-convex loss function. We approach this question through the lens of randomly perturbed dynamical systems and large deviations theory, and we provide a tight characterization of the global convergence time of SGD via matching upper and lower bounds. These bounds are dominated by the most "costly" set of obstacles that the algorithm may need to overcome to reach a global minimizer from a given initialization, coupling in this way the global geometry of the underlying loss landscape with the statistics of the noise entering the process. Finally, motivated by applications to the training of deep neural networks, we also provide a series of refinements and extensions of our analysis for loss functions with shallow local minima.

* 62 pages, 5 figures

Accelerated regularized learning in finite N-person games

Dec 29, 2024

Kyriakos Lotidis, Angeliki Giannou, Panayotis Mertikopoulos, Nicholas Bambos

Abstract: Motivated by the success of Nesterov's accelerated gradient algorithm for convex minimization problems, we examine whether it is possible to achieve similar performance gains in the context of online learning in games. To that end, we introduce a family of accelerated learning methods, which we call "follow the accelerated leader" (FTXL), and which incorporates the use of momentum within the general framework of regularized learning - and, in particular, the exponential/multiplicative weights algorithm and its variants. Drawing inspiration and techniques from the continuous-time analysis of Nesterov's algorithm, we show that FTXL converges locally to strict Nash equilibria at a superlinear rate, achieving in this way an exponential speed-up over vanilla regularized learning methods (which, by comparison, converge to strict equilibria at a geometric, linear rate). Importantly, FTXL maintains its superlinear convergence rate in a broad range of feedback structures, from deterministic, full information models to stochastic, realization-based ones, and even when run with bandit, payoff-based information, where players are only able to observe their individual realized payoffs.

* 30 pages, 4 figures

No-regret learning in harmonic games: Extrapolation in the face of conflicting interests

Dec 28, 2024

Davide Legacci, Panayotis Mertikopoulos, Christos H. Papadimitriou, Georgios Piliouras, Bary S. R. Pradelski

Abstract: The long-run behavior of multi-agent learning - and, in particular, no-regret learning - is relatively well-understood in potential games, where players have aligned interests. By contrast, in harmonic games - the strategic counterpart of potential games, where players have conflicting interests - very little is known outside the narrow subclass of 2-player zero-sum games with a fully-mixed equilibrium. Our paper seeks to partially fill this gap by focusing on the full class of (generalized) harmonic games and examining the convergence properties of follow-the-regularized-leader (FTRL), the most widely studied class of no-regret learning schemes. As a first result, we show that the continuous-time dynamics of FTRL are Poincaré recurrent, that is, they return arbitrarily close to their starting point infinitely often, and hence fail to converge. In discrete time, the standard, "vanilla" implementation of FTRL may lead to even worse outcomes, eventually trapping the players in a perpetual cycle of best-responses. However, if FTRL is augmented with a suitable extrapolation step - which includes as special cases the optimistic and mirror-prox variants of FTRL - we show that learning converges to a Nash equilibrium from any initial condition, and all players are guaranteed at most O(1) regret. These results provide an in-depth understanding of no-regret learning in harmonic games, nesting prior work on 2-player zero-sum games, and showing at a high level that harmonic games are the canonical complement of potential games, not only from a strategic, but also from a dynamic viewpoint.

* 36 pages, 5 figures
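
The contrast between vanilla and extrapolated FTRL drawn in this abstract can be illustrated on the simplest case it nests, a 2-player zero-sum game with a fully mixed equilibrium. The sketch below is an assumption-laden toy, not the paper's general method: it uses Matching Pennies, the entropic regularizer (exponential weights), and the optimistic variant, in which the latest payoff vector is counted twice and the previous one is subtracted. The step size `eta`, the starting point, and the function names are illustrative.

```python
import math

# Matching Pennies: player 1 receives x1^T A x2, player 2 receives the negative.
A = [[1.0, -1.0], [-1.0, 1.0]]


def softmax(scores):
    """Logit map from cumulative scores to a mixed strategy."""
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def payoff_vectors(x1, x2):
    """Payoff of each pure action against the opponent's mixed strategy."""
    v1 = [sum(A[i][j] * x2[j] for j in range(2)) for i in range(2)]
    v2 = [-sum(A[i][j] * x1[i] for i in range(2)) for j in range(2)]
    return v1, v2


def distance_to_equilibrium(x1, x2):
    """Euclidean distance to the unique equilibrium (1/2, 1/2) for both players."""
    return math.sqrt(sum((p - 0.5) ** 2 for p in x1 + x2))


def run(optimistic, steps=3000, eta=0.1):
    """Run exponential weights, optionally with an optimistic correction."""
    y1 = [math.log(0.7), math.log(0.3)]  # start away from equilibrium
    y2 = [math.log(0.4), math.log(0.6)]
    prev1, prev2 = [0.0, 0.0], [0.0, 0.0]
    for _ in range(steps):
        x1, x2 = softmax(y1), softmax(y2)
        v1, v2 = payoff_vectors(x1, x2)
        for k in range(2):
            # Vanilla: add v_t. Optimistic: add 2 * v_t - v_{t-1}.
            bonus1 = v1[k] - prev1[k] if optimistic else 0.0
            bonus2 = v2[k] - prev2[k] if optimistic else 0.0
            y1[k] += eta * (v1[k] + bonus1)
            y2[k] += eta * (v2[k] + bonus2)
        prev1, prev2 = v1, v2
    return distance_to_equilibrium(softmax(y1), softmax(y2))
```

Comparing `run(optimistic=False)` with `run(optimistic=True)` shows the vanilla iterates ending farther from the equilibrium than they started, while the optimistic iterates end closer.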

Nested replicator dynamics, nested logit choice, and similarity-based learning

Jul 25, 2024

Panayotis Mertikopoulos, William H. Sandholm

Abstract: We consider a model of learning and evolution in games whose action sets are endowed with a partition-based similarity structure intended to capture exogenous similarities between strategies. In this model, revising agents have a higher probability of comparing their current strategy with other strategies that they deem similar, and they switch to the observed strategy with probability proportional to its payoff excess. Because of this implicit bias toward similar strategies, the resulting dynamics - which we call the nested replicator dynamics - do not satisfy any of the standard monotonicity postulates for imitative game dynamics; nonetheless, we show that they retain the main long-run rationality properties of the replicator dynamics, albeit at quantitatively different rates. We also show that the induced dynamics can be viewed as a stimulus-response model in the spirit of Erev & Roth (1998), with choice probabilities given by the nested logit choice rule of Ben-Akiva (1973) and McFadden (1978). This result generalizes an existing relation between the replicator dynamics and the exponential weights algorithm in online learning, and provides an additional layer of interpretation to our analysis and results.

* 37 pages, 9 figures
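
For readers unfamiliar with the nested logit choice rule mentioned in this abstract, here is a minimal sketch of the textbook two-level version: an action's probability is the probability of its nest times the probability of the action within that nest. The parametrization (one temperature per nest plus a top-level temperature) and all names are assumptions made for illustration and may differ from the paper's.

```python
import math


def nested_logit(scores, nests, nest_temps, top_temp=1.0):
    """Two-level nested logit choice probabilities.

    scores:     dict mapping each action to its score (propensity)
    nests:      dict mapping each nest to the list of actions it contains
    nest_temps: dict mapping each nest to its within-nest temperature
    top_temp:   temperature used when choosing between nests
    """
    within, inclusive = {}, {}
    for nest, actions in nests.items():
        mu = nest_temps[nest]
        m = max(scores[a] for a in actions)
        weights = {a: math.exp((scores[a] - m) / mu) for a in actions}
        total = sum(weights.values())
        # Conditional probability of each action given its nest.
        within[nest] = {a: w / total for a, w in weights.items()}
        # Inclusive value: mu * log(sum_b exp(score_b / mu)), computed stably.
        inclusive[nest] = m + mu * math.log(total)
    top = max(inclusive.values())
    nest_weights = {n: math.exp((v - top) / top_temp) for n, v in inclusive.items()}
    z = sum(nest_weights.values())
    return {
        a: (nest_weights[n] / z) * p
        for n in nests
        for a, p in within[n].items()
    }


# Hypothetical example: two similar "bus" actions share a nest, "car" stands alone.
scores = {"bus_red": 1.0, "bus_blue": 1.0, "car": 1.0}
nests = {"bus": ["bus_red", "bus_blue"], "car": ["car"]}
flat = nested_logit(scores, nests, {"bus": 1.0, "car": 1.0})
nested = nested_logit(scores, nests, {"bus": 0.5, "car": 1.0})
```

When every temperature equals 1, the rule reduces to the ordinary logit (softmax) rule, so `flat` assigns 1/3 to each action. Lowering the temperature of the "bus" nest treats its two actions as closer substitutes, which raises the share of "car" in `nested`.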

What is the long-run distribution of stochastic gradient descent? A large deviations analysis

Jun 13, 2024

Waïss Azizian, Franck Iutzeler, Jérôme Malick, Panayotis Mertikopoulos

Abstract: In this paper, we examine the long-run distribution of stochastic gradient descent (SGD) in general, non-convex problems. Specifically, we seek to understand which regions of the problem's state space are more likely to be visited by SGD, and by how much. Using an approach based on the theory of large deviations and randomly perturbed dynamical systems, we show that the long-run distribution of SGD resembles the Boltzmann-Gibbs distribution of equilibrium thermodynamics with temperature equal to the method's step-size and energy levels determined by the problem's objective and the statistics of the noise. In particular, we show that, in the long run, (a) the problem's critical region is visited exponentially more often than any non-critical region; (b) the iterates of SGD are exponentially concentrated around the problem's minimum energy state (which does not always coincide with the global minimum of the objective); (c) all other connected components of critical points are visited with frequency that is exponentially proportional to their energy level; and, finally (d) any component of local maximizers or saddle points is "dominated" by a component of local minimizers which is visited exponentially more often.

* 70 pages, 3 figures; to be published in the proceedings of ICML 2024
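
The Boltzmann-Gibbs analogy in this abstract can be made concrete with a small calculation. The sketch below only evaluates Gibbs weights proportional to exp(-energy / step-size) for a handful of made-up energy levels. In the paper the energy levels are determined by the objective and the noise statistics, so the numbers and component names here are placeholders, not derived quantities.

```python
import math


def gibbs_weights(energies, temperature):
    """Normalized Boltzmann-Gibbs weights exp(-E / T) / Z for each component."""
    lowest = min(energies.values())
    # Subtract the lowest energy before exponentiating for numerical stability.
    weights = {k: math.exp(-(e - lowest) / temperature) for k, e in energies.items()}
    z = sum(weights.values())
    return {k: w / z for k, w in weights.items()}


# Hypothetical energy levels for three components of critical points.
energies = {"deep minimum": 0.0, "shallow minimum": 0.5, "saddle": 1.0}

large_step = gibbs_weights(energies, temperature=0.5)
small_step = gibbs_weights(energies, temperature=0.05)
```

With the step-size playing the role of temperature, shrinking it from 0.5 to 0.05 moves almost all of the mass onto the minimum-energy component, in line with the exponential concentration described in item (b) of the abstract.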

Tamed Langevin sampling under weaker conditions

May 27, 2024

Iosif Lytras, Panayotis Mertikopoulos

Abstract: Motivated by applications to deep learning which often fail standard Lipschitz smoothness requirements, we examine the problem of sampling from distributions that are not log-concave and are only weakly dissipative, with log-gradients allowed to grow superlinearly at infinity. In terms of structure, we only assume that the target distribution satisfies either a log-Sobolev or a Poincaré inequality and a local Lipschitz smoothness assumption with modulus growing possibly polynomially at infinity. This set of assumptions greatly exceeds the operational limits of the "vanilla" unadjusted Langevin algorithm (ULA), making sampling from such distributions a highly involved affair. To account for this, we introduce a taming scheme which is tailored to the growth and decay properties of the target distribution, and we provide explicit non-asymptotic guarantees for the proposed sampler in terms of the Kullback-Leibler (KL) divergence, total variation, and Wasserstein distance to the target distribution.

* 32 pages, 2 figures
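
To show what "taming" means in practice, the sketch below uses the classical taming rule from the tamed Euler literature, where the drift is divided by 1 + step * |gradient|. This is not the tailored scheme proposed in the paper. The potential U(x) = x^4 / 4, whose gradient x^3 grows superlinearly, and all names and parameter values are assumptions chosen for illustration.

```python
import math
import random


def grad_potential(x):
    """Gradient of the illustrative potential U(x) = x**4 / 4."""
    return x ** 3


def tamed_drift(x, step):
    """Classical tamed drift: step * g / (1 + step * |g|), always below 1 in size."""
    g = grad_potential(x)
    return step * g / (1.0 + step * abs(g))


def tamed_langevin(x0, step=0.1, n_steps=1000, seed=0):
    """One-dimensional tamed unadjusted Langevin chain started at x0."""
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(n_steps):
        noise = math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
        x = x - tamed_drift(x, step) + noise
        path.append(x)
    return path


path = tamed_langevin(x0=10.0)
untamed_first_move = 0.1 * grad_potential(10.0)  # size of a plain ULA drift step
tamed_first_move = tamed_drift(10.0, 0.1)
```

From the starting point 10 with step 0.1, the plain ULA drift would move the iterate by 100 in a single step and overshoot badly, whereas the tamed drift moves it by less than 1, which is what keeps the chain from blowing up under superlinear gradient growth.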

A geometric decomposition of finite games: Convergence vs. recurrence under no-regret learning

May 12, 2024

Davide Legacci, Panayotis Mertikopoulos, Bary Pradelski

Abstract: In view of the complexity of the dynamics of no-regret learning in games, we seek to decompose a finite game into simpler components where the day-to-day behavior of the dynamics is well understood. A natural starting point for this is Helmholtz's theorem, which resolves a vector field into a potential and an incompressible component. However, the geometry of no-regret dynamics - and, in particular, the dynamics of exponential/multiplicative weights (EW) schemes - is not compatible with the Euclidean underpinnings of Helmholtz's theorem, leading us to consider a Riemannian framework based on the Shahshahani metric. Using this geometric construction, we introduce the class of incompressible games, and we prove the following results: First, in addition to being volume-preserving, the continuous-time EW dynamics in incompressible games admit a constant of motion and are Poincaré recurrent - i.e., almost every trajectory of play comes arbitrarily close to its starting point infinitely often. Second, we establish a deep connection with a well-known decomposition of games into a potential and harmonic component (where the players' objectives are aligned and anti-aligned respectively): a game is incompressible if and only if it is harmonic, implying in turn that the EW dynamics lead to Poincaré recurrence in harmonic games.

* 50 pages, 16 figures

Exploiting hidden structures in non-convex games for convergence to Nash equilibrium

Dec 27, 2023

Iosif Sakos, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Panayotis Mertikopoulos, Georgios Piliouras

Abstract: A wide array of modern machine learning applications - from adversarial models to multi-agent reinforcement learning - can be formulated as non-cooperative games whose Nash equilibria represent the system's desired operational states. Despite having a highly non-convex loss landscape, many cases of interest possess a latent convex structure that could potentially be leveraged to yield convergence to equilibrium. Driven by this observation, our paper proposes a flexible first-order method that successfully exploits such "hidden structures" and achieves convergence under minimal assumptions for the transformation connecting the players' control variables to the game's latent, convex-structured layer. The proposed method - which we call preconditioned hidden gradient descent (PHGD) - hinges on a judiciously chosen gradient preconditioning scheme related to natural gradient methods. Importantly, we make no separability assumptions for the game's hidden structure, and we provide explicit convergence rate guarantees for both deterministic and stochastic environments.

* 32 pages, 18 figures

A Quadratic Speedup in Finding Nash Equilibria of Quantum Zero-Sum Games

Nov 17, 2023

Francisca Vasconcelos, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Panayotis Mertikopoulos, Georgios Piliouras, Michael I. Jordan

Abstract: Recent developments in domains such as non-local games, quantum interactive proofs, and quantum generative adversarial networks have renewed interest in quantum game theory and, specifically, quantum zero-sum games. Central to classical game theory is the efficient algorithmic computation of Nash equilibria, which represent optimal strategies for both players. In 2008, Jain and Watrous proposed the first classical algorithm for computing equilibria in quantum zero-sum games using the Matrix Multiplicative Weight Updates (MMWU) method to achieve a convergence rate of $\mathcal{O}(d/\epsilon^2)$ iterations to $\epsilon$-Nash equilibria in the $4^d$-dimensional spectraplex. In this work, we propose a hierarchy of quantum optimization algorithms that generalize MMWU via an extra-gradient mechanism. Notably, within this proposed hierarchy, we introduce the Optimistic Matrix Multiplicative Weights Update (OMMWU) algorithm and establish its average-iterate convergence complexity as $\mathcal{O}(d/\epsilon)$ iterations to $\epsilon$-Nash equilibria. This quadratic speed-up relative to Jain and Watrous' original algorithm sets a new benchmark for computing $\epsilon$-Nash equilibria in quantum zero-sum games.

* 53 pages, 7 figures, QTML 2023 (Accepted (Long Talk))
