Abstract:Higher-order interactions underlie complex phenomena in systems such as biological and artificial neural networks, but their study is challenging due to the lack of tractable standard models. By leveraging the maximum entropy principle in curved statistical manifolds, here we introduce curved neural networks as a class of prototypical models for studying higher-order phenomena. Through exact mean-field descriptions, we show that these curved neural networks implement a self-regulating annealing process that can accelerate memory retrieval, leading to explosive order-disorder phase transitions with multi-stability and hysteresis effects. Moreover, by analytically exploring their memory capacity using the replica trick near ferromagnetic and spin-glass phase boundaries, we demonstrate that these networks enhance memory capacity over the classical associative-memory networks. Overall, the proposed framework provides parsimonious models amenable to analytical study, revealing novel higher-order phenomena in complex network systems.
Abstract:Transformer-based models have demonstrated exceptional performance across diverse domains, becoming the state-of-the-art solution for addressing sequential machine learning problems. Even though we have a general understanding of the fundamental components in the transformer architecture, little is known about how they operate or what are their expected dynamics. Recently, there has been an increasing interest in exploring the relationship between attention mechanisms and Hopfield networks, promising to shed light on the statistical physics of transformer networks. However, to date, the dynamical regimes of transformer-like models have not been studied in depth. In this paper, we address this gap by using methods for the study of asymmetric Hopfield networks in nonequilibrium regimes --namely path integral methods over generating functionals, yielding dynamics governed by concurrent mean-field variables. Assuming 1-bit tokens and weights, we derive analytical approximations for the behavior of large self-attention neural networks coupled to a softmax output, which become exact in the large limit size. Our findings reveal nontrivial dynamical phenomena, including nonequilibrium phase transitions associated with chaotic bifurcations, even for very simple configurations with a few encoded features and a very short context window. Finally, we discuss the potential of our analytic approach to improve our understanding of the inner workings of transformer models, potentially reducing computational training costs and enhancing model interpretability.
Abstract:Many biological and cognitive systems do not operate deep within one or other regime of activity. Instead, they are poised at critical points located at phase transitions in their parameter space. The pervasiveness of criticality suggests that there may be general principles inducing this behaviour, yet there is no well-founded theory for understanding how criticality is generated at a wide span of levels and contexts. In order to explore how criticality might emerge from general adaptive mechanisms, we propose a simple learning rule that maintains an internal organizational structure from a specific family of systems at criticality. We implement the mechanism in artificial embodied agents controlled by a neural network maintaining a correlation structure randomly sampled from an Ising model at critical temperature. Agents are evaluated in two classical reinforcement learning scenarios: the Mountain Car and the Acrobot double pendulum. In both cases the neural controller appears to reach a point of criticality, which coincides with a transition point between two regimes of the agent's behaviour. These results suggest that adaptation to criticality could be used as a general adaptive mechanism in some circumstances, providing an alternative explanation for the pervasive presence of criticality in biological and cognitive systems.
Abstract:This paper outlines a methodological approach for designing adaptive agents driving themselves near points of criticality. Using a synthetic approach we construct a conceptual model that, instead of specifying mechanistic requirements to generate criticality, exploits the maintenance of an organizational structure capable of reproducing critical behavior. Our approach exploits the well-known principle of universality, which classifies critical phenomena inside a few universality classes of systems independently of their specific mechanisms or topologies. In particular, we implement an artificial embodied agent controlled by a neural network maintaining a correlation structure randomly sampled from a lattice Ising model at a critical point. We evaluate the agent in two classical reinforcement learning scenarios: the Mountain Car benchmark and the Acrobot double pendulum, finding that in both cases the neural controller reaches a point of criticality, which coincides with a transition point between two regimes of the agent's behaviour, maximizing the mutual information between neurons and sensorimotor patterns. Finally, we discuss the possible applications of this synthetic approach to the comprehension of deeper principles connected to the pervasive presence of criticality in biological and cognitive systems.
Abstract:Many biological and cognitive systems do not operate deep into one or other regime of activity. Instead, they exploit critical surfaces poised at transitions in their parameter space. The pervasiveness of criticality in natural systems suggests that there may be general principles inducing this behaviour. However, there is a lack of conceptual models explaining how embodied agents propel themselves towards these critical points. In this paper, we present a learning model driving an embodied Boltzmann Machine towards critical behaviour by maximizing the heat capacity of the network. We test and corroborate the model implementing an embodied agent in the mountain car benchmark, controlled by a Boltzmann Machine that adjust its weights according to the model. We find that the neural controller reaches a point of criticality, which coincides with a transition point of the behaviour of the agent between two regimes of behaviour, maximizing the synergistic information between its sensors and the hidden and motor neurons. Finally, we discuss the potential of our learning model to study the contribution of criticality to the behaviour of embodied living systems in scenarios not necessarily constrained by biological restrictions of the examples of criticality we find in nature.