Manuel Brenner

Uncovering the Functional Roles of Nonlinearity in Memory

Jun 09, 2025

Manuel Brenner, Georgia Koppe

Abstract: Memory and long-range temporal processing are core requirements for sequence modeling tasks across natural language processing, time-series forecasting, speech recognition, and control. While nonlinear recurrence has long been viewed as essential for enabling such mechanisms, recent work suggests that linear dynamics may often suffice. In this study, we go beyond performance comparisons to systematically dissect the functional role of nonlinearity in recurrent networks, identifying both when it is computationally necessary and what mechanisms it enables. We use Almost Linear Recurrent Neural Networks (AL-RNNs), which allow fine-grained control over nonlinearity, as both a flexible modeling tool and a probe into the internal mechanisms of memory. Across a range of classic sequence modeling tasks and a real-world stimulus selection task, we find that minimal nonlinearity is not only sufficient but often optimal, yielding models that are simpler, more robust, and more interpretable than their fully nonlinear or linear counterparts. Our results provide a principled framework for selectively introducing nonlinearity, bridging dynamical systems theory with the functional demands of long-range memory and structured computation in recurrent neural networks, with implications for both artificial and biological neural systems.

* Preprint under review

Almost-Linear RNNs Yield Highly Interpretable Symbolic Codes in Dynamical Systems Reconstruction

Oct 18, 2024

Manuel Brenner, Christoph Jürgen Hemmer, Zahra Monfared, Daniel Durstewitz

Abstract: Dynamical systems (DS) theory is fundamental for many areas of science and engineering. It can provide deep insights into the behavior of systems evolving in time, as typically described by differential or recursive equations. A common approach to facilitate mathematical tractability and interpretability of DS models involves decomposing nonlinear DS into multiple linear DS separated by switching manifolds, i.e., piecewise linear (PWL) systems. PWL models are popular in engineering and a frequent choice in mathematics for analyzing the topological properties of DS. However, hand-crafting such models is tedious and only possible for very low-dimensional scenarios, while inferring them from data usually gives rise to unnecessarily complex representations with very many linear subregions. Here we introduce Almost-Linear Recurrent Neural Networks (AL-RNNs), which automatically and robustly produce the most parsimonious PWL representations of DS from time series data, using as few PWL nonlinearities as possible. AL-RNNs can be efficiently trained with any SOTA algorithm for dynamical systems reconstruction (DSR), and naturally give rise to a symbolic encoding of the underlying DS that provably preserves important topological properties. We show that for the Lorenz and Rössler systems, AL-RNNs discover, in a purely data-driven way, the known topologically minimal PWL representations of the corresponding chaotic attractors. We further illustrate on two challenging empirical datasets that interpretable symbolic encodings of the dynamics can be achieved, tremendously facilitating mathematical and computational analysis of the underlying systems.

* 38th Conference on Neural Information Processing Systems (NeurIPS 2024)
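
To make the idea of "as few PWL nonlinearities as possible" concrete, here is a minimal sketch of an almost-linear recurrent update. It assumes a latent update of the form z_t = A z_{t-1} + W Φ(z_{t-1}) + h, where A is diagonal and Φ applies a ReLU to only the last P of the M latent units while leaving the rest linear. The function names, the pure-Python list representation, and the toy parameters are illustrative assumptions, not the authors' implementation.

```python
def al_rnn_step(z, A_diag, W, h, P):
    """One almost-linear update (illustrative sketch, not the reference code).

    z      : list of M latent states
    A_diag : list of M diagonal self-connection weights (assumed diagonal A)
    W      : M x M coupling matrix as a list of rows
    h      : list of M bias terms
    P      : number of units (the last P) that pass through a ReLU
    """
    M = len(z)
    # Only the last P units are rectified; the first M - P stay linear.
    phi = [z[i] if i < M - P else max(0.0, z[i]) for i in range(M)]
    return [
        A_diag[i] * z[i] + sum(W[i][j] * phi[j] for j in range(M)) + h[i]
        for i in range(M)
    ]


def symbol(z, P):
    """Sign pattern of the P rectified units.

    Each of the 2**P possible patterns labels one linear subregion,
    which is what a symbolic encoding of a trajectory would record.
    """
    return tuple(int(v > 0) for v in z[len(z) - P:])
```

With P = 0 the update is fully linear, and with P = M every unit is rectified, so P acts as the knob for how many linear subregions (at most 2**P) the model can use.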

Learning Interpretable Hierarchical Dynamical Systems Models from Time Series Data

Oct 07, 2024

Manuel Brenner, Elias Weber, Georgia Koppe, Daniel Durstewitz

Abstract: In science, we are often interested in obtaining a generative model of the underlying system dynamics from observed time series. While powerful methods for dynamical systems reconstruction (DSR) exist when data come from a single domain, how to best integrate data from multiple dynamical regimes and leverage it for generalization is still an open question. This becomes particularly important when individual time series are short, and group-level information may help to fill in for gaps in single-domain data. At the same time, averaging is not an option in DSR, as it will wipe out crucial dynamical properties (e.g., limit cycles in one domain vs. chaos in another). Hence, a framework is needed that makes it possible to efficiently harvest group-level (multi-domain) information while retaining all single-domain dynamical characteristics. Here we provide such a hierarchical approach and showcase it on popular DSR benchmarks, as well as on neuroscientific and medical time series. In addition to faithful reconstruction of all individual dynamical regimes, our unsupervised methodology discovers common low-dimensional feature spaces in which datasets with similar dynamics cluster. The features spanning these spaces were, moreover, highly interpretable dynamically and, surprisingly, often linearly related to the control parameters that govern the dynamics of the underlying system. Finally, we illustrate transfer learning and generalization to new parameter regimes.

* Preprint

Optimal Recurrent Network Topologies for Dynamical Systems Reconstruction

Jun 07, 2024

Christoph Jürgen Hemmer, Manuel Brenner, Florian Hess, Daniel Durstewitz

Abstract: In dynamical systems reconstruction (DSR) we seek to infer from time series measurements a generative model of the underlying dynamical process. This is a prime objective in any scientific discipline, where we are particularly interested in parsimonious models with a low parameter load. A common strategy here is parameter pruning, removing all parameters with small weights. However, here we find this strategy does not work for DSR, where even low-magnitude parameters can contribute considerably to the system dynamics. On the other hand, it is well known that many natural systems which generate complex dynamics, like the brain or ecological networks, have a sparse topology with comparatively few links. Inspired by this, we show that geometric pruning, where in contrast to magnitude-based pruning weights with a low contribution to an attractor's geometrical structure are removed, indeed manages to reduce parameter load substantially without significantly hampering DSR quality. We further find that the networks resulting from geometric pruning have a specific type of topology, and that this topology, and not the magnitude of weights, is what is most crucial to performance. We provide an algorithm that automatically generates such topologies which can be used as priors for generative modeling of dynamical systems by RNNs, and compare it to other well-studied topologies like small-world or scale-free networks.
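
For reference, the magnitude-based baseline that the abstract contrasts with geometric pruning can be sketched in a few lines. This is only the generic "remove small weights" strategy, written under the assumption that a fixed fraction of the smallest-magnitude entries is zeroed; the function name and the `fraction` parameter are hypothetical, and the geometric pruning criterion itself is not reproduced here.

```python
def magnitude_prune(W, fraction):
    """Zero out roughly the `fraction` of entries in W with smallest |w|.

    W is a weight matrix given as a list of rows. Entries tied with the
    threshold are also removed, so slightly more than `fraction` may be
    pruned when magnitudes repeat. Returns a new matrix; W is unchanged.
    """
    magnitudes = sorted(abs(w) for row in W for w in row)
    k = int(fraction * len(magnitudes))
    if k == 0:
        return [list(row) for row in W]
    threshold = magnitudes[k - 1]
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in W]
```

The abstract's point is that this criterion looks only at weight size, so it can delete connections that matter for the attractor geometry even though their magnitude is small.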

Out-of-Domain Generalization in Dynamical Systems Reconstruction

Feb 28, 2024

Niclas Göring, Florian Hess, Manuel Brenner, Zahra Monfared, Daniel Durstewitz

Abstract: In science we are interested in finding the governing equations, the dynamical rules, underlying empirical phenomena. While traditionally scientific models are derived through cycles of human insight and experimentation, recently deep learning (DL) techniques have been advanced to reconstruct dynamical systems (DS) directly from time series data. State-of-the-art dynamical systems reconstruction (DSR) methods show promise in capturing invariant and long-term properties of observed DS, but their ability to generalize to unobserved domains remains an open challenge. Yet, this is a crucial property we would expect from any viable scientific theory. In this work, we provide a formal framework that addresses generalization in DSR. We explain why and how out-of-domain (OOD) generalization (OODG) in DSR profoundly differs from OODG considered elsewhere in machine learning. We introduce mathematical notions based on topological concepts and ergodic theory to formalize the idea of learnability of a DSR model. We formally prove that black-box DL techniques, without adequate structural priors, generally will not be able to learn a generalizing DSR model. We also show this empirically, considering major classes of DSR algorithms proposed so far, and illustrate where and why they fail to generalize across the whole phase space. Our study provides the first comprehensive mathematical treatment of OODG in DSR, and gives a deeper conceptual understanding of where the fundamental problems in OODG lie and how they could possibly be addressed in practice.

Generalized Teacher Forcing for Learning Chaotic Dynamics

Jun 07, 2023

Florian Hess, Zahra Monfared, Manuel Brenner, Daniel Durstewitz

Abstract: Chaotic dynamical systems (DS) are ubiquitous in nature and society. Often we are interested in reconstructing such systems from observed time series for prediction or mechanistic insight, where by reconstruction we mean learning geometrical and invariant temporal properties of the system in question (like attractors). However, training reconstruction algorithms like recurrent neural networks (RNNs) on such systems by gradient-descent-based techniques faces severe challenges. This is mainly due to exploding gradients caused by the exponential divergence of trajectories in chaotic systems. Moreover, for (scientific) interpretability we wish to have reconstructions that are as low-dimensional as possible, preferably in a model which is mathematically tractable. Here we report that a surprisingly simple modification of teacher forcing leads to provably strictly all-time bounded gradients in training on chaotic systems, and, when paired with a simple architectural rearrangement of a tractable RNN design, piecewise-linear RNNs (PLRNNs), allows for faithful reconstruction in spaces of at most the dimensionality of the observed system. We show on several DS that with these amendments we can reconstruct DS better than current SOTA algorithms, in much lower dimensions. Performance differences were particularly compelling on real-world data with which most other methods severely struggled. This work thus led to a simple yet powerful DS reconstruction algorithm which is highly interpretable at the same time.

* To be published in the Proceedings of the 40th International Conference on Machine Learning (ICML 2023)
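
As a rough illustration of what a "simple modification of teacher forcing" can look like, the sketch below blends the freely generated latent state with a data-inferred state at every step using a weight alpha, so that alpha = 1 corresponds to full teacher forcing and alpha = 0 to a free-running model. The linear interpolation, the function names, and the toy `step` callable are assumptions made for illustration and should not be read as the paper's exact algorithm.

```python
def gtf_blend(z_model, z_data, alpha):
    """Interpolate between the model-generated and data-inferred states.

    alpha = 0.0 keeps the model state, alpha = 1.0 replaces it with the
    data-inferred state (classic teacher forcing).
    """
    return [(1.0 - alpha) * zm + alpha * zd for zm, zd in zip(z_model, z_data)]


def gtf_rollout(step, z0, data_states, alpha):
    """Roll a latent model forward while pulling it toward the data.

    step        : callable mapping a state (list of floats) to the next state
                  (hypothetical placeholder for the trained RNN update)
    z0          : initial latent state
    data_states : sequence of data-inferred states, one per time step
    alpha       : forcing strength in [0, 1]
    """
    z = list(z0)
    trajectory = []
    for z_hat in data_states:
        z = gtf_blend(step(z), z_hat, alpha)
        trajectory.append(z)
    return trajectory
```

In this toy setting, a map that doubles the state at every step stays bounded under alpha = 0.5 when the data states are zero, which gives an intuition for how partial forcing can counteract the trajectory divergence that causes exploding gradients.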

Multimodal Teacher Forcing for Reconstructing Nonlinear Dynamical Systems

Dec 15, 2022

Manuel Brenner, Georgia Koppe, Daniel Durstewitz

Abstract: Many, if not most, systems of interest in science are naturally described as nonlinear dynamical systems (DS). Empirically, we commonly access these systems through time series measurements, where often we have time series from different types of data modalities simultaneously. For instance, we may have event counts in addition to some continuous signal. While by now there are many powerful machine learning (ML) tools for integrating different data modalities into predictive models, this has rarely been approached so far from the perspective of uncovering the underlying, data-generating DS (aka DS reconstruction). Recently, sparse teacher forcing (TF) has been suggested as an efficient control-theoretic method for dealing with exploding loss gradients when training ML models on chaotic DS. Here we incorporate this idea into a novel recurrent neural network (RNN) training framework for DS reconstruction based on multimodal variational autoencoders (MVAE). The forcing signal for the RNN is generated by the MVAE which integrates different types of simultaneously given time series data into a joint latent code optimal for DS reconstruction. We show that this training method achieves significantly better reconstructions on multimodal datasets generated from chaotic DS benchmarks than various alternative methods.

* Published as a workshop paper for the AAAI 2023 Workshop MLmDS

Tractable Dendritic RNNs for Reconstructing Nonlinear Dynamical Systems

Jul 06, 2022

Manuel Brenner, Florian Hess, Jonas M. Mikhaeil, Leonard Bereska, Zahra Monfared, Po-Chen Kuo, Daniel Durstewitz

Abstract: In many scientific disciplines, we are interested in inferring the nonlinear dynamical system underlying a set of observed time series, a challenging task in the face of chaotic behavior and noise. Previous deep learning approaches toward this goal often suffered from a lack of interpretability and tractability. In particular, the high-dimensional latent spaces often required for a faithful embedding, even when the underlying dynamics lives on a lower-dimensional manifold, can hamper theoretical analysis. Motivated by the emerging principles of dendritic computation, we augment a dynamically interpretable and mathematically tractable piecewise-linear (PL) recurrent neural network (RNN) by a linear spline basis expansion. We show that this approach retains all the theoretically appealing properties of the simple PLRNN, yet boosts its capacity for approximating arbitrary nonlinear dynamical systems in comparatively low dimensions. We employ two frameworks for training the system, one combining back-propagation-through-time (BPTT) with teacher forcing, and another based on fast and scalable variational inference. We show that the dendritically expanded PLRNN achieves better reconstructions with fewer parameters and dimensions on various dynamical systems benchmarks and compares favorably to other methods, while retaining a tractable and interpretable structure.

* To be published in the Proceedings of the 39th International Conference on Machine Learning (ICML 2022)
