Oliver T. Unke

How simple can you go? An off-the-shelf transformer approach to molecular dynamics

Mar 05, 2025

Max Eissler, Tim Korjakow, Stefan Ganscha, Oliver T. Unke, Klaus-Robert Müller, Stefan Gugler

Abstract: Most current neural networks for molecular dynamics (MD) include physical inductive biases, resulting in specialized and complex architectures. This is in contrast to most other machine learning domains, where specialist approaches are increasingly replaced by general-purpose architectures trained on vast datasets. In line with this trend, several recent studies have questioned the necessity of architectural features commonly found in MD models, such as built-in rotational equivariance or energy conservation. In this work, we contribute to the ongoing discussion by evaluating the performance of an MD model with as few specialized architectural features as possible. We present a recipe for MD using an Edge Transformer, an "off-the-shelf" transformer architecture that has been minimally modified for the MD domain, termed MD-ET. Our model implements neither built-in equivariance nor energy conservation. We use a simple supervised pre-training scheme on $\sim$30 million molecular structures from the QCML database. Using this "off-the-shelf" approach, we show state-of-the-art results on several benchmarks after fine-tuning for a small number of steps. Additionally, we examine the effects of being only approximately equivariant and energy conserving for MD simulations, proposing a novel method for distinguishing the errors resulting from non-equivariance from other sources of inaccuracies like numerical rounding errors. While our model exhibits runaway energy increases on larger structures, we show approximately energy-conserving NVE simulations for a range of small structures.

* 21 pages, code at https://github.com/mx-e/simple-md

Enhancing Diffusion Models Efficiency by Disentangling Total-Variance and Signal-to-Noise Ratio

Feb 12, 2025

Khaled Kahouli, Winfried Ripken, Stefan Gugler, Oliver T. Unke, Klaus-Robert Müller, Shinichi Nakajima

Abstract: The long sampling time of diffusion models remains a significant bottleneck, which can be mitigated by reducing the number of diffusion time steps. However, the quality of samples with fewer steps is highly dependent on the noise schedule, i.e., the specific manner in which noise is introduced and the signal is reduced at each step. Although prior work has improved upon the original variance-preserving and variance-exploding schedules, these approaches $\textit{passively}$ adjust the total variance, without direct control over it. In this work, we propose a novel total-variance/signal-to-noise-ratio disentangled (TV/SNR) framework, where TV and SNR can be controlled independently. Our approach reveals that different existing schedules, where the TV explodes exponentially, can be $\textit{improved}$ by setting a constant TV schedule while preserving the same SNR schedule. Furthermore, generalizing the SNR schedule of optimal transport flow matching significantly improves the performance in molecular structure generation, achieving few-step generation of stable molecules. A similar tendency is observed in image generation, where our approach with a uniform diffusion time grid performs comparably to the highly tailored EDM sampler.
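
As a minimal illustration of what controlling TV and SNR independently can mean, assume a forward process of the common form $x_t = a_t x_0 + b_t \epsilon$ with unit-variance data and noise, and assume TV is defined as $a_t^2 + b_t^2$ and SNR as $a_t^2 / b_t^2$. These definitions and the function name `coefficients_from_tv_snr` are assumptions made for this sketch and are not taken from the paper. Under them, any pair (TV, SNR) fixes the signal and noise scales:

```python
import math


def coefficients_from_tv_snr(tv: float, snr: float) -> tuple[float, float]:
    """Return (a, b) such that a**2 + b**2 == tv and a**2 / b**2 == snr.

    Assumed (hypothetical) setting: x_t = a * x_0 + b * eps with
    unit-variance x_0 and eps, so TV = a^2 + b^2 and SNR = a^2 / b^2.
    """
    if tv <= 0 or snr <= 0:
        raise ValueError("tv and snr must be positive")
    noise_var = tv / (1.0 + snr)          # b^2
    signal_var = tv * snr / (1.0 + snr)   # a^2
    return math.sqrt(signal_var), math.sqrt(noise_var)


# Same SNR, two different total variances: only the overall scale changes.
a_const, b_const = coefficients_from_tv_snr(tv=1.0, snr=4.0)
a_large, b_large = coefficients_from_tv_snr(tv=100.0, snr=4.0)
```

In this sketch, changing `tv` rescales both coefficients by the same factor and leaves their ratio, and hence the SNR, untouched.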

Euclidean Fast Attention: Machine Learning Global Atomic Representations at Linear Cost

Dec 11, 2024

J. Thorben Frank, Stefan Chmiela, Klaus-Robert Müller, Oliver T. Unke

Abstract: Long-range correlations are essential across numerous machine learning tasks, especially for data embedded in Euclidean space, where the relative positions and orientations of distant components are often critical for accurate predictions. Self-attention offers a compelling mechanism for capturing these global effects, but its quadratic complexity presents a significant practical limitation. This problem is particularly pronounced in computational chemistry, where the stringent efficiency requirements of machine learning force fields (MLFFs) often preclude accurately modeling long-range interactions. To address this, we introduce Euclidean fast attention (EFA), a linear-scaling attention-like mechanism designed for Euclidean data, which can be easily incorporated into existing model architectures. At the core of EFA are novel Euclidean rotary positional encodings (ERoPE), which enable efficient encoding of spatial information while respecting essential physical symmetries. We empirically demonstrate that EFA effectively captures diverse long-range effects, enabling EFA-equipped MLFFs to describe challenging chemical interactions for which conventional MLFFs yield incorrect results.

Complete and Efficient Covariants for 3D Point Configurations with Application to Learning Molecular Quantum Properties

Sep 04, 2024

Hartmut Maennel, Oliver T. Unke, Klaus-Robert Müller

Abstract: When modeling physical properties of molecules with machine learning, it is desirable to incorporate $SO(3)$-covariance. While such models based on low body order features are not complete, we formulate and prove general completeness properties for higher order methods, and show that $6k-5$ of these features are enough for up to $k$ atoms. We also find that the Clebsch–Gordan operations commonly used in these methods can be replaced by matrix multiplications without sacrificing completeness, lowering the scaling from $O(l^6)$ to $O(l^3)$ in the degree of the features. We apply this to quantum chemistry, but the proposed methods are generally applicable for problems involving 3D point configurations.

E3x: $\mathrm{E}(3)$-Equivariant Deep Learning Made Easy

Jan 17, 2024

Oliver T. Unke, Hartmut Maennel

Abstract: This work introduces E3x, a software package for building neural networks that are equivariant with respect to the Euclidean group $\mathrm{E}(3)$, consisting of translations, rotations, and reflections of three-dimensional space. Compared to ordinary neural networks, $\mathrm{E}(3)$-equivariant models promise benefits whenever input and/or output data are quantities associated with three-dimensional objects. This is because the numeric values of such quantities (e.g. positions) typically depend on the chosen coordinate system. Under transformations of the reference frame, the values change predictably, but the underlying rules can be difficult to learn for ordinary machine learning models. With built-in $\mathrm{E}(3)$-equivariance, neural networks are guaranteed to satisfy the relevant transformation rules exactly, resulting in superior data efficiency and accuracy. The code for E3x is available from https://github.com/google-research/e3x; detailed documentation and usage examples can be found at https://e3x.readthedocs.io.
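
The transformation rule described above can be checked numerically on a toy example. The sketch below does not use the E3x API; `toy_vector_features` is a hypothetical hand-written function, chosen only because it is equivariant by construction. It maps atom positions to one vector per atom and verifies that transforming the input with a random orthogonal matrix (rotation or reflection) plus a translation gives the same result as transforming the output.

```python
import numpy as np


def toy_vector_features(positions: np.ndarray) -> np.ndarray:
    """Hypothetical equivariant map: f_i = sum_j exp(-|r_j - r_i|^2) (r_j - r_i)."""
    diff = positions[None, :, :] - positions[:, None, :]  # diff[i, j] = r_j - r_i
    weights = np.exp(-np.sum(diff**2, axis=-1))           # invariant pair weights
    return np.sum(weights[..., None] * diff, axis=1)      # one 3-vector per atom


def random_orthogonal(rng: np.random.Generator) -> np.ndarray:
    """Random 3x3 orthogonal matrix (rotation or reflection) from a QR decomposition."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    return q


rng = np.random.default_rng(0)
positions = rng.normal(size=(5, 3))  # 5 atoms in 3D
q = random_orthogonal(rng)
shift = rng.normal(size=3)

# Transform the input first, then apply the map ...
transformed_then_mapped = toy_vector_features(positions @ q.T + shift)
# ... versus apply the map first, then transform the output vectors.
mapped_then_transformed = toy_vector_features(positions) @ q.T

equivariance_error = np.max(np.abs(transformed_then_mapped - mapped_then_transformed))
```

The output vectors are built from position differences, so the translation drops out and only the orthogonal part acts on them; `equivariance_error` should be at the level of floating-point rounding.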

From Peptides to Nanostructures: A Euclidean Transformer for Fast and Stable Machine Learned Force Fields

Sep 21, 2023

J. Thorben Frank, Oliver T. Unke, Klaus-Robert Müller, Stefan Chmiela

Abstract: Recent years have seen vast progress in the development of machine learned force fields (MLFFs) based on ab-initio reference calculations. Despite achieving low test errors, the suitability of MLFFs in molecular dynamics (MD) simulations is being increasingly scrutinized due to concerns about instability. Our findings suggest a potential connection between MD simulation stability and the presence of equivariant representations in MLFFs, but their computational cost can limit practical advantages they would otherwise bring. To address this, we propose a transformer architecture called SO3krates that combines sparse equivariant representations (Euclidean variables) with a self-attention mechanism that can separate invariant and equivariant information, eliminating the need for expensive tensor products. SO3krates achieves a unique combination of accuracy, stability, and speed that enables insightful analysis of quantum properties of matter on unprecedented time and system size scales. To showcase this capability, we generate stable MD trajectories for flexible peptides and supra-molecular structures with hundreds of atoms. Furthermore, we investigate the PES topology for medium-sized chainlike molecules (e.g., small peptides) by exploring thousands of minima. Remarkably, SO3krates demonstrates the ability to strike a balance between the conflicting demands of stability and the emergence of new minimum-energy conformations beyond the training data, which is crucial for realistic exploration tasks in the field of biochemistry.

So3krates -- Self-attention for higher-order geometric interactions on arbitrary length-scales

May 28, 2022

J. Thorben Frank, Oliver T. Unke, Klaus-Robert Müller

Abstract: The application of machine learning methods in quantum chemistry has enabled the study of numerous chemical phenomena, which are computationally intractable with traditional ab-initio methods. However, some quantum mechanical properties of molecules and materials depend on non-local electronic effects, which are often neglected due to the difficulty of modeling them efficiently. This work proposes a modified attention mechanism adapted to the underlying physics, which allows the relevant non-local effects to be recovered. Namely, we introduce spherical harmonic coordinates (SPHCs) to reflect higher-order geometric information for each atom in a molecule, enabling a non-local formulation of attention in the SPHC space. Our proposed model So3krates, a self-attention based message passing neural network, uncouples geometric information from atomic features, making them independently amenable to attention mechanisms. We show that in contrast to other published methods, So3krates is able to describe non-local quantum mechanical effects over arbitrary length scales. Further, we find evidence that the inclusion of higher-order geometric correlations increases data efficiency and improves generalization. So3krates matches or exceeds state-of-the-art performance on popular benchmarks, notably, requiring a significantly lower number of parameters (0.25–0.4x) while at the same time giving a substantial speedup (6–14x for training and 2–11x for inference) compared to other models.

Accurate Machine Learned Quantum-Mechanical Force Fields for Biomolecular Simulations

May 17, 2022

Oliver T. Unke, Martin Stöhr, Stefan Ganscha, Thomas Unterthiner, Hartmut Maennel, Sergii Kashubin, Daniel Ahlin, Michael Gastegger, Leonardo Medrano Sandonas, Alexandre Tkatchenko (+1 more)

Abstract: Molecular dynamics (MD) simulations allow atomistic insights into chemical and biological processes. Accurate MD simulations require computationally demanding quantum-mechanical calculations, being practically limited to short timescales and few atoms. For larger systems, efficient, but much less reliable empirical force fields are used. Recently, machine learned force fields (MLFFs) emerged as an alternative means to execute MD simulations, offering accuracy similar to that of ab initio methods at orders-of-magnitude speedup. Until now, MLFFs mainly capture short-range interactions in small molecules or periodic materials, due to the increased complexity of constructing models and obtaining reliable reference data for large molecules, where long-ranged many-body effects become important. This work proposes a general approach to constructing accurate MLFFs for large-scale molecular simulations (GEMS) by training on "bottom-up" and "top-down" molecular fragments of varying size, from which the relevant physicochemical interactions can be learned. GEMS is applied to study the dynamics of alanine-based peptides and the 46-residue protein crambin in aqueous solution, allowing nanosecond-scale MD simulations of >25k atoms at essentially ab initio quality. Our findings suggest that structural motifs in peptides and proteins are more flexible than previously thought, indicating that simulations at ab initio accuracy might be necessary to understand dynamic biomolecular processes such as protein (mis)folding, drug-protein binding, or allosteric regulation.

Automatic Identification of Chemical Moieties

Mar 30, 2022

Jonas Lederer, Michael Gastegger, Kristof T. Schütt, Michael Kampffmeyer, Klaus-Robert Müller, Oliver T. Unke

Abstract: In recent years, the prediction of quantum mechanical observables with machine learning methods has become increasingly popular. Message-passing neural networks (MPNNs) solve this task by constructing atomic representations, from which the properties of interest are predicted. Here, we introduce a method to automatically identify chemical moieties (molecular building blocks) from such representations, enabling a variety of applications beyond property prediction, which otherwise rely on expert knowledge. The required representation can either be provided by a pretrained MPNN, or learned from scratch using only structural information. Beyond the data-driven design of molecular fingerprints, the versatility of our approach is demonstrated by enabling the selection of representative entries in chemical databases, the automatic construction of coarse-grained force fields, as well as the identification of reaction coordinates.

SE(3)-equivariant prediction of molecular wavefunctions and electronic densities

Jun 04, 2021

Oliver T. Unke, Mihail Bogojeski, Michael Gastegger, Mario Geiger, Tess Smidt, Klaus-Robert Müller

Abstract: Machine learning has enabled the prediction of quantum chemical properties with high accuracy and efficiency, making it possible to bypass computationally costly ab initio calculations. Instead of training on a fixed set of properties, more recent approaches attempt to learn the electronic wavefunction (or density) as a central quantity of atomistic systems, from which all other observables can be derived. This is complicated by the fact that wavefunctions transform non-trivially under molecular rotations, which makes them a challenging prediction target. To solve this issue, we introduce general SE(3)-equivariant operations and building blocks for constructing deep learning architectures for geometric point cloud data and apply them to reconstruct wavefunctions of atomistic systems with unprecedented accuracy. Our model reduces prediction errors by up to two orders of magnitude compared to the previous state-of-the-art and makes it possible to derive properties such as energies and forces directly from the wavefunction in an end-to-end manner. We demonstrate the potential of our approach in a transfer learning application, where a model trained on low accuracy reference wavefunctions implicitly learns to correct for electronic many-body interactions from observables computed at a higher level of theory. Such machine-learned wavefunction surrogates pave the way towards novel semi-empirical methods, offering resolution at an electronic level while drastically decreasing computational cost. While we focus on physics applications in this contribution, the proposed equivariant framework for deep learning on point clouds is also promising beyond physics, for example in computer vision or graphics.
