Philipp Grohs

Accurate Ab-initio Neural-network Solutions to Large-Scale Electronic Structure Problems

Apr 08, 2025

Michael Scherbela, Nicholas Gao, Philipp Grohs, Stephan Günnemann

Abstract: We present finite-range embeddings (FiRE), a novel wave function ansatz for accurate large-scale ab-initio electronic structure calculations. Compared to contemporary neural-network wave functions, FiRE reduces the asymptotic complexity of neural-network variational Monte Carlo (NN-VMC) by $\sim n_\text{el}$, the number of electrons. By restricting electron-electron interactions within the neural network, FiRE accelerates all key operations -- sampling, pseudopotentials, and Laplacian computations -- resulting in a real-world $10\times$ acceleration in now-feasible 180-electron calculations. We validate our method's accuracy on various challenging systems, including biochemical compounds, conjugated hydrocarbons, and organometallic compounds. On these systems, FiRE's energies are consistently within chemical accuracy of the most reliable data, including experiments, even in cases where high-accuracy methods such as CCSD(T), AFQMC, or contemporary NN-VMC fall short. With these improvements in both runtime and accuracy, FiRE represents a new 'gold-standard' method for fast and accurate large-scale ab-initio calculations, potentially enabling new computational studies in fields like quantum chemistry, solid-state physics, and material design.

* 13 pages, 5 figures + 9 pages supplementary information
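
For readers unfamiliar with variational Monte Carlo, the basic loop that NN-VMC builds on can be sketched on a toy problem. The example below is not FiRE or any method from the papers listed here: it assumes a one-dimensional harmonic oscillator with a Gaussian trial wave function exp(-alpha x^2) in place of a neural network, and the names `local_energy` and `vmc_energy` are hypothetical. It samples positions from |psi|^2 with Metropolis steps and averages the local energy.

```python
import math
import random


def local_energy(x, alpha):
    # Local energy (H psi)/psi for psi(x) = exp(-alpha x^2) and
    # H = -1/2 d^2/dx^2 + 1/2 x^2 (toy system, assumed for illustration).
    return alpha + x * x * (0.5 - 2.0 * alpha * alpha)


def vmc_energy(alpha, n_steps=20000, step_size=1.0, seed=0):
    # Estimate the variational energy as the mean local energy over
    # Metropolis samples drawn from |psi|^2 = exp(-2 alpha x^2).
    rng = random.Random(seed)
    x = 0.0
    total = 0.0
    for _ in range(n_steps):
        proposal = x + rng.uniform(-step_size, step_size)
        # log of |psi(proposal)|^2 / |psi(x)|^2
        log_ratio = -2.0 * alpha * (proposal * proposal - x * x)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
        total += local_energy(x, alpha)
    return total / n_steps


if __name__ == "__main__":
    for alpha in (0.3, 0.5, 1.0):
        print(alpha, vmc_energy(alpha))
```

In this toy setting the exact ground state corresponds to alpha = 0.5, where the local energy is constant; other values of alpha give a higher estimated energy, which is the quantity a variational optimizer would minimize.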

The sampling complexity of learning invertible residual neural networks

Nov 08, 2024

Yuanyuan Li, Philipp Grohs, Philipp Petersen

Abstract: In recent work it has been shown that determining a feedforward ReLU neural network to within high uniform accuracy from point samples suffers from the curse of dimensionality in terms of the number of samples needed. As a consequence, feedforward ReLU neural networks are of limited use for applications where guaranteed high uniform accuracy is required. We consider the question of whether the sampling complexity can be improved by restricting the specific neural network architecture. To this end, we investigate invertible residual neural networks, which are foundational architectures in deep learning and are widely employed in models that power modern generative methods. Our main result shows that the residual neural network architecture and invertibility do not help overcome the complexity barriers encountered with simpler feedforward architectures. Specifically, we demonstrate that the computational complexity of approximating invertible residual neural networks from point samples in the uniform norm suffers from the curse of dimensionality. Similar results are established for invertible convolutional residual neural networks.

Transferable Neural Wavefunctions for Solids

May 13, 2024

Leon Gerard, Michael Scherbela, Halvard Sutterud, Matthew Foulkes, Philipp Grohs

Abstract: Deep-Learning-based Variational Monte Carlo (DL-VMC) has recently emerged as a highly accurate approach for finding approximate solutions to the many-electron Schrödinger equation. Despite its favorable scaling with the number of electrons, $\mathcal{O}(n_\text{el}^{4})$, the practical value of DL-VMC is limited by the high cost of optimizing the neural network weights for every system studied. To mitigate this problem, recent research has proposed optimizing a single neural network across multiple systems, reducing the cost per system. Here we extend this approach to solids, where similar but distinct calculations using different geometries, boundary conditions, and supercell sizes are often required. We show how to optimize a single ansatz across all of these variations, reducing the required number of optimization steps by an order of magnitude. Furthermore, we exploit the transfer capabilities of a pre-trained network. We successfully transfer a network, pre-trained on 2x2x2 supercells of LiH, to 3x3x3 supercells. This reduces the number of optimization steps required to simulate the large system by a factor of 50 compared to previous work.

* 15 pages, 3 figures + supplementary information

Sampling Complexity of Deep Approximation Spaces

Dec 20, 2023

Ahmed Abdeljawad, Philipp Grohs

Abstract: While it is well known that neural networks enjoy excellent approximation capabilities, it remains a big challenge to compute such approximations from point samples. Based on tools from information-based complexity, recent work by Grohs and Voigtlaender [Journal of the FoCM (2023)] developed a rigorous framework for assessing this so-called "theory-to-practice gap". More precisely, in that work it is shown that there exist functions that can be approximated by neural networks with ReLU activation function at an arbitrary rate while requiring an exponentially growing (in the input dimension) number of samples for their numerical computation. The present study extends these findings by showing analogous results for the ReQU activation function.

Variational Monte Carlo on a Budget -- Fine-tuning pre-trained Neural Wavefunctions

Jul 15, 2023

Michael Scherbela, Leon Gerard, Philipp Grohs

Abstract: Obtaining accurate solutions to the Schrödinger equation is the key challenge in computational quantum chemistry. Deep-learning-based Variational Monte Carlo (DL-VMC) has recently outperformed conventional approaches in terms of accuracy, but only at large computational cost. Whereas in many domains models are trained once and subsequently applied for inference, accurate DL-VMC so far requires a full optimization for every new problem instance, consuming thousands of GPU hours even for small molecules. We instead propose a DL-VMC model which has been pre-trained using self-supervised wavefunction optimization on a large and chemically diverse set of molecules. Applying this model to new molecules without any optimization yields wavefunctions and absolute energies that outperform established methods such as CCSD(T)-2Z. To obtain accurate relative energies, only a few fine-tuning steps of this base model are required. We accomplish this with a fully end-to-end machine-learned model, consisting of an improved geometry embedding architecture and an existing SE(3)-equivariant model to represent molecular orbitals. Combining this architecture with continuous sampling of geometries, we improve zero-shot accuracy by two orders of magnitude compared to the state of the art. We extensively evaluate the accuracy, scalability and limitations of our base model on a wide variety of test systems.

FakET: Simulating Cryo-Electron Tomograms with Neural Style Transfer

Apr 04, 2023

Pavol Harar, Lukas Herrmann, Philipp Grohs, David Haselbach

Abstract: Particle localization and classification constitute two of the most fundamental problems in computational microscopy. In recent years, deep-learning-based approaches have been introduced for these tasks with great success. A key shortcoming of these supervised learning methods is their need for large training data sets, typically generated from particle models in conjunction with complex numerical forward models simulating the physics of transmission electron microscopes. Computer implementations of such forward models are computationally extremely demanding and limit the scope of their applicability. In this paper we propose a simple method for simulating the forward operator of an electron microscope based on additive noise and Neural Style Transfer techniques. We evaluate the method on localization and classification tasks using one of the established state-of-the-art architectures showing performance on par with the benchmark. In contrast to previous approaches, our method accelerates the data generation process by a factor of 750 while using 33 times less memory and scales well to typical transmission electron microscope detector sizes. It utilizes GPU acceleration and parallel processing. It can be used as a stand-alone method to adapt a training data set or as a data augmentation technique. The source code is available at https://gitlab.com/deepet/faket.

* 1 table, 16 figures

Towards a Foundation Model for Neural Network Wavefunctions

Mar 17, 2023

Michael Scherbela, Leon Gerard, Philipp Grohs

Abstract: Deep neural networks have become a highly accurate and powerful wavefunction ansatz in combination with variational Monte Carlo methods for solving the electronic Schrödinger equation. However, despite their success and favorable scaling, these methods are still computationally too costly for wide adoption. A significant obstacle is the requirement to optimize the wavefunction from scratch for each new system, thus requiring long optimization. In this work, we propose a novel neural network ansatz, which effectively maps uncorrelated, computationally cheap Hartree-Fock orbitals to correlated, high-accuracy neural network orbitals. This ansatz is inherently capable of learning a single wavefunction across multiple compounds and geometries, as we demonstrate by successfully transferring a wavefunction model pre-trained on smaller fragments to larger compounds. Furthermore, we provide ample experimental evidence to support the idea that extensive pre-training of such a generalized wavefunction model across different compounds and geometries could lead to a foundation wavefunction model. Such a model could yield high-accuracy ab-initio energies using only minimal computational effort for fine-tuning and evaluation of observables.

Gold-standard solutions to the Schrödinger equation using deep learning: How much physics do we need?

May 31, 2022

Leon Gerard, Michael Scherbela, Philipp Marquetand, Philipp Grohs

Abstract: Finding accurate solutions to the Schrödinger equation is the key unsolved challenge of computational chemistry. Given its importance for the development of new chemical compounds, decades of research have been dedicated to this problem, but due to the large dimensionality even the best available methods do not yet reach the desired accuracy. Recently the combination of deep learning with Monte Carlo methods has emerged as a promising way to obtain highly accurate energies and moderate scaling of computational cost. In this paper we significantly contribute towards this goal by introducing a novel deep-learning architecture that achieves 40-70% lower energy error at 8x lower computational cost compared to previous approaches. Using our method we establish a new benchmark by calculating the most accurate variational ground state energies ever published for a number of different atoms and molecules. We systematically break down and measure our improvements, focusing in particular on the effect of increasing physical prior knowledge. We surprisingly find that increasing the prior knowledge given to the architecture can actually decrease accuracy.

* 10 pages + appendix, 7 figures; V2: minor corrections to citations and reference energies for F, Ne, H2O

Training ReLU networks to high uniform accuracy is intractable

May 26, 2022

Julius Berner, Philipp Grohs, Felix Voigtlaender

Abstract: Statistical learning theory provides bounds on the necessary number of training samples needed to reach a prescribed accuracy in a learning problem formulated over a given target class. This accuracy is typically measured in terms of a generalization error, that is, an expected value of a given loss function. However, for several applications -- for example in a security-critical context or for problems in the computational sciences -- accuracy in this sense is not sufficient. In such cases, one would like to have guarantees for high accuracy on every input value, that is, with respect to the uniform norm. In this paper we precisely quantify the number of training samples needed for any conceivable training algorithm to guarantee a given uniform accuracy on any learning problem formulated over target classes containing (or consisting of) ReLU neural networks of a prescribed architecture. We prove that, under very general assumptions, the minimal number of training samples for this task scales exponentially both in the depth and the input dimension of the network architecture. As a corollary we conclude that the training of ReLU neural networks to high uniform accuracy is intractable. In a security-critical context this points to the fact that deep-learning-based systems are prone to being fooled by a possible adversary. We corroborate our theoretical findings by numerical results.
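
The intuition behind this kind of lower bound can be illustrated with a small sketch. It is not the construction or the bound from the paper: it only assumes a one-dimensional ReLU "bump" (the hypothetical helper `hidden_bump`) that vanishes at every sample point yet reaches a chosen height between two of them, plus a count of how many points a regular grid needs in d dimensions (`grid_points`, also hypothetical).

```python
def relu(x):
    return max(0.0, x)


def hidden_bump(x, left, right, height=1.0):
    # Three ReLU units forming a hat that is zero outside (left, right)
    # and equals `height` at the midpoint.
    mid = 0.5 * (left + right)
    half_width = mid - left
    hat = relu(x - left) - 2.0 * relu(x - mid) + relu(x - right)
    return height * hat / half_width


def grid_points(dim, points_per_axis):
    # Number of samples in a regular grid with `points_per_axis`
    # points along each of `dim` axes.
    return points_per_axis ** dim


if __name__ == "__main__":
    samples = [i / 4 for i in range(5)]  # 0, 0.25, 0.5, 0.75, 1
    print([hidden_bump(s, 0.25, 0.5) for s in samples])
    print(hidden_bump(0.375, 0.25, 0.5))
    print([grid_points(d, 10) for d in (1, 2, 5, 10)])
```

On the sample points the bump is indistinguishable from the zero function, so no algorithm that sees only those samples can guarantee a uniform error below the bump height; shrinking the gaps between samples in every direction requires a number of points that grows exponentially with the dimension.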

Integral representations of shallow neural network with Rectified Power Unit activation function

Dec 20, 2021

Ahmed Abdeljawad, Philipp Grohs

Abstract: In this effort, we derive a formula for the integral representation of a shallow neural network with the Rectified Power Unit activation function. Our first result addresses the representation capability of shallow RePU networks in the univariate case. The multidimensional result in this paper characterizes the set of functions that can be represented with bounded norm and possibly unbounded width.

* 22 pages. This is the first version. Some revisions are expected in the near future. arXiv admin note: text overlap with arXiv:1910.01635 by other authors
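
As a brief illustration of the objects involved, the Rectified Power Unit is commonly defined as max(0, x)^p, with p = 2 giving the ReQU activation mentioned in another entry on this page. The sketch below assumes that definition and a univariate shallow network of the form sum_i c_i * RePU(w_i x + b_i); the function names `repu` and `shallow_repu_network` are hypothetical and not taken from the paper.

```python
def repu(x, power=2):
    # Rectified Power Unit: max(0, x) raised to an integer power.
    return max(0.0, x) ** power


def shallow_repu_network(x, inner_weights, biases, outer_weights, power=2):
    # One hidden layer: sum_i c_i * RePU(w_i * x + b_i).
    return sum(
        c * repu(w * x + b, power)
        for w, b, c in zip(inner_weights, biases, outer_weights)
    )


if __name__ == "__main__":
    # Two units reproduce x^2 exactly: RePU(x) + RePU(-x) with power 2.
    for x in (-1.5, 0.0, 3.0):
        print(x, shallow_repu_network(x, [1.0, -1.0], [0.0, 0.0], [1.0, 1.0]))
```

The usage example shows a simple exact representation: with power 2, two hidden units are enough to represent x^2 on the whole real line.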
