Mohamed Amine Ketata

Lift Your Molecules: Molecular Graph Generation in Latent Euclidean Space

Jun 15, 2024

Mohamed Amine Ketata, Nicholas Gao, Johanna Sommer, Tom Wollschläger, Stephan Günnemann

Abstract: We introduce a new framework for molecular graph generation with 3D molecular generative models. Our Synthetic Coordinate Embedding (SyCo) framework maps molecular graphs to Euclidean point clouds via synthetic conformer coordinates and learns the inverse map using an E(n)-Equivariant Graph Neural Network (EGNN). The induced point cloud-structured latent space is well suited to applying existing 3D molecular generative models. This approach simplifies the graph generation problem, without relying on molecular fragments or autoregressive decoding, into a point cloud generation problem followed by node and edge classification tasks. Further, we propose a novel similarity-constrained optimization scheme for 3D diffusion models based on inpainting and guidance. As a concrete implementation of our framework, we develop EDM-SyCo based on the E(3) Equivariant Diffusion Model (EDM). EDM-SyCo achieves state-of-the-art performance in distribution learning of molecular graphs, outperforming the best non-autoregressive methods by more than 30% on ZINC250K and 16% on the large-scale GuacaMol dataset while improving conditional generation by up to 3.9 times.

Uncertainty Estimation for Molecules: Desiderata and Methods

Jun 20, 2023

Tom Wollschläger, Nicholas Gao, Bertrand Charpentier, Mohamed Amine Ketata, Stephan Günnemann

Abstract: Graph Neural Networks (GNNs) are promising surrogates for quantum mechanical calculations as they establish unprecedented low errors on collections of molecular dynamics (MD) trajectories. Thanks to their fast inference times, they promise to accelerate computational chemistry applications. Unfortunately, despite low in-distribution (ID) errors, such GNNs might be horribly wrong for out-of-distribution (OOD) samples. Uncertainty estimation (UE) may aid in such situations by communicating the model's certainty about its prediction. Here, we take a closer look at the problem and identify six key desiderata for UE in molecular force fields: three 'physics-informed' and three 'application-focused' ones. To overview the field, we survey existing UE methods and analyze how they fit the set desiderata. From our analysis, we conclude that none of the previous works satisfies all criteria. To fill this gap, we propose Localized Neural Kernel (LNK), a Gaussian Process (GP)-based extension to existing GNNs that satisfies the desiderata. In our extensive experimental evaluation, we test four different UE methods with three different backbones and two datasets. In out-of-equilibrium detection, we find LNK yielding up to 2.5 and 2.1 times lower errors in terms of AUC-ROC score than dropout or evidential regression-based methods while maintaining high predictive performance.

* Published as a conference paper at ICML 2023
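
The abstract above evaluates out-of-equilibrium detection with an AUC-ROC score. As a rough illustration only, and not the paper's evaluation code, the sketch below computes AUC-ROC from per-sample uncertainty values using the pairwise-ranking definition. The function name `auc_roc` and the uncertainty numbers are hypothetical and chosen purely for demonstration.

```python
def auc_roc(scores_id, scores_ood):
    """AUC-ROC for separating OOD from ID samples by an uncertainty score.

    Equals the probability that a randomly chosen OOD sample receives a
    higher uncertainty than a randomly chosen ID sample (ties count 0.5).
    """
    wins = 0.0
    for s_ood in scores_ood:
        for s_id in scores_id:
            if s_ood > s_id:
                wins += 1.0
            elif s_ood == s_id:
                wins += 0.5
    return wins / (len(scores_id) * len(scores_ood))


# Hypothetical uncertainty values (assumed, not taken from the paper).
id_unc = [0.10, 0.20, 0.15, 0.30]   # in-distribution samples
ood_unc = [0.25, 0.60, 0.80]        # out-of-distribution samples

score = auc_roc(id_unc, ood_unc)
print(f"AUC-ROC: {score:.3f}")
```

A score of 1.0 would mean every OOD sample is assigned a higher uncertainty than every ID sample, while 0.5 corresponds to chance-level separation.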

DiffDock-PP: Rigid Protein-Protein Docking with Diffusion Models

Apr 08, 2023

Mohamed Amine Ketata, Cedrik Laue, Ruslan Mammadov, Hannes Stärk, Menghua Wu, Gabriele Corso, Céline Marquet, Regina Barzilay, Tommi S. Jaakkola

Abstract: Understanding how proteins structurally interact is crucial to modern biology, with applications in drug discovery and protein design. Recent machine learning methods have formulated protein-small molecule docking as a generative problem with significant performance boosts over both traditional and deep learning baselines. In this work, we propose a similar approach for rigid protein-protein docking: DiffDock-PP is a diffusion generative model that learns to translate and rotate unbound protein structures into their bound conformations. We achieve state-of-the-art performance on DIPS with a median C-RMSD of 4.85, outperforming all considered baselines. Additionally, DiffDock-PP is faster than all search-based methods and generates reliable confidence estimates for its predictions. Our code is publicly available at https://github.com/ketatam/DiffDock-PP

* ICLR Machine Learning for Drug Discovery (MLDD) Workshop 2023
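
Rigid docking, as described in the abstract above, moves a structure only by rotation and translation, so its internal geometry is unchanged. The minimal sketch below illustrates that idea on a toy point set and computes a plain RMSD between two poses. It is not the DiffDock-PP implementation: the helper names (`rotate_z`, `translate`, `rmsd`) and the coordinates are assumptions, and the reported C-RMSD may involve an additional superposition step that is omitted here.

```python
import math


def rotate_z(points, angle):
    """Rotate 3D points about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


def translate(points, shift):
    """Shift every 3D point by the same vector."""
    dx, dy, dz = shift
    return [(x + dx, y + dy, z + dz) for x, y, z in points]


def rmsd(a, b):
    """Root-mean-square deviation between two equally sized point lists."""
    if len(a) != len(b):
        raise ValueError("point lists must have the same length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(a, b)
    )
    return math.sqrt(total / len(a))


# Toy "unbound" structure (hypothetical coordinates).
unbound = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# A rigid motion: rotate by 90 degrees about z, then translate.
posed = translate(rotate_z(unbound, math.pi / 2), (1.0, 2.0, 3.0))

# Internal distances are preserved by the rigid motion.
print(math.dist(unbound[0], unbound[1]), math.dist(posed[0], posed[1]))

# RMSD between the original pose and a purely translated copy.
print(rmsd(unbound, translate(unbound, (3.0, 4.0, 0.0))))
```

Because only six degrees of freedom (three rotational, three translational) change, a rigid pose can be described far more compactly than a fully flexible one.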
