Romain Lacombe

Stanford University, Plume Labs

Accelerating the Generation of Molecular Conformations with Progressive Distillation of Equivariant Latent Diffusion Models

Apr 21, 2024

Romain Lacombe, Neal Vaidya

Abstract: Recent advances in fast sampling methods for diffusion models have demonstrated significant potential to accelerate generation on image modalities. We apply these methods to 3-dimensional molecular conformations by building on the recently introduced GeoLDM equivariant latent diffusion model (Xu et al., 2023). We evaluate trade-offs between speed gains and quality loss, as measured by the structural stability of the generated molecular conformations. We introduce Equivariant Latent Progressive Distillation, a fast sampling algorithm that preserves geometric equivariance and accelerates generation from latent diffusion models. Our experiments demonstrate up to 7.5x gains in sampling speed with limited degradation in molecular stability. These results suggest this accelerated sampling method has strong potential for high-throughput in silico screening of molecular conformations in computational biochemistry, drug discovery, and life sciences applications.

* Accepted at the Generative and Experimental Perspectives for Biomolecular Design Workshop at the 12th International Conference on Learning Representations, 2024

AdsorbRL: Deep Multi-Objective Reinforcement Learning for Inverse Catalysts Design

Dec 04, 2023

Romain Lacombe, Lucas Hendren, Khalid El-Awady

Abstract: A central challenge of the clean energy transition is the development of catalysts for low-emissions technologies. Recent advances in Machine Learning for quantum chemistry drastically accelerate the computation of catalytic activity descriptors such as adsorption energies. Here we introduce AdsorbRL, a Deep Reinforcement Learning agent aiming to identify potential catalysts given a multi-objective binding energy target, trained using offline learning on the Open Catalyst 2020 and Materials Project data sets. We experiment with Deep Q-Network agents to traverse the space of all ~160,000 possible unary, binary and ternary compounds of 55 chemical elements, with very sparse rewards based on adsorption energy known for only between 2,000 and 3,000 catalysts per adsorbate. To constrain the action space, we introduce Random Edge Traversal and train a single-objective DQN agent on the known states subgraph, which we find strengthens target binding energy by an average of 4.1 eV. We extend this approach to multi-objective, goal-conditioned learning, and train a DQN agent to identify materials with the highest (respectively lowest) adsorption energies for multiple simultaneous target adsorbates. We experiment with Objective Sub-Sampling, a novel training scheme aimed at encouraging exploration in the multi-objective setup, and demonstrate simultaneous adsorption energy improvement across all target adsorbates, by an average of 0.8 eV. Overall, our results suggest strong potential for Deep Reinforcement Learning applied to the inverse catalysts design problem.

* 37th Conference on Neural Information Processing Systems (NeurIPS 2023), AI for Accelerated Materials Design Workshop

ClimateX: Do LLMs Accurately Assess Human Expert Confidence in Climate Statements?

Nov 28, 2023

Romain Lacombe, Kerrie Wu, Eddie Dilworth

Abstract: Evaluating the accuracy of outputs generated by Large Language Models (LLMs) is especially important in the climate science and policy domain. We introduce the Expert Confidence in Climate Statements (ClimateX) dataset, a novel, curated, expert-labeled dataset consisting of 8094 climate statements collected from the latest Intergovernmental Panel on Climate Change (IPCC) reports, labeled with their associated confidence levels. Using this dataset, we show that recent LLMs can classify human expert confidence in climate-related statements, especially in a few-shot learning setting, but with limited (up to 47%) accuracy. Overall, models exhibit consistent and significant over-confidence on low and medium confidence statements. We highlight implications of our results for climate communication, LLM evaluation strategies, and the use of LLMs in information retrieval systems.

* Tackling Climate Change with Machine Learning workshop at NeurIPS 2023

Extracting Molecular Properties from Natural Language with Multimodal Contrastive Learning

Jul 22, 2023

Romain Lacombe, Andrew Gaut, Jeff He, David Lüdeke, Kateryna Pistunova

Abstract: Deep learning in computational biochemistry has traditionally focused on neural representations of molecular graphs; however, recent advances in language models highlight how much scientific knowledge is encoded in text. To bridge these two modalities, we investigate how molecular property information can be transferred from natural language to graph representations. We study property prediction performance gains after using contrastive learning to align neural graph representations with representations of textual descriptions of their characteristics. We implement neural relevance scoring strategies to improve text retrieval, introduce a novel chemically-valid molecular graph augmentation strategy inspired by organic reactions, and demonstrate improved performance on downstream MoleculeNet property classification tasks. We achieve a +4.26% AUROC gain versus models pre-trained on the graph modality alone, and a +1.54% gain compared to the recently proposed MoMu model (Su et al. 2022), which is trained contrastively on molecular graphs and text.

* 2023 ICML Workshop on Computational Biology
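
The abstract above mentions contrastive learning to align graph and text representations but does not state the loss. As a rough illustration only, the sketch below uses a symmetric InfoNCE-style objective, which is a common choice for this kind of alignment and is an assumption here, not the authors' confirmed method. The function name `info_nce`, the `temperature` value, and the toy embeddings are all hypothetical.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale a non-zero vector to unit length (zero vectors are not handled)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def info_nce(graph_embs, text_embs, temperature=0.1):
    """Symmetric InfoNCE-style loss over paired embeddings.

    graph_embs[i] and text_embs[i] are assumed to describe the same molecule;
    every other pairing in the batch is treated as a negative.
    """
    g = [normalize(v) for v in graph_embs]
    t = [normalize(v) for v in text_embs]
    n = len(g)
    # Cosine similarities scaled by the temperature.
    logits = [[dot(g[i], t[j]) / temperature for j in range(n)] for i in range(n)]

    def mean_cross_entropy(rows):
        # Cross-entropy where the correct "class" for row i is column i.
        loss = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            loss += log_z - row[i]
        return loss / n

    columns = [list(c) for c in zip(*logits)]
    # Average the graph-to-text and text-to-graph directions.
    return 0.5 * (mean_cross_entropy(logits) + mean_cross_entropy(columns))

# Toy check: matched pairs should score a lower loss than swapped pairs.
graphs = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = info_nce(graphs, [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = info_nce(graphs, [[0.0, 1.0], [1.0, 0.0]])
print(aligned_loss < swapped_loss)
```

In this toy setup the loss is near zero when each graph embedding points the same way as its own text embedding, and large when the pairings are swapped, which is the behaviour an alignment objective of this kind is meant to reward.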

Improving extreme weather events detection with light-weight neural networks

Mar 31, 2023

Romain Lacombe, Hannah Grossman, Lucas Hendren, David Lüdeke

Abstract: To advance automated detection of extreme weather events, which are increasing in frequency and intensity with climate change, we explore modifications to a novel light-weight Context Guided convolutional neural network architecture trained for semantic segmentation of tropical cyclones and atmospheric rivers in climate data. Our primary focus is on tropical cyclones, the most destructive weather events, for which current models show limited performance. We investigate feature engineering, data augmentation, learning rate modifications, alternative loss functions, and architectural changes. In contrast to previous approaches optimizing for intersection over union, we specifically seek to improve recall to penalize under-counting and prioritize identification of tropical cyclones. We report success through the use of weighted loss functions to counter class imbalance for these rare events. We conclude with directions for future research on extreme weather events detection, a crucial task for prediction, mitigation, and equitable adaptation to the impacts of climate change.

* Published as a workshop paper at 'Tackling Climate Change with Machine Learning', ICLR 2023
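
To make the recall versus intersection-over-union distinction in the abstract above concrete, here is a minimal sketch on toy pixel labels. It is not the authors' code: the function names, the class weights, and the example masks are hypothetical, and the weighted cross-entropy shown is one generic form of a class-weighted loss, assumed here for illustration.

```python
import math

def recall_and_iou(pred, truth):
    """Recall and IoU for one class, given flat lists of 0/1 pixel labels."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    recall = tp / (tp + fn) if tp + fn else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return recall, iou

def weighted_cross_entropy(probs, truth, class_weights):
    """Class-weighted mean of -log p(true class), normalized by total weight.

    probs[i][c] is the predicted probability of class c at pixel i.
    """
    total, weight_sum = 0.0, 0.0
    for p, t in zip(probs, truth):
        w = class_weights[t]
        total += -w * math.log(max(p[t], 1e-12))  # clamp to avoid log(0)
        weight_sum += w
    return total / weight_sum

# Toy masks: 1 marks a tropical-cyclone pixel (hypothetical data).
truth = [0, 0, 0, 0, 1, 1, 1, 1]
under = [0, 0, 0, 0, 1, 0, 0, 0]   # misses most cyclone pixels
over = [0, 0, 1, 1, 1, 1, 1, 1]    # finds all of them, with false alarms
print(recall_and_iou(under, truth))
print(recall_and_iou(over, truth))

# Up-weighting the rare class makes a poorly predicted cyclone pixel cost more.
probs = [[0.9, 0.1], [0.6, 0.4]]
labels = [0, 1]
print(weighted_cross_entropy(probs, labels, [1.0, 1.0]))
print(weighted_cross_entropy(probs, labels, [1.0, 10.0]))
```

In the toy masks, the under-counting prediction has low recall, while the over-counting prediction reaches full recall at the cost of a lower IoU than a perfect mask would give. That trade-off is why optimizing for recall favours detecting every cyclone pixel even when some false alarms result.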
