Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Gabe Gomes

Spectral Manifold Harmonization for Graph Imbalanced Regression

Jul 01, 2025

Brenda Nogueira, Gabe Gomes, Meng Jiang, Nitesh V. Chawla, Nuno Moniz

Abstract:Graph-structured data is ubiquitous in scientific domains, where models often face imbalanced learning settings. In imbalanced regression, domain preferences focus on specific target value ranges representing the most scientifically valuable cases; we observe a significant lack of research. In this paper, we present Spectral Manifold Harmonization (SMH), a novel approach for addressing this imbalanced regression challenge on graph-structured data by generating synthetic graph samples that preserve topological properties while focusing on often underrepresented target distribution regions. Conventional methods fail in this context because they either ignore graph topology in case generation or do not target specific domain ranges, resulting in models biased toward average target values. Experimental results demonstrate the potential of SMH on chemistry and drug discovery benchmark datasets, showing consistent improvements in predictive performance for target domain ranges.

Via

Access Paper or Ask Questions

Advancing Molecular Machine (Learned) Representations with Stereoelectronics-Infused Molecular Graphs

Aug 08, 2024

Daniil A. Boiko, Thiago Reschützegger, Benjamin Sanchez-Lengeling, Samuel M. Blau, Gabe Gomes

Abstract:Molecular representation is a foundational element in our understanding of the physical world. Its importance ranges from the fundamentals of chemical reactions to the design of new therapies and materials. Previous molecular machine learning models have employed strings, fingerprints, global features, and simple molecular graphs that are inherently information-sparse representations. However, as the complexity of prediction tasks increases, the molecular representation needs to encode higher fidelity information. This work introduces a novel approach to infusing quantum-chemical-rich information into molecular graphs via stereoelectronic effects. We show that the explicit addition of stereoelectronic interactions significantly improves the performance of molecular machine learning models. Furthermore, stereoelectronics-infused representations can be learned and deployed with a tailored double graph neural network workflow, enabling its application to any downstream molecular machine learning task. Finally, we show that the learned representations allow for facile stereoelectronic evaluation of previously intractable systems, such as entire proteins, opening new avenues of molecular design.

* 23 pages, 6 figures

Via

Access Paper or Ask Questions

Emergent autonomous scientific research capabilities of large language models

Apr 11, 2023

Daniil A. Boiko, Robert MacKnight, Gabe Gomes

Figure 1 for Emergent autonomous scientific research capabilities of large language models

Figure 2 for Emergent autonomous scientific research capabilities of large language models

Figure 3 for Emergent autonomous scientific research capabilities of large language models

Figure 4 for Emergent autonomous scientific research capabilities of large language models

Abstract:Transformer-based large language models are rapidly advancing in the field of machine learning research, with applications spanning natural language, biology, chemistry, and computer programming. Extreme scaling and reinforcement learning from human feedback have significantly improved the quality of generated text, enabling these models to perform various tasks and reason about their choices. In this paper, we present an Intelligent Agent system that combines multiple large language models for autonomous design, planning, and execution of scientific experiments. We showcase the Agent's scientific research capabilities with three distinct examples, with the most complex being the successful performance of catalyzed cross-coupling reactions. Finally, we discuss the safety implications of such systems and propose measures to prevent their misuse.

* Version 1, April 11, 2023. 48 pages

Via

Access Paper or Ask Questions