Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yuanqi Du

Assessing generative modeling approaches for free energy estimates in condensed matter

Dec 30, 2025

Maximilian Schebek, Jiajun He, Emil Hoffmann, Yuanqi Du, Frank Noé, Jutta Rogal

Abstract:The accurate estimation of free energy differences between two states is a long-standing challenge in molecular simulations. Traditional approaches generally rely on sampling multiple intermediate states to ensure sufficient overlap in phase space and are, consequently, computationally expensive. Several generative-model-based methods have recently addressed this challenge by learning a direct bridge between distributions, bypassing the need for intermediate states. However, it remains unclear which approaches provide the best trade-off between efficiency, accuracy, and scalability. In this work, we systematically review these methods and benchmark selected approaches with a focus on condensed-matter systems. In particular, we investigate the performance of discrete and continuous normalizing flows in the context of targeted free energy perturbation as well as FEAT (Free energy Estimators with Adaptive Transport) together with the escorted Jarzynski equality, using coarse-grained monatomic ice and Lennard-Jones solids as benchmark systems. We evaluate accuracy, data efficiency, computational cost, and scalability with system size. Our results provide a quantitative framework for selecting effective free energy estimation strategies in condensed-phase systems.

Via

Access Paper or Ask Questions

Trust Region Constrained Measure Transport in Path Space for Stochastic Optimal Control and Inference

Aug 17, 2025

Denis Blessing, Julius Berner, Lorenz Richter, Carles Domingo-Enrich, Yuanqi Du, Arash Vahdat, Gerhard Neumann

Figure 1 for Trust Region Constrained Measure Transport in Path Space for Stochastic Optimal Control and Inference

Figure 2 for Trust Region Constrained Measure Transport in Path Space for Stochastic Optimal Control and Inference

Figure 3 for Trust Region Constrained Measure Transport in Path Space for Stochastic Optimal Control and Inference

Figure 4 for Trust Region Constrained Measure Transport in Path Space for Stochastic Optimal Control and Inference

Abstract:Solving stochastic optimal control problems with quadratic control costs can be viewed as approximating a target path space measure, e.g. via gradient-based optimization. In practice, however, this optimization is challenging in particular if the target measure differs substantially from the prior. In this work, we therefore approach the problem by iteratively solving constrained problems incorporating trust regions that aim for approaching the target measure gradually in a systematic way. It turns out that this trust region based strategy can be understood as a geometric annealing from the prior to the target measure, where, however, the incorporated trust regions lead to a principled and educated way of choosing the time steps in the annealing path. We demonstrate in multiple optimal control applications that our novel method can improve performance significantly, including tasks in diffusion-based sampling, transition path sampling, and fine-tuning of diffusion models.

Via

Access Paper or Ask Questions

RNE: a plug-and-play framework for diffusion density estimation and inference-time control

Jun 06, 2025

Jiajun He, José Miguel Hernández-Lobato, Yuanqi Du, Francisco Vargas

Figure 1 for RNE: a plug-and-play framework for diffusion density estimation and inference-time control

Figure 2 for RNE: a plug-and-play framework for diffusion density estimation and inference-time control

Figure 3 for RNE: a plug-and-play framework for diffusion density estimation and inference-time control

Figure 4 for RNE: a plug-and-play framework for diffusion density estimation and inference-time control

Abstract:In this paper, we introduce the Radon-Nikodym Estimator (RNE), a flexible, plug-and-play framework for diffusion inference-time density estimation and control, based on the concept of the density ratio between path distributions. RNE connects and unifies a variety of existing density estimation and inference-time control methods under a single and intuitive perspective, stemming from basic variational inference and probabilistic principles therefore offering both theoretical clarity and practical versatility. Experiments demonstrate that RNE achieves promising performances in diffusion density estimation and inference-time control tasks, including annealing, composition of diffusion models, and reward-tilting.

* 39 pages; 10 figures

Via

Access Paper or Ask Questions

Building-Block Aware Generative Modeling for 3D Crystals of Metal Organic Frameworks

May 13, 2025

Chenru Duan, Aditya Nandy, Sizhan Liu, Yuanqi Du, Liu He, Yi Qu, Haojun Jia, Jin-Hu Dou

Figure 1 for Building-Block Aware Generative Modeling for 3D Crystals of Metal Organic Frameworks

Figure 2 for Building-Block Aware Generative Modeling for 3D Crystals of Metal Organic Frameworks

Figure 3 for Building-Block Aware Generative Modeling for 3D Crystals of Metal Organic Frameworks

Figure 4 for Building-Block Aware Generative Modeling for 3D Crystals of Metal Organic Frameworks

Abstract:Metal-organic frameworks (MOFs) marry inorganic nodes, organic edges, and topological nets into programmable porous crystals, yet their astronomical design space defies brute-force synthesis. Generative modeling holds ultimate promise, but existing models either recycle known building blocks or are restricted to small unit cells. We introduce Building-Block-Aware MOF Diffusion (BBA MOF Diffusion), an SE(3)-equivariant diffusion model that learns 3D all-atom representations of individual building blocks, encoding crystallographic topological nets explicitly. Trained on the CoRE-MOF database, BBA MOF Diffusion readily samples MOFs with unit cells containing 1000 atoms with great geometric validity, novelty, and diversity mirroring experimental databases. Its native building-block representation produces unprecedented metal nodes and organic edges, expanding accessible chemical space by orders of magnitude. One high-scoring [Zn(1,4-TDC)(EtOH)2] MOF predicted by the model was synthesized, where powder X-ray diffraction, thermogravimetric analysis, and N2 sorption confirm its structural fidelity. BBA-Diff thus furnishes a practical pathway to synthesizable and high-performing MOFs.

Via

Access Paper or Ask Questions

LLM-Augmented Chemical Synthesis and Design Decision Programs

May 11, 2025

Haorui Wang, Jeff Guo, Lingkai Kong, Rampi Ramprasad, Philippe Schwaller, Yuanqi Du, Chao Zhang

Abstract:Retrosynthesis, the process of breaking down a target molecule into simpler precursors through a series of valid reactions, stands at the core of organic chemistry and drug development. Although recent machine learning (ML) research has advanced single-step retrosynthetic modeling and subsequent route searches, these solutions remain restricted by the extensive combinatorial space of possible pathways. Concurrently, large language models (LLMs) have exhibited remarkable chemical knowledge, hinting at their potential to tackle complex decision-making tasks in chemistry. In this work, we explore whether LLMs can successfully navigate the highly constrained, multi-step retrosynthesis planning problem. We introduce an efficient scheme for encoding reaction pathways and present a new route-level search strategy, moving beyond the conventional step-by-step reactant prediction. Through comprehensive evaluations, we show that our LLM-augmented approach excels at retrosynthesis planning and extends naturally to the broader challenge of synthesizable molecular design.

Via

Access Paper or Ask Questions

FEAT: Free energy Estimators with Adaptive Transport

Apr 15, 2025

Jiajun He, Yuanqi Du, Francisco Vargas, Yuanqing Wang, Carla P. Gomes, José Miguel Hernández-Lobato, Eric Vanden-Eijnden

Figure 1 for FEAT: Free energy Estimators with Adaptive Transport

Figure 2 for FEAT: Free energy Estimators with Adaptive Transport

Figure 3 for FEAT: Free energy Estimators with Adaptive Transport

Figure 4 for FEAT: Free energy Estimators with Adaptive Transport

Abstract:We present Free energy Estimators with Adaptive Transport (FEAT), a novel framework for free energy estimation -- a critical challenge across scientific domains. FEAT leverages learned transports implemented via stochastic interpolants and provides consistent, minimum-variance estimators based on escorted Jarzynski equality and controlled Crooks theorem, alongside variational upper and lower bounds on free energy differences. Unifying equilibrium and non-equilibrium methods under a single theoretical framework, FEAT establishes a principled foundation for neural free energy calculations. Experimental validation on toy examples, molecular simulations, and quantum field theory demonstrates improvements over existing learning-based methods.

* 29 pages, 2 tables, 3 figures

Via

Access Paper or Ask Questions

Large Language Models Are Innate Crystal Structure Generators

Feb 28, 2025

Jingru Gan, Peichen Zhong, Yuanqi Du, Yanqiao Zhu, Chenru Duan, Haorui Wang, Carla P. Gomes, Kristin A. Persson, Daniel Schwalbe-Koda, Wei Wang

Figure 1 for Large Language Models Are Innate Crystal Structure Generators

Figure 2 for Large Language Models Are Innate Crystal Structure Generators

Figure 3 for Large Language Models Are Innate Crystal Structure Generators

Figure 4 for Large Language Models Are Innate Crystal Structure Generators

Abstract:Crystal structure generation is fundamental to materials discovery, enabling the prediction of novel materials with desired properties. While existing approaches leverage Large Language Models (LLMs) through extensive fine-tuning on materials databases, we show that pre-trained LLMs can inherently generate stable crystal structures without additional training. Our novel framework MatLLMSearch integrates pre-trained LLMs with evolutionary search algorithms, achieving a 78.38% metastable rate validated by machine learning interatomic potentials and 31.7% DFT-verified stability via quantum mechanical calculations, outperforming specialized models such as CrystalTextLLM. Beyond crystal structure generation, we further demonstrate that our framework can be readily adapted to diverse materials design tasks, including crystal structure prediction and multi-objective optimization of properties such as deformation energy and bulk modulus, all without fine-tuning. These results establish pre-trained LLMs as versatile and effective tools for materials discovery, opening up new venues for crystal structure generation with reduced computational overhead and broader accessibility.

* Preprint, 18 pages

Via

Access Paper or Ask Questions

No Trick, No Treat: Pursuits and Challenges Towards Simulation-free Training of Neural Samplers

Feb 10, 2025

Jiajun He, Yuanqi Du, Francisco Vargas, Dinghuai Zhang, Shreyas Padhy, RuiKang OuYang, Carla Gomes, José Miguel Hernández-Lobato

Figure 1 for No Trick, No Treat: Pursuits and Challenges Towards Simulation-free Training of Neural Samplers

Figure 2 for No Trick, No Treat: Pursuits and Challenges Towards Simulation-free Training of Neural Samplers

Figure 3 for No Trick, No Treat: Pursuits and Challenges Towards Simulation-free Training of Neural Samplers

Figure 4 for No Trick, No Treat: Pursuits and Challenges Towards Simulation-free Training of Neural Samplers

Abstract:We consider the sampling problem, where the aim is to draw samples from a distribution whose density is known only up to a normalization constant. Recent breakthroughs in generative modeling to approximate a high-dimensional data distribution have sparked significant interest in developing neural network-based methods for this challenging problem. However, neural samplers typically incur heavy computational overhead due to simulating trajectories during training. This motivates the pursuit of simulation-free training procedures of neural samplers. In this work, we propose an elegant modification to previous methods, which allows simulation-free training with the help of a time-dependent normalizing flow. However, it ultimately suffers from severe mode collapse. On closer inspection, we find that nearly all successful neural samplers rely on Langevin preconditioning to avoid mode collapsing. We systematically analyze several popular methods with various objective functions and demonstrate that, in the absence of Langevin preconditioning, most of them fail to adequately cover even a simple target. Finally, we draw attention to a strong baseline by combining the state-of-the-art MCMC method, Parallel Tempering (PT), with an additional generative model to shed light on future explorations of neural samplers.

* 21 pages, 5 figures, 6 tables

Via

Access Paper or Ask Questions

Large Language Model is Secretly a Protein Sequence Optimizer

Jan 16, 2025

Yinkai Wang, Jiaxing He, Yuanqi Du, Xiaohui Chen, Jianan Canal Li, Li-Ping Liu, Xiaolin Xu, Soha Hassoun

Figure 1 for Large Language Model is Secretly a Protein Sequence Optimizer

Figure 2 for Large Language Model is Secretly a Protein Sequence Optimizer

Figure 3 for Large Language Model is Secretly a Protein Sequence Optimizer

Figure 4 for Large Language Model is Secretly a Protein Sequence Optimizer

Abstract:We consider the protein sequence engineering problem, which aims to find protein sequences with high fitness levels, starting from a given wild-type sequence. Directed evolution has been a dominating paradigm in this field which has an iterative process to generate variants and select via experimental feedback. We demonstrate large language models (LLMs), despite being trained on massive texts, are secretly protein sequence optimizers. With a directed evolutionary method, LLM can perform protein engineering through Pareto and experiment-budget constrained optimization, demonstrating success on both synthetic and experimental fitness landscapes.

* Preprint

Via

Access Paper or Ask Questions

AlphaNet: Scaling Up Local Frame-based Atomistic Foundation Model

Jan 13, 2025

Bangchen Yin, Jiaao Wang, Weitao Du, Pengbo Wang, Penghua Ying, Haojun Jia, Zisheng Zhang, Yuanqi Du, Carla P. Gomes, Chenru Duan(+2 more)

Figure 1 for AlphaNet: Scaling Up Local Frame-based Atomistic Foundation Model

Figure 2 for AlphaNet: Scaling Up Local Frame-based Atomistic Foundation Model

Figure 3 for AlphaNet: Scaling Up Local Frame-based Atomistic Foundation Model

Figure 4 for AlphaNet: Scaling Up Local Frame-based Atomistic Foundation Model

Abstract:We present AlphaNet, a local frame-based equivariant model designed to achieve both accurate and efficient simulations for atomistic systems. Recently, machine learning force fields (MLFFs) have gained prominence in molecular dynamics simulations due to their advantageous efficiency-accuracy balance compared to classical force fields and quantum mechanical calculations, alongside their transferability across various systems. Despite the advancements in improving model accuracy, the efficiency and scalability of MLFFs remain significant obstacles in practical applications. AlphaNet enhances computational efficiency and accuracy by leveraging the local geometric structures of atomic environments through the construction of equivariant local frames and learnable frame transitions. We substantiate the efficacy of AlphaNet across diverse datasets, including defected graphene, formate decomposition, zeolites, and surface reactions. AlphaNet consistently surpasses well-established models, such as NequIP and DeepPot, in terms of both energy and force prediction accuracy. Notably, AlphaNet offers one of the best trade-offs between computational efficiency and accuracy among existing models. Moreover, AlphaNet exhibits scalability across a broad spectrum of system and dataset sizes, affirming its versatility.

* 14 pages, 5 figures

Via

Access Paper or Ask Questions