Shriram Chennakesavalu

Energy Rank Alignment: Using Preference Optimization to Search Chemical Space at Scale

May 21, 2024

Shriram Chennakesavalu, Frank Hu, Sebastian Ibarraran, Grant M. Rotskoff

Abstract: Searching through chemical space is an exceptionally challenging problem because the number of possible molecules grows combinatorially with the number of atoms. Large, autoregressive models trained on databases of chemical compounds have yielded powerful generators, but we still lack robust strategies for generating molecules with desired properties. This molecular search problem closely resembles the "alignment" problem for large language models, though for many chemical tasks we have a specific and easily evaluable reward function. Here, we introduce an algorithm called energy rank alignment (ERA) that leverages an explicit reward function to produce a gradient-based objective that we use to optimize autoregressive policies. We show theoretically that this algorithm is closely related to proximal policy optimization (PPO) and direct preference optimization (DPO), but has a minimizer that converges to an ideal Gibbs-Boltzmann distribution with the reward playing the role of an energy function. Furthermore, this algorithm is highly scalable, does not require reinforcement learning, and performs well relative to DPO when the number of preference observations per pairing is small. We deploy this approach to align molecular transformers to generate molecules with externally specified properties and find that it does so robustly, searching through diverse parts of chemical space. While our focus here is on chemical search, we also obtain excellent results on an AI-supervised task for LLM alignment, showing that the method is scalable and general.

Cooperative multi-agent reinforcement learning for high-dimensional nonequilibrium control

Nov 12, 2021

Shriram Chennakesavalu, Grant M. Rotskoff

Abstract: Experimental advances enabling high-resolution external control create new opportunities to produce materials with exotic properties. In this work, we investigate how a multi-agent reinforcement learning approach can be used to design external control protocols for self-assembly. We find that a fully decentralized approach performs remarkably well even with a "coarse" level of external control. More importantly, we see that a partially decentralized approach, where we include information about the local environment, allows us to better control our system towards some target distribution. We explain this by analyzing our approach as a partially observed Markov decision process. With a partially decentralized approach, the agent is able to act more presciently, both by preventing the formation of undesirable structures and by better stabilizing target structures, as compared to a fully decentralized approach.

* To appear in the Fourth Workshop on Machine Learning and the Physical Sciences (NeurIPS 2021)
