Title: Can LLMs Generate Diverse Molecules? Towards Alignment with Structural Diversity

Oct 04, 2024

Hyosoon Jang, Yunhui Jang, Jaehyung Kim, Sungsoo Ahn

Abstract: Recent advancements in large language models (LLMs) have demonstrated impressive performance in generating molecular structures as drug candidates, which offers significant potential to accelerate drug discovery. However, current LLMs overlook a critical requirement for drug discovery: proposing a diverse set of molecules. This diversity is essential for improving the chances of finding a viable drug, as it provides alternative molecules that may succeed where others fail in wet-lab or clinical validation. Despite this need, LLMs often output structurally similar molecules for a given prompt. While decoding schemes like beam search may enhance textual diversity, textual diversity often does not translate into molecular structural diversity. In response, we propose a new method for fine-tuning molecular generative LLMs to autoregressively generate a set of structurally diverse molecules, where each molecule is generated by conditioning on the previously generated molecules. Our approach consists of two stages: (1) supervised fine-tuning to adapt LLMs to autoregressively generate molecules in a sequence and (2) reinforcement learning to maximize structural diversity among the generated molecules. Our experiments show that (1) our fine-tuning approach enables the LLMs to discover diverse molecules better than existing decoding schemes and (2) our fine-tuned model outperforms other representative LLMs in generating diverse molecules, including those fine-tuned on chemical domains.
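The abstract does not state how structural diversity is measured. As a minimal sketch, assuming molecules are represented as sets of fingerprint bits and compared by Tanimoto similarity (a common choice in cheminformatics, not confirmed as the authors' metric), the snippet below scores a generated set by its mean pairwise distance and scores each molecule by how far it is from those generated before it. The function names and the toy fingerprints are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)


def internal_diversity(fps: list) -> float:
    """Mean pairwise Tanimoto distance (1 - similarity) over a set."""
    pairs = list(combinations(fps, 2))
    if not pairs:
        return 0.0  # fewer than two molecules: no diversity to measure
    return sum(1.0 - tanimoto(a, b) for a, b in pairs) / len(pairs)


def sequential_novelty(fps: list) -> list:
    """Score each molecule by 1 - max similarity to the molecules before it.

    This mirrors the idea of generating each molecule conditioned on the
    previous ones; it is an assumed reward shape, not the paper's reward.
    """
    scores = []
    for i, fp in enumerate(fps):
        previous = fps[:i]
        if not previous:
            scores.append(1.0)  # the first molecule has nothing to repeat
        else:
            scores.append(1.0 - max(tanimoto(fp, p) for p in previous))
    return scores


# Toy fingerprints: hypothetical bit indices, not computed from real molecules.
near_duplicates = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 5}),
    frozenset({1, 2, 3, 6}),
]
varied = [
    frozenset({1, 2, 3, 4}),
    frozenset({5, 6, 7, 8}),
    frozenset({1, 9, 10, 11}),
]

if __name__ == "__main__":
    print("near-duplicate set diversity:", internal_diversity(near_duplicates))
    print("varied set diversity:", internal_diversity(varied))
    print("per-step novelty:", sequential_novelty(near_duplicates))
```

In practice the fingerprints would come from a cheminformatics toolkit applied to parsed molecules rather than to the raw text, which is why two strings that differ textually can still score as structurally similar.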
