Abstract: In this study, we generate and maintain a database of 10 million virtual lipids using METiS's in-house de novo lipid generation algorithms and lipid virtual screening techniques. These virtual lipids serve as a corpus for pre-training, lipid representation learning, and downstream knowledge transfer, culminating in state-of-the-art LNP property prediction performance. We propose LipidBERT, a BERT-like model pre-trained with the Masked Language Model (MLM) objective and various secondary tasks. Additionally, we compare the performance of embeddings generated by LipidBERT and PhatGPT, our GPT-like lipid generation model, on downstream tasks. The proposed LipidBERT model is bilingual in that it operates in two languages: the language of ionizable lipid pre-training, using in-house dry-lab lipid structures, and the language of LNP fine-tuning, using in-house LNP wet-lab data. This dual capability positions LipidBERT as a key AI-based filter for future screening tasks, including new versions of METiS de novo lipid libraries and, more importantly, candidates for in vivo testing of organ-targeting LNPs. To the best of our knowledge, this is the first successful demonstration of a language model pre-trained on virtual lipids and of its effectiveness in downstream tasks using wet-lab data. This work showcases the effective use of METiS's in-house de novo lipid library as well as the power of dry-wet lab integration.
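To make the MLM objective mentioned above concrete, the following is a minimal sketch of the input-masking step. It assumes lipids are written as SMILES-like strings tokenized per character, which the abstract does not state. The function name `mask_tokens`, the masking rate, and the example string are hypothetical. It also simplifies BERT's usual scheme by always substituting `[MASK]` for a selected token.

```python
import random

MASK = "[MASK]"  # placeholder token; the real vocabulary is not given in the abstract


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Return (masked_tokens, labels) for a masked-language-model objective.

    labels[i] holds the original token where a mask was applied and None
    elsewhere, so a model is only scored on the masked positions.
    """
    rng = random.Random(seed)
    masked = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked[i] = MASK
            labels[i] = tok
    return masked, labels


# Hypothetical input: character-level tokens of a toy SMILES-like string.
tokens = list("CCN(CC)CCO")
masked, labels = mask_tokens(tokens, mask_prob=0.3, seed=1)
```

During pre-training, the model would receive `masked` and be trained to recover the non-`None` entries of `labels`.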
Abstract: The concept of path homotopy has received wide attention in the field of path planning in recent years. However, to the best of our knowledge, no existing method quickly and efficiently determines the congruence between paths and can be used directly to guide the path planning process. In this article, a topological encoder based on convex dissection is developed for a two-dimensional bounded Euclidean space; it can efficiently encode all homotopy path classes between any two points. The optimal path planning task then consists of two steps: (i) search for the homotopy path class that may contain the optimal path, and (ii) obtain the shortest path in this class. Furthermore, an optimal path planning algorithm called RWCDT (Random Walk based on Convex Division Topology) is proposed. RWCDT uses a constrained random walk search algorithm to explore different homotopy path classes and applies an iterative compression algorithm to obtain the shortest path in each class. A series of experiments shows that the performance of the proposed algorithm is comparable with that of state-of-the-art path planning algorithms. These results verify the practical significance of the developed homotopy path class encoder for path planning.
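The abstract does not describe how the encoder is built. As a rough illustration only, one simple way to label a path over a convex partition of the free space is to record the sequence of interior cell edges it crosses and to cancel immediate back-and-forth crossings of the same edge. The function `reduce_crossings` and the edge labels below are hypothetical. For two paths with the same endpoints, equal reduced sequences imply that one can be deformed into the other. The converse can fail where several cells meet at a vertex in free space, so this sketch is a sufficient check and not the encoder proposed in the paper.

```python
def reduce_crossings(edges):
    """Cancel adjacent repeated edge crossings in a path's crossing sequence.

    Crossing the same edge twice in a row means the path entered a convex
    cell and left it through the same edge, a detour that can be pulled
    back without passing over an obstacle.
    """
    stack = []
    for edge in edges:
        if stack and stack[-1] == edge:
            stack.pop()  # back-and-forth crossing cancels out
        else:
            stack.append(edge)
    return stack


def same_reduced_code(path_a, path_b):
    """Sufficient (not necessary) test that two paths share a homotopy class."""
    return reduce_crossings(path_a) == reduce_crossings(path_b)


# Hypothetical edge labels for two paths between the same pair of points.
direct = ["e1", "e3"]
detour = ["e1", "e2", "e2", "e3"]
```

Here `same_reduced_code(direct, detour)` holds because the detour through `e2` cancels, leaving the same crossing sequence as the direct path.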