Abstract:In the expansive realm of drug discovery, with approximately 15,000 known drugs and only around 4,200 approved, the combinatorial nature of the chemical space presents a formidable challenge. While Artificial Intelligence (AI) has emerged as a powerful ally, traditional AI frameworks face significant hurdles. This manuscript introduces CardiGraphormer, a groundbreaking approach that synergizes self-supervised learning (SSL), Graph Neural Networks (GNNs), and Cardinality Preserving Attention to revolutionize drug discovery. CardiGraphormer, a novel combination of Graphormer and Cardinality Preserving Attention, leverages SSL to learn potent molecular representations and employs GNNs to extract molecular fingerprints, enhancing predictive performance and interpretability while reducing computation time. It excels in handling complex data like molecular structures and performs tasks associated with nodes, pairs of nodes, subgraphs, or entire graph structures. CardiGraphormer's potential applications in drug discovery and drug interactions are vast, from identifying new drug targets to predicting drug-to-drug interactions and enabling novel drug discovery. This innovative approach provides an AI-enhanced methodology in drug development, utilizing SSL combined with GNNs to overcome existing limitations and pave the way for a richer exploration of the vast combinatorial chemical space in drug discovery.
Abstract:Identifying reaction coordinates(RCs) is an active area of research, given the crucial role RCs play in determining the progress of a chemical reaction. The choice of the reaction coordinate is often based on heuristic knowledge. However, an essential criterion for the choice is that the coordinate should capture both the reactant and product states unequivocally. Also, the coordinate should be the slowest one so that all the other degrees of freedom can easily equilibrate along the reaction coordinate. Also, the coordinate should be the slowest one so that all the other degrees of freedom can easily equilibrate along the reaction coordinate. We used a regularised sparse autoencoder, an energy-based model, to discover a crucial set of reaction coordinates. Along with discovering reaction coordinates, our model also predicts the evolution of a molecular dynamics(MD) trajectory. We showcased that including sparsity enforcing regularisation helps in choosing a small but important set of reaction coordinates. We used two model systems to demonstrate our approach: alanine dipeptide system and proflavine and DNA system, which exhibited intercalation of proflavine into DNA minor groove in an aqueous environment. We model MD trajectory as a multivariate time series, and our latent variable model performs the task of multi-step time series prediction. This idea is inspired by the popular sparse coding approach - to represent each input sample as a linear combination of few elements taken from a set of representative patterns.