Abstract:Comorbidity, the co-occurrence of multiple medical conditions in a single patient, profoundly impacts disease management and outcomes. Understanding these complex interconnections is crucial, especially in contexts where comorbidities exacerbate outcomes. Leveraging insights from the human interactome (HI) and advancements in graph-based methodologies, this study introduces Transformer with Subgraph Positional Encoding (TSPE) for disease comorbidity prediction. Inspired by Biologically Supervised Embedding (BSE), TSPE employs Transformer's attention mechanisms and Subgraph Positional Encoding (SPE) to capture interactions between nodes and disease associations. Our proposed SPE proves more effective than LPE, as used in Dwivedi et al.'s Graph Transformer, underscoring the importance of integrating clustering and disease-specific information for improved predictive accuracy. Evaluated on real clinical benchmark datasets (RR0 and RR1), TSPE demonstrates substantial performance enhancements over the state-of-the-art method, achieving up to 28.24% higher ROC AUC and 4.93% higher accuracy. This method shows promise for adaptation to other complex graph-based tasks and applications. The source code is available in the GitHub repository at: https://github.com/xihan-qin/TSPE-GraphTransformer.
Abstract:Comorbidity carries significant implications for disease understanding and management. The genetic causes for comorbidity often trace back to mutations occurred either in the same gene associated with two diseases or in different genes associated with different diseases respectively but coming into connection via protein-protein interactions. Therefore, human interactome has been used in more sophisticated study of disease comorbidity. Human interactome, as a large incomplete graph, presents its own challenges to extracting useful features for comorbidity prediction. In this work, we introduce a novel approach named Biologically Supervised Graph Embedding (BSE) to allow for selecting most relevant features to enhance the prediction accuracy of comorbid disease pairs. Our investigation into BSE's impact on both centered and uncentered embedding methods showcases its consistent superiority over the state-of-the-art techniques and its adeptness in selecting dimensions enriched with vital biological insights, thereby improving prediction performance significantly, up to 50% when measured by ROC for some variations. Further analysis indicates that BSE consistently and substantially improves the ratio of disease associations to gene connectivity, affirming its potential in uncovering latent biological factors affecting comorbidity. The statistically significant enhancements across diverse metrics underscore BSE's potential to introduce novel avenues for precise disease comorbidity predictions and other potential applications. The GitHub repository containing the source code can be accessed at the following link: https://github.com/xihan-qin/Biologically-Supervised-Graph-Embedding.