Abstract:Alzheimer's disease (AD) is a complex, progressive neurodegenerative disorder characterized by extracellular A\b{eta} plaques, neurofibrillary tau tangles, glial activation, and neuronal degeneration, involving multiple cell types and pathways. Current models often overlook the cellular context of these pathways. To address this, we developed a multiscale graph neural network (GNN) model, ALZ PINNACLE, using brain omics data from donors spanning the entire aging to AD spectrum. ALZ PINNACLE is based on the PINNACLE GNN framework, which learns context-aware protein, cell type, and tissue representations within a unified latent space. ALZ PINNACLE was trained on 14,951 proteins, 206,850 protein interactions, 7 cell types, and 48 cell subtypes or states. After pretraining, we investigated the learned embedding of APOE, the largest genetic risk factor for AD, across different cell types. Notably, APOE embeddings showed high similarity in microglial, neuronal, and CD8 cells, suggesting a similar role of APOE in these cell types. Fine tuning the model on AD risk genes revealed cell type contexts predictive of the role of APOE in AD. Our results suggest that ALZ PINNACLE may provide a valuable framework for uncovering novel insights into AD neurobiology.
Abstract:Alzheimer's disease is a progressive, debilitating neurodegenerative disease that affects 50 million people globally. Despite this substantial health burden, available treatments for the disease are limited and its fundamental causes remain poorly understood. Previous work has suggested the existence of clinically-meaningful sub-types, which it is suggested may correspond to distinct etiologies, disease courses, and ultimately appropriate treatments. Here, we use unsupervised learning techniques on electronic health records (EHRs) from a cohort of memory disorder patients to characterise heterogeneity in this disease population. Pre-trained embeddings for medical codes as well as transformer-derived Clinical BERT embeddings of free text are used to encode patient EHRs. We identify the existence of sub-populations on the basis of comorbidities and shared textual features, and discuss their clinical significance.