Abstract: Parkinson's disease (PD) poses a growing challenge due to its increasing prevalence, complex pathology, and functional ramifications. Electroencephalography (EEG), when integrated with artificial intelligence (AI), holds promise for monitoring disease progression, identifying sub-phenotypes, and personalizing treatment strategies. However, the effect of medication state on AI model learning and generalization remains poorly understood, which may limit the clinical applicability of EEG-based AI models. This study evaluates how medication state influences the training and generalization of EEG-based AI models. We used paired EEG recordings from individuals with PD in both ON- and OFF-medication states. AI models were trained on recordings from each state separately and evaluated on independent test sets representing both ON- and OFF-medication conditions. Model performance was assessed using multiple metrics, with accuracy (ACC) as the primary outcome. Statistical significance was assessed via permutation testing (p < 0.05). Our results reveal that models trained on OFF-medication data exhibited consistent but suboptimal performance across both medication states (ACC_OFF-ON = 55.3 ± 8.8 and ACC_OFF-OFF = 56.2 ± 8.7). In contrast, models trained on ON-medication data demonstrated significantly higher performance on ON-medication recordings (ACC_ON-ON = 80.7 ± 7.1) but significantly reduced generalization to OFF-medication data (ACC_ON-OFF = 76.0 ± 7.2). Notably, models trained on ON-medication data consistently outperformed those trained on OFF-medication data within their respective states (ACC_ON-ON = 80.7 ± 7.1 vs. ACC_OFF-OFF = 56.2 ± 8.7). Our findings suggest that medication state significantly influences the patterns learned by AI models. Addressing this challenge is essential to enhance the robustness and clinical utility of AI models for PD characterization and management.
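The abstract above reports significance via permutation testing but does not describe the exact procedure. As a minimal sketch of one common variant, the following compares two sets of accuracies with a two-sided permutation test on the difference of means. The function name `permutation_p_value`, the number of permutations, and the accuracy values are all hypothetical and are not taken from the study.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the absolute difference of group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign the pooled values to the two groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical per-fold accuracies (illustrative only, not study data).
acc_on_on = [81.0, 79.5, 83.2, 78.8, 80.9, 82.1]
acc_on_off = [75.2, 77.0, 74.1, 76.8, 75.9, 77.3]

p_value = permutation_p_value(acc_on_on, acc_on_off)
print(f"p = {p_value:.4f}")
```

With these illustrative values, every accuracy in the first group exceeds every accuracy in the second, so very few random relabelings reproduce the observed gap and the estimated p-value falls well below 0.05.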
Abstract: As the number of automatic tools based on machine learning (ML) and resting-state electroencephalography (rs-EEG) for Parkinson's disease (PD) detection keeps growing, assessing whether they may exacerbate health disparities, by means of fairness and bias analysis, becomes more relevant. Protected attributes, such as gender, play an important role in PD diagnosis development. However, the analysis of sub-group populations stemming from different genders is seldom taken into consideration in the development or performance assessment of ML models for PD detection. In this work, we perform a systematic analysis of the detection ability for gender sub-groups in a multi-center setting of a previously developed ML algorithm based on power spectral density (PSD) features of rs-EEG. We find significant differences in the PD detection ability for males and females at testing time (80.5% vs. 63.7% accuracy) and significantly higher activity for a set of parietal and frontal EEG channels and frequency sub-bands for PD and non-PD males, which might explain the differences in the PD detection ability for the gender sub-groups.
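The sub-group comparison described above comes down to computing the same metric separately for each value of a protected attribute. A minimal sketch, assuming binary labels (1 = PD, 0 = non-PD) and a per-subject group tag; the helper `accuracy_by_group` and the toy data are hypothetical and do not reproduce the study's results.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return the accuracy computed separately for each group label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


# Toy labels and predictions (illustrative only, not study data).
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 0, 1, 1]
groups = ["M", "M", "M", "M", "F", "F", "F", "F"]

per_group = accuracy_by_group(y_true, y_pred, groups)
print(per_group)  # {'M': 0.75, 'F': 0.5}
```

A gap between the per-group values, as in this toy case, is the kind of disparity that an aggregate accuracy figure alone would hide.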
Abstract: Resting-state EEG (rs-EEG) has been demonstrated to aid in Parkinson's disease (PD) diagnosis. In particular, the power spectral density (PSD) of low-frequency bands (δ and θ) and high-frequency bands (α and β) has been shown to be significantly different in patients with PD as compared to subjects without PD (non-PD). However, rs-EEG feature extraction and the interpretation thereof can be time-intensive and prone to examiner variability. Machine learning (ML) has the potential to automate the analysis of rs-EEG recordings and provide a supportive tool for clinicians to ease their workload. In this work, we use rs-EEG recordings of 84 PD and 85 non-PD subjects pooled from four datasets obtained at different centers. We propose an end-to-end pipeline consisting of preprocessing, extraction of PSD features from clinically validated frequency bands, and feature selection, followed by an evaluation of how well ML algorithms can use the selected features to distinguish between PD and non-PD subjects. Further, we evaluate the effect of feature harmonization, given the multi-center nature of the datasets. Our validation results show, on average, an improvement in PD detection ability (69.6% vs. 75.5% accuracy) by logistic regression when harmonizing the features and performing univariate feature selection (k = 202 features). Our final results show an average global accuracy of 72.2%, with balanced accuracies of 60.6%, 68.7%, 77.7%, and 82.2% for the four centers included in the study.
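The PSD feature extraction step mentioned above can be illustrated with a small band-power computation. This is a sketch under stated assumptions: the band edges in `BANDS` are common conventions rather than values given in the abstract, and a plain DFT periodogram stands in for whatever PSD estimator the study actually used (Welch's method is a frequent choice in practice). All names here are hypothetical.

```python
import cmath
import math

# Assumed band edges in Hz; the abstract does not specify them.
BANDS = {
    "delta": (1.0, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
}


def periodogram(signal, fs):
    """Return (frequencies, power) at non-negative frequencies via a plain DFT."""
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal)
        )
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / (fs * n))
    return freqs, power


def band_powers(signal, fs):
    """Sum the periodogram bins that fall inside each band [low, high)."""
    freqs, power = periodogram(signal, fs)
    return {
        name: sum(p for f, p in zip(freqs, power) if low <= f < high)
        for name, (low, high) in BANDS.items()
    }


# Synthetic single-channel signal: a strong 10 Hz and a weak 20 Hz component.
fs = 128
times = [i / fs for i in range(2 * fs)]
signal = [
    math.sin(2 * math.pi * 10 * t) + 0.2 * math.sin(2 * math.pi * 20 * t)
    for t in times
]

features = band_powers(signal, fs)
dominant_band = max(features, key=features.get)
print(dominant_band)  # alpha
```

In a multi-channel setting, one such dictionary per channel would typically be flattened into a feature vector before harmonization, feature selection, and classification.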