Abstract: While machine learning (ML) includes a valuable array of tools for analyzing biomedical data, significant time and expertise are required to assemble effective, rigorous, and unbiased pipelines. Automated ML (AutoML) tools seek to facilitate ML application by automating a subset of analysis pipeline elements. In this study we develop and validate a Simple, Transparent, End-to-end Automated Machine Learning Pipeline (STREAMLINE) and apply it to investigate the added utility of photography-based phenotypes for predicting obstructive sleep apnea (OSA), a common and underdiagnosed condition associated with a variety of health, economic, and safety consequences. STREAMLINE is designed to tackle biomedical binary classification tasks while adhering to best practices and accommodating complexity, scalability, reproducibility, customization, and model interpretation. Benchmarking analyses validated the efficacy of STREAMLINE across data simulations with increasingly complex patterns of association. We then applied STREAMLINE to evaluate the utility of demographics (DEM), self-reported comorbidities (DX), symptoms (SYM), and photography-based craniofacial (CF) and intraoral (IO) anatomy measures in predicting any OSA or moderate/severe OSA, using 3,111 participants from the Sleep Apnea Global Interdisciplinary Consortium (SAGIC). OSA analyses identified a significant increase in ROC-AUC when adding CF to DEM+DX+SYM to predict moderate/severe OSA. A consistent but non-significant increase in PRC-AUC was observed with the addition of each subsequent feature set to predict any OSA, with CF and IO yielding minimal improvements. Application of STREAMLINE to OSA data suggests that CF features provide additional value in predicting moderate/severe OSA, but neither CF nor IO features meaningfully improved the prediction of any OSA beyond established demographic, comorbidity, and symptom characteristics.
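The two evaluation metrics named in this abstract, ROC-AUC and PRC-AUC, can be illustrated with a minimal pure-Python sketch. This is not STREAMLINE's own code: the function names `roc_auc` and `average_precision` are hypothetical, ROC-AUC is computed here through its pairwise-ranking (Mann-Whitney) formulation, and average precision is used as one common step-wise estimate of PRC-AUC, assuming binary labels coded 0/1 and no tied scores.

```python
def roc_auc(y_true, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied pairs count as half a correct ranking.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Average precision: mean of the precision at each true positive,
    walking down the list from the highest score to the lowest.

    Assumption: scores are untied, so the ranking is unambiguous.
    """
    n_pos = sum(1 for y in y_true if y == 1)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    true_pos = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            true_pos += 1
            total += true_pos / rank  # precision at this recall step
    return total / n_pos
```

Comparing nested feature sets such as DEM+DX+SYM with and without CF would then amount to calling both functions on the held-out predictions of each model. PRC-AUC is the more sensitive of the two to class imbalance, which is one reason to report both.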
Abstract: In this study, we present the development of an automatic algorithm that classifies the nocturnal audio recording of an obstructive sleep apnoea (OSA) patient into OSA-related snore, simple snore, and other sounds. Recent studies have shown that knowledge of OSA-related snore could assist in identifying the site of airway collapse. The audio signal was recorded with a ceiling microphone during sleep, simultaneously with full-night polysomnography. Time and frequency features of the nocturnal audio signal were extracted to classify the signal into OSA-related snore, simple snore, and other sounds. Two algorithms were developed to extract OSA-related snore using a linear discriminant analysis (LDA) classifier, based on the hypothesis that OSA-related snoring can assist in identifying the site of upper airway collapse. An unbiased nested leave-one-patient-out cross-validation process was used to select a high-performing feature set from the full set of features. Results indicated that the algorithm achieved an accuracy of 87% for identifying snore events from the audio recordings and an accuracy of 72% for identifying OSA-related snore events from the snore events. The direct method, which extracts OSA-related snore events using a multi-class LDA classifier, achieved an accuracy of 64% using the feature selection algorithm. Our results give a clear indication that OSA-related snore events can be extracted from nocturnal sound recordings and could therefore potentially be used as a new tool for identifying the site of airway collapse.
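The nested leave-one-patient-out procedure mentioned in this abstract can be sketched as follows. This is an illustrative reading, not the authors' implementation: all function names are hypothetical, a nearest-class-mean classifier stands in for LDA to keep the example dependency-free, and an exhaustive search over fixed-size feature subsets stands in for the unspecified feature selection algorithm. The point it shows is that features are chosen in an inner loop that never sees the held-out patient.

```python
from itertools import combinations


def leave_one_patient_out(patients):
    """Yield (patient, train_idx, test_idx), holding out each patient once."""
    for p in sorted(set(patients)):
        train = [i for i, q in enumerate(patients) if q != p]
        test = [i for i, q in enumerate(patients) if q == p]
        yield p, train, test


def fit_centroids(X, y, idx, feats):
    """Per-class mean vectors over the selected features (stand-in for LDA)."""
    grouped = {}
    for i in idx:
        grouped.setdefault(y[i], []).append([X[i][f] for f in feats])
    return {c: [sum(col) / len(rows) for col in zip(*rows)]
            for c, rows in grouped.items()}


def predict(centroids, row, feats):
    """Assign the class whose centroid is nearest in squared distance."""
    vec = [row[f] for f in feats]
    return min(sorted(centroids),
               key=lambda c: sum((a - b) ** 2
                                 for a, b in zip(vec, centroids[c])))


def count_hits(X, y, train, test, feats):
    """Number of correctly classified test events for one train/test split."""
    cents = fit_centroids(X, y, train, feats)
    return sum(predict(cents, X[i], feats) == y[i] for i in test)


def select_features(X, y, patients, train, subset_size):
    """Inner loop: pick the subset with the best leave-one-patient-out
    accuracy, using only events from the outer training patients."""
    inner_patients = [patients[i] for i in train]
    best, best_acc = None, -1.0
    for feats in combinations(range(len(X[0])), subset_size):
        hits = 0
        for _, tr, te in leave_one_patient_out(inner_patients):
            hits += count_hits(X, y, [train[j] for j in tr],
                               [train[j] for j in te], feats)
        acc = hits / len(train)
        if acc > best_acc:
            best, best_acc = feats, acc
    return best


def nested_lopo(X, y, patients, subset_size=1):
    """Outer loop: select features, refit, and score on the held-out patient."""
    hits, chosen = 0, {}
    for p, train, test in leave_one_patient_out(patients):
        feats = select_features(X, y, patients, train, subset_size)
        chosen[p] = feats
        hits += count_hits(X, y, train, test, feats)
    return hits / len(y), chosen
```

Called as `nested_lopo(X, y, patients)`, where `X` is a list of per-event feature rows, `y` the event labels, and `patients` the patient identifier of each event, it returns the pooled held-out accuracy and the feature subset chosen in each outer fold. Splitting by patient rather than by event keeps events from the same person out of both training and test sets, which is what makes the estimate unbiased with respect to patient identity.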