Abstract:Mild Cognitive Impairment (MCI) is an early stage of Alzheimer's disease (AD), a form of neurodegenerative disorder. Early identification of MCI is crucial for delaying its progression through timely interventions. Existing research has demonstrated the feasibility of detecting MCI using speech collected from clinical interviews or digital devices. However, these approaches typically analyze data collected at limited time points, limiting their ability to identify cognitive changes over time. This paper presents a longitudinal study using voice assistant systems (VAS) to remotely collect seven-session speech data at three-month intervals across 18 months. We propose two methods to improve MCI detection and the prediction of cognitive changes. The first method incorporates historical data, while the second predicts cognitive changes at two time points. Our results indicate improvements when incorporating historical data: the average F1-score for MCI detection improves from 58.6% to 71.2% (by 12.6%) in the case of acoustic features and from 62.1% to 75.1% (by 13.0%) in the case of linguistic features. Additionally, the prediction of cognitive changes achieves an F1-score of 73.7% in the case of acoustic features. These results confirm the potential of VAS-based speech sessions for early detection of cognitive decline.
Abstract:Adaptation methods are developed to adapt depth foundation models to endoscopic depth estimation recently. However, such approaches typically under-perform training since they limit the parameter search to a low-rank subspace and alter the training dynamics. Therefore, we propose a full-parameter and parameter-efficient learning framework for endoscopic depth estimation. At the first stage, the subspace of attention, convolution and multi-layer perception are adapted simultaneously within different sub-spaces. At the second stage, a memory-efficient optimization is proposed for subspace composition and the performance is further improved in the united sub-space. Initial experiments on the SCARED dataset demonstrate that results at the first stage improves the performance from 10.2% to 4.1% for Sq Rel, Abs Rel, RMSE and RMSE log in the comparison with the state-of-the-art models.
Abstract:Content-based image retrieval (CBIR) with self-supervised learning (SSL) accelerates clinicians' interpretation of similar images without manual annotations. We develop a CBIR from the contrastive learning SimCLR and incorporate a generalized-mean (GeM) pooling followed by L2 normalization to classify lesion types and retrieve similar images before clinicians' analysis. Results have shown improved performance. We additionally build an open-source application for image analysis and retrieval. The application is easy to integrate, relieving manual efforts and suggesting the potential to support clinicians' everyday activities.