Abstract:Speech recordings are being more frequently used to detect and monitor disease, leading to privacy concerns. Beyond cryptography, protection of speech can be addressed by approaches, such as perturbation, disentanglement, and re-synthesis, that eliminate sensitive information of the speaker, leaving the information necessary for medical analysis purposes. In order for such privacy protective approaches to be developed, clear and systematic specifications of assumptions concerning medical settings and the needs of medical professionals are necessary. In this paper, we propose a Scenario of Use Scheme that incorporates an Attacker Model, which characterizes the adversary against whom the speaker's privacy must be defended, and a Protector Model, which specifies the defense. We discuss the connection of the scheme with previous work on speech privacy. Finally, we present a concrete example of a specified Scenario of Use and a set of experiments about protecting speaker data against gender inference attacks while maintaining utility for Parkinson's detection.
Abstract:Parkinson's disease (PD), the second most prevalent neurodegenerative disorder worldwide, frequently presents with early-stage speech impairments. Recent advancements in Artificial Intelligence (AI), particularly deep learning (DL), have significantly enhanced PD diagnosis through the analysis of speech data. Nevertheless, the progress of research is restricted by the limited availability of publicly accessible speech-based PD datasets, primarily due to privacy and ethical concerns. This review covers the latest DL-based AI approaches for speech-based PD classification, focusing on performance, available resources and associated challenges of 33 scientific works published between 2020 and March 2024. These DL approaches are categorized into end-to-end (E2E) learning, transfer learning (TL) and deep acoustic features (DAF) extraction. Among E2E approaches, Convolutional Neural Networks (CNNs) are prevalent, though Transformers are increasingly popular. E2E approaches face challenges such as limited data and computational resources, especially with Transformers. TL addresses these issues by providing more robust PD diagnosis and better generalizability across languages. DAF extraction aims to improve the explainability and interpretability of results by examining the specific effects of deep features on both other DL approaches and more traditional machine learning (ML) methods. However, it often underperforms compared to E2E and TL approaches. This review also discusses unresolved issues related to bias, explainability and privacy, highlighting the need for future research.