Abstract:Parkinson's disease (PD) is a chronic neurodegenerative disease. Early diagnosis is essential to mitigate the progressive deterioration of patients' quality of life. The most characteristic motor symptoms are very mild in the early stages, making diagnosis difficult. Recent studies have shown that the use of patient voice recordings can aid in early diagnosis. Although the analysis of such recordings is costly from a clinical point of view, advances in machine learning techniques are making the processing of this type of data increasingly accurate and efficient. Vocal recordings contain many features, but it is not known whether all of them are relevant for diagnosing the disease. This paper proposes the use of different types of machine learning models combined with feature selection methods to detect the disease. The selection techniques allow to reduce the number of features used by the classifiers by determining which ones provide the most information about the problem. The results show that machine learning methods, in particular neural networks, are suitable for PD classification and that the number of features can be significantly reduced without affecting the performance of the models.
Abstract:Nowadays, machine learning algorithms continue to grow in complexity and require a substantial amount of computational resources and energy. For these reasons, there is a growing awareness of the development of new green algorithms and distributed AI can contribute to this. Federated learning (FL) is one of the most active research lines in machine learning, as it allows the training of collaborative models in a distributed way, an interesting option in many real-world environments, such as the Internet of Things, allowing the use of these models in edge computing devices. In this work, we present a FL method, based on a neural network without hidden layers, capable of generating a global collaborative model in a single training round, unlike traditional FL methods that require multiple rounds for convergence. This allows obtaining an effective and efficient model that simplifies the management of the training process. Moreover, this method preserve data privacy by design, a crucial aspect in current data protection regulations. We conducted experiments with large datasets and a large number of federated clients. Despite being based on a network model without hidden layers, it maintains in all cases competitive accuracy results compared to more complex state-of-the-art machine learning models. Furthermore, we show that the method performs equally well in both identically and non-identically distributed scenarios. Finally, it is an environmentally friendly algorithm as it allows significant energy savings during the training process compared to its centralized counterpart.