Abstract:Early and accurate detection of Parkinson's disease (PD) remains a critical challenge in medical diagnostics due to the subtlety of early-stage symptoms and the complex, non-linear relationships inherent in biomedical data. Traditional machine learning (ML) models, though widely applied to PD detection, often rely on extensive feature engineering and struggle to capture complex feature interactions. This study investigates the effectiveness of attention-based deep learning models for early PD detection using tabular biomedical data. We present a comparative evaluation of four classification models: Multi-Layer Perceptron (MLP), Gradient Boosting, TabNet, and SAINT, using a benchmark dataset from the UCI Machine Learning Repository consisting of biomedical voice measurements from PD patients and healthy controls. Experimental results show that SAINT consistently outperformed all baseline models across multiple evaluation metrics, achieving a weighted precision of 0.98, weighted recall of 0.97, weighted F1-score of 0.97, a Matthews Correlation Coefficient (MCC) of 0.9990, and the highest Area Under the ROC Curve (AUC-ROC). TabNet and MLP demonstrated competitive performance, while Gradient Boosting yielded the lowest overall scores. The superior performance of SAINT is attributed to its dual attention mechanism, which effectively models feature interactions within and across samples. These findings demonstrate the diagnostic potential of attention-based deep learning architectures for early Parkinson's disease detection and highlight the importance of dynamic feature representation in clinical prediction tasks.




Abstract:Anomaly detection is a critical task that involves the identification of data points that deviate from a predefined pattern, useful for fraud detection and related activities. Various techniques are employed for anomaly detection, but recent research indicates that deep learning methods, with their ability to discern intricate data patterns, are well-suited for this task. This study explores the use of Generative Adversarial Networks (GANs) for anomaly detection in power generation plants. The dataset used in this investigation comprises fuel consumption records obtained from power generation plants operated by a telecommunications company. The data was initially collected in response to observed irregularities in the fuel consumption patterns of the generating sets situated at the company's base stations. The dataset was divided into anomalous and normal data points based on specific variables, with 64.88% classified as normal and 35.12% as anomalous. An analysis of feature importance, employing the random forest classifier, revealed that Running Time Per Day exhibited the highest relative importance. A GANs model was trained and fine-tuned both with and without data augmentation, with the goal of increasing the dataset size to enhance performance. The generator model consisted of five dense layers using the tanh activation function, while the discriminator comprised six dense layers, each integrated with a dropout layer to prevent overfitting. Following data augmentation, the model achieved an accuracy rate of 98.99%, compared to 66.45% before augmentation. This demonstrates that the model nearly perfectly classified data points into normal and anomalous categories, with the augmented data significantly enhancing the GANs' performance in anomaly detection. Consequently, this study recommends the use of GANs, particularly when using large datasets, for effective anomaly detection.