Abstract:This paper presents a comprehensive comparative analysis of three prominent clustering algorithms (K-means, DBSCAN, and Spectral Clustering) on high-dimensional datasets. We introduce a novel evaluation framework that assesses clustering performance across multiple dimensionality reduction techniques (PCA, t-SNE, and UMAP) using diverse quantitative metrics. Experiments conducted on the MNIST, Fashion-MNIST, and UCI HAR datasets reveal that preprocessing with UMAP consistently improves clustering quality across all algorithms, with Spectral Clustering demonstrating superior performance on complex manifold structures. Our findings show that algorithm selection should be guided by data characteristics, with K-means excelling in computational efficiency, DBSCAN in handling irregular clusters, and Spectral Clustering in capturing complex relationships. This research contributes a systematic approach for evaluating and selecting clustering techniques for high-dimensional data applications.
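The abstract above does not name its quantitative metrics. As one plausible illustration, the sketch below computes the silhouette coefficient, a common internal clustering metric, in pure Python. The choice of metric, the function name `silhouette_score`, and the toy points are all assumptions made for illustration and are not taken from the paper.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Toy data (hypothetical): two tight, well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # labels follow the real groups
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across the groups
```

A labeling that matches the real groups scores close to 1, while one that cuts across them scores below 0. In an evaluation of the kind described, the same function could be applied to the labels each algorithm produces on each reduced embedding.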
Abstract:With the rapid growth of IoT devices, ensuring robust network security has become a critical challenge. Traditional intrusion detection systems (IDSs) often face limitations in detecting sophisticated attacks within high-dimensional and complex data environments. This paper presents a novel approach to network anomaly detection using hyperdimensional computing (HDC) techniques, specifically applied to the NSL-KDD dataset. The proposed method leverages the efficiency of HDC in processing large-scale data to identify both known and unknown attack patterns. The model achieved an accuracy of 91.55% on the KDDTrain+ subset, outperforming traditional approaches. These comparative evaluations underscore the model's superior performance, highlighting its potential in advancing anomaly detection for IoT networks and contributing to more secure and intelligent cybersecurity solutions.
Abstract:The rapid expansion of Internet of Things (IoT) networks has introduced new security challenges, necessitating efficient and reliable methods for intrusion detection. In this study, a detection framework based on hyperdimensional computing (HDC) is proposed to identify and classify network intrusions using the NSL-KDD dataset, a standard benchmark for intrusion detection systems. By leveraging the capabilities of HDC, including high-dimensional representation and efficient computation, the proposed approach effectively distinguishes various attack categories such as DoS, probe, R2L, and U2R, while accurately identifying normal traffic patterns. Comprehensive evaluations demonstrate that the proposed method achieves an accuracy of 99.54%, significantly outperforming conventional intrusion detection techniques, making it a promising solution for IoT network security. This work emphasizes the critical role of robust and precise intrusion detection in safeguarding IoT systems against evolving cyber threats.
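Neither this abstract nor the preceding one spells out its HDC pipeline. The sketch below shows one generic hyperdimensional classifier built from random bipolar hypervectors: elementwise multiplication binds a feature to its value, a majority vote bundles vectors, and prediction picks the most similar class prototype. The dimensionality, the class name `HDCClassifier`, the feature names, and the toy records are hypothetical and do not reproduce either paper's encoding of NSL-KDD.

```python
import random

DIM = 2048  # hypervector dimensionality (assumed value)


def random_hv(rng):
    """Random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def bind(a, b):
    """Binding: elementwise multiplication associates a feature with a value."""
    return [x * y for x, y in zip(a, b)]


def bundle(hvs):
    """Bundling: elementwise majority vote (ties resolved to +1)."""
    return [1 if sum(col) >= 0 else -1 for col in zip(*hvs)]


def similarity(a, b):
    """Normalized dot product, in [-1, 1]."""
    return sum(x * y for x, y in zip(a, b)) / DIM


class HDCClassifier:
    def __init__(self, feature_names, levels, seed=0):
        rng = random.Random(seed)
        self.keys = {name: random_hv(rng) for name in feature_names}
        self.values = {level: random_hv(rng) for level in levels}
        self.prototypes = {}

    def encode(self, record):
        # One bound (feature, value) hypervector per field, bundled into one.
        return bundle([bind(self.keys[k], self.values[v]) for k, v in record.items()])

    def fit(self, records, labels):
        by_class = {}
        for record, label in zip(records, labels):
            by_class.setdefault(label, []).append(self.encode(record))
        # Each class prototype is the bundle of its encoded training records.
        self.prototypes = {label: bundle(hvs) for label, hvs in by_class.items()}

    def predict(self, record):
        hv = self.encode(record)
        return max(self.prototypes, key=lambda label: similarity(self.prototypes[label], hv))


# Toy categorical records (hypothetical, loosely styled after connection logs).
train = [
    {"protocol": "tcp", "flag": "SF", "rate": "low"},
    {"protocol": "udp", "flag": "SF", "rate": "low"},
    {"protocol": "tcp", "flag": "S0", "rate": "high"},
    {"protocol": "icmp", "flag": "S0", "rate": "high"},
]
train_labels = ["normal", "normal", "DoS", "DoS"]

clf = HDCClassifier(
    feature_names=["protocol", "flag", "rate"],
    levels=["tcp", "udp", "icmp", "SF", "S0", "low", "high"],
)
clf.fit(train, train_labels)
prediction = clf.predict({"protocol": "udp", "flag": "S0", "rate": "high"})
```

In this toy setup the unseen record shares two of its three bound features with the DoS examples, so its hypervector lies closer to that prototype. Real traffic would also need numeric features quantized into levels before encoding.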
Abstract:Unsupervised anomaly detection is a promising technique for identifying unusual patterns in data without the need for labeled training examples. This approach is particularly valuable for early case detection in epidemic management, especially when early-stage data are scarce. This research introduces a novel hybrid method for anomaly detection that combines distance and density measures, enhancing its applicability across various infectious diseases. Our method is especially relevant in pandemic situations, as demonstrated during the COVID-19 crisis, where traditional supervised classification methods fall short due to limited data. The efficacy of our method is evaluated using COVID-19 chest X-ray data, where it significantly outperforms established unsupervised techniques. It achieves an average AUC of 77.43%, surpassing the AUC of Isolation Forest at 73.66% and KNN at 52.93%. These results highlight the potential of our hybrid anomaly detection method to improve early detection capabilities in diverse epidemic scenarios, thereby facilitating more effective and timely responses.
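The abstract above does not say how its distance and density measures are combined. The sketch below shows one simple way to do it: a k-nearest-neighbour distance term and a neighbourhood-sparsity term are each min-max normalized and then averaged, and the resulting scores are evaluated with a pairwise AUC. The combination rule, the parameters `k`, `radius`, and `weight`, and the toy points are assumptions and should not be read as the authors' method.

```python
import math


def knn_distance(points, i, k):
    """Distance measure: mean distance from point i to its k nearest neighbours."""
    dists = sorted(math.dist(points[i], p) for j, p in enumerate(points) if j != i)
    return sum(dists[:k]) / k


def sparsity(points, i, radius):
    """Density measure: inverse of (1 + number of neighbours within radius)."""
    neighbours = sum(
        1 for j, p in enumerate(points)
        if j != i and math.dist(points[i], p) <= radius
    )
    return 1.0 / (1 + neighbours)


def min_max(values):
    """Rescale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_scores(points, k=2, radius=1.5, weight=0.5):
    """Weighted average of the normalized distance and density terms."""
    n = len(points)
    dist_part = min_max([knn_distance(points, i, k) for i in range(n)])
    dens_part = min_max([sparsity(points, i, radius) for i in range(n)])
    return [weight * d + (1 - weight) * s for d, s in zip(dist_part, dens_part)]


def auc(scores, labels):
    """AUC as the fraction of (anomaly, normal) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): five clustered points and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (6.0, 6.0)]
labels = [0, 0, 0, 0, 0, 1]  # 1 marks the anomaly; used only for evaluation

scores = hybrid_scores(points)
result = auc(scores, labels)
```

The labels play no role in scoring, which keeps the detector unsupervised. They are used only afterwards to compute the AUC, the metric the abstract reports.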
Abstract:Chronic Kidney Disease (CKD) is a widespread chronic disease with no known definitive cure and high morbidity. Research demonstrates that progressive CKD is a heterogeneous disorder that significantly impacts kidney structure and function, eventually leading to kidney failure. Over time, CKD has moved from a life-threatening disease affecting few people to a common disorder of varying severity. The goal of this research is to visualize dominating features, feature scores, and values exhibited for early prognosis and detection of CKD using ensemble learning and explainable AI. To that end, an AI-driven predictive analytics approach is proposed to aid clinical practitioners in prescribing lifestyle modifications for individual patients to reduce the rate of progression of this disease. Our dataset comprises body vitals collected from individuals with CKD and from healthy subjects. Blood and urine test results are provided, and ensemble tree-based machine-learning models are applied to predict unseen cases of CKD. Our research findings are validated after lengthy consultations with nephrologists. Our experiments and interpretation results are compared with existing explainable AI applications in various healthcare domains, including CKD. The comparison shows that our developed AI models, particularly the Random Forest model, have identified more features as significant contributors than XGBoost. Interpretability (I), which measures the ratio of important to masked features, indicates that our XGBoost model achieved a higher score, specifically a Fidelity of 98%, in this metric and naturally in the FII index compared to competing models.