Abstract:Machine learning models have been widely adopted in several fields. However, most recent studies have shown several vulnerabilities from attacks with a potential to jeopardize the integrity of the model, presenting a new window of research opportunity in terms of cyber-security. This survey is conducted with a main intention of highlighting the most relevant information related to security vulnerabilities in the context of machine learning (ML) classifiers; more specifically, directed towards training procedures against data poisoning attacks, representing a type of attack that consists of tampering the data samples fed to the model during the training phase, leading to a degradation in the models accuracy during the inference phase. This work compiles the most relevant insights and findings found in the latest existing literatures addressing this type of attacks. Moreover, this paper also covers several defense techniques that promise feasible detection and mitigation mechanisms, capable of conferring a certain level of robustness to a target model against an attacker. A thorough assessment is performed on the reviewed works, comparing the effects of data poisoning on a wide range of ML models in real-world conditions, performing quantitative and qualitative analyses. This paper analyzes the main characteristics for each approach including performance success metrics, required hyperparameters, and deployment complexity. Moreover, this paper emphasizes the underlying assumptions and limitations considered by both attackers and defenders along with their intrinsic properties such as: availability, reliability, privacy, accountability, interpretability, etc. Finally, this paper concludes by making references of some of main existing research trends that provide pathways towards future research directions in the field of cyber-security.
Abstract:Deep learning applied to electrocardiogram (ECG) data can be used to achieve personal authentication in biometric security applications, but it has not been widely used to diagnose cardiovascular disorders. We developed a deep learning model for the detection of arrhythmia in which time-sliced ECG data representing the distance between successive R-peaks are used as the input for a convolutional neural network (CNN). The main objective is developing the compact deep learning based detect system which minimally uses the dataset but delivers the confident accuracy rate of the Arrhythmia detection. This compact system can be implemented in wearable devices or real-time monitoring equipment because the feature extraction step is not required for complex ECG waveforms, only the R-peak data is needed. The results of both tests indicated that the Compact Arrhythmia Detection System (CADS) matched the performance of conventional systems for the detection of arrhythmia in two consecutive test runs. All features of the CADS are fully implemented and publicly available in MATLAB.
Abstract:Traditional authentication systems use alphanumeric or graphical passwords, or token-based techniques that require "something you know and something you have". The disadvantages of these systems include the risks of forgetfulness, loss, and theft. To address these shortcomings, biometric authentication is rapidly replacing traditional authentication methods and is becoming an everyday part of life. The electrocardiogram (ECG) is one of the most recent traits considered for biometric purposes, and three typical use cases have been described: security checks, hospitals and wearable devices. Here we describe an ECG-based authentication system suitable for security checks and hospital environments. The proposed authentication system will help investigators studying ECG-based biometric authentication techniques to define dataset boundaries and to acquire high-quality training data. We evaluated the performance of the proposed system using a confusion matrix and also by applying the Amang ECG (amgecg) toolbox in MATLAB to investigate two parameters that directly affect the accuracy of authentication: the ECG slicing time (sliding window) and sampling time. Using this approach, we found that accuracy was optimized by using a sliding window of 0.4 s and a sampling time of 37 s.
Abstract:This paper introduces a framework for how to appropriately adopt and adjust Machine Learning (ML) techniques used to construct Electrocardiogram (ECG) based biometric authentication schemes. The proposed framework can help investigators and developers on ECG based biometric authentication mechanisms define the boundaries of required datasets and get training data with good quality. To determine the boundaries of datasets, use case analysis is adopted. Based on various application scenarios on ECG based authentication, three distinct use cases (or authentication categories) are developed. With more qualified training data given to corresponding machine learning schemes, the precision on ML-based ECG biometric authentication mechanisms is increased in consequence. ECG time slicing technique with the R-peak anchoring is utilized in this framework to acquire ML training data with good quality. In the proposed framework four new measure metrics are introduced to evaluate the quality of ML training and testing data. In addition, a Matlab toolbox, containing all proposed mechanisms, metrics and sample data with demonstrations using various ML techniques, is developed and publicly available for further investigation. For developing ML-based ECG biometric authentication, the proposed framework can guide researchers to prepare the proper ML setups and the ML training datasets along with three identified user case scenarios. For researchers adopting ML techniques to design new schemes in other research domains, the proposed framework is still useful for generating ML-based training and testing datasets with good quality and utilizing new measure metrics.