Abstract:We present a machine learning based COVID-19 cough classifier which is able to discriminate COVID-19 positive coughs from both COVID-19 negative and healthy coughs recorded on a smartphone. This type of screening is non-contact and easily applied, and could help reduce workload in testing centers as well as limit transmission by recommending early self-isolation to those who have a cough suggestive of COVID-19. The two dataset used in this study include subjects from all six continents and contain both forced and natural coughs. The publicly available Coswara dataset contains 92 COVID-19 positive and 1079 healthy subjects, while the second smaller dataset was collected mostly in South Africa and contains 8 COVID-19 positive and 13 COVID-19 negative subjects who have undergone a SARS-CoV laboratory test. Dataset skew was addressed by applying synthetic minority oversampling (SMOTE) and leave-p-out cross validation was used to train and evaluate classifiers. Logistic regression (LR), support vector machines (SVM), multilayer perceptrons (MLP), convolutional neural networks (CNN), long-short term memory (LSTM) and a residual-based neural network architecture (Resnet50) were considered as classifiers. Our results show that the Resnet50 classifier was best able to discriminate between the COVID-19 positive and the healthy coughs with an area under the ROC curve (AUC) of 0.98 while a LSTM classifier was best able to discriminate between the COVID-19 positive and COVID-19 negative coughs with an AUC of 0.94. The LSTM classifier achieved these results using 13 features selected by sequential forward search (SFS). Since it can be implemented on a smartphone, cough audio classification is cost-effective and easy to apply and deploy, and therefore is potentially a useful and viable means of non-contact COVID-19 screening.