Abstract: Recent advances in healthcare technologies have made large numbers of biological samples available across several techniques and applications. In particular, in the last few years, Raman spectroscopy analysis of biological samples has been successfully applied to early-stage diagnosis. However, the inherent complexity and variability of spectra make manual analysis challenging, even for domain experts. For the same reason, traditional statistical and Machine Learning (ML) techniques may not guarantee accurate and reliable results. ML models combined with robust optimization techniques offer the possibility of improving classification accuracy and enhancing the resilience of predictive models. In this paper, we investigate the performance of a novel robust formulation for Support Vector Machine (SVM) in classifying COVID-19 samples obtained from Raman spectroscopy. Given the noisy and perturbed nature of biological samples, we protect the classification process against uncertainty through the application of robust optimization techniques. Specifically, we derive robust counterpart models of deterministic formulations using bounded-by-norm uncertainty sets around each observation. We explore both linear and kernel-induced classifiers to address binary and multiclass classification tasks. The effectiveness of our approach is validated on real-world COVID-19 datasets provided by Italian hospitals by comparing the results of our simulations with those of a state-of-the-art classifier.
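To make the idea of a bounded-by-norm uncertainty set concrete, the sketch below illustrates the standard textbook robust counterpart for a linear classifier. It is not the novel formulation proposed in this paper. It assumes each observation x_i may be shifted by any perturbation delta with ||delta||_p <= rho, in which case the worst-case margin has the closed form y_i (w·x_i + b) - rho ||w||_q, where q is the norm dual to p. The function names (`dual_norm`, `worst_case_margins`, `robust_hinge_loss`) and the radius parameter `rho` are hypothetical names chosen for illustration.

```python
import numpy as np


def dual_norm(w, p):
    """Norm dual to the l_p norm, for p in {1, 2, inf} (illustrative helper)."""
    if p == 1:
        return np.max(np.abs(w))        # dual of l_1 is l_inf
    if p == 2:
        return np.sqrt(np.sum(w ** 2))  # l_2 is self-dual
    if p == np.inf:
        return np.sum(np.abs(w))        # dual of l_inf is l_1
    raise ValueError("p must be 1, 2 or np.inf")


def worst_case_margins(w, b, X, y, rho, p=2):
    """Smallest margin y_i * (w @ (x_i + delta) + b) over all ||delta||_p <= rho.

    Assumption: every observation shares the same uncertainty radius rho.
    """
    nominal = y * (X @ w + b)
    return nominal - rho * dual_norm(w, p)


def robust_hinge_loss(w, b, X, y, rho, p=2):
    """Mean hinge loss evaluated at the worst-case margins."""
    return np.mean(np.maximum(0.0, 1.0 - worst_case_margins(w, b, X, y, rho, p)))


# Toy data, purely illustrative (not taken from the paper's datasets).
w = np.array([3.0, 4.0])
b = 0.5
X = np.array([[1.0, 0.0], [0.0, -1.0]])
y = np.array([1.0, -1.0])
rho = 0.1

margins_l2 = worst_case_margins(w, b, X, y, rho, p=2)
margins_linf = worst_case_margins(w, b, X, y, rho, p=np.inf)
```

Setting `rho = 0` recovers the ordinary (deterministic) margins, so the radius acts as a knob trading nominal fit for protection against perturbed spectra.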