Abstract:Supervised classification recognizes patterns in the data to separate classes of behaviours. Canonical solutions contain misclassification errors that are intrinsic to the approximate numerical nature of machine learning. The data analyst may minimize the classification error on one class at the expense of increasing the error on the other classes. Error control in this design phase is often done heuristically. In this context, it is key to develop theoretical foundations capable of providing probabilistic certifications for the obtained classifiers. To this end, we introduce the concept of probabilistic safety region to describe a subset of the input space in which the number of misclassified instances is probabilistically controlled. The notion of scalable classifiers is then exploited to link the tuning of machine learning with error control. Several tests corroborate the approach. They are carried out on synthetic data, in order to highlight all the steps involved, as well as on a smart mobility application.
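As a rough illustration of how tuning a classifier can be linked to error control, the following minimal sketch assumes a classifier that outputs a real-valued score and declares a point "safe" when the score exceeds an offset rho. The offset is set from the scores of calibration points known to be unsafe, so that at most r - 1 of them fall inside the declared safe region. The names `tune_offset` and `in_safety_region`, the score convention, and the parameter `r` are hypothetical; the abstract does not specify the actual construction or the sample size needed for a probabilistic guarantee.

```python
import numpy as np

def tune_offset(unsafe_scores, r=1):
    """Return the r-th largest score among unsafe calibration points.

    Assumption (not from the abstract): a point is declared safe when
    its score is strictly greater than the returned offset, so at most
    r - 1 unsafe calibration points are declared safe.
    """
    ordered = np.sort(np.asarray(unsafe_scores, dtype=float))[::-1]
    return float(ordered[r - 1])

def in_safety_region(scores, rho):
    """Boolean mask: True where the score exceeds the offset rho."""
    return np.asarray(scores, dtype=float) > rho

# Hypothetical calibration scores of points known to be unsafe.
unsafe_scores = [0.2, -1.0, 0.7, 0.1]
rho = tune_offset(unsafe_scores, r=2)
n_misclassified = int(in_safety_region(unsafe_scores, rho).sum())
```

Increasing `r` enlarges the declared safe region at the cost of admitting more unsafe calibration points, which is the trade-off the abstract refers to.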
Abstract:This document analyzes the role of data-driven methodologies in the Covid-19 pandemic. We provide a SWOT analysis and a roadmap that goes from the access to data sources to the final decision-making step. We aim to review the available methodologies while anticipating the difficulties and challenges in the development of data-driven strategies to combat the Covid-19 pandemic. A 3M-analysis is presented: Monitoring, Modelling and Making decisions. The focus is on the potential of well-known data-driven schemes to address different challenges raised by the pandemic: (i) monitoring and forecasting the spread of the epidemic; (ii) assessing the effectiveness of government decisions; (iii) making timely decisions. Each step of the roadmap is detailed through a review of consolidated theoretical results and their potential application in the Covid-19 context. When possible, we provide examples of their applications on past or present epidemics. We do not provide an exhaustive enumeration of methodologies, algorithms and applications. Instead, we try to serve as a bridge between the different disciplines required to provide a holistic approach to the epidemic: data science, epidemiology, control theory, etc. That is, we highlight effective data-driven methodologies that have been shown to be successful in other contexts and that have potential application in the different steps of the proposed roadmap. To make this document more functional and adapted to the specifics of each discipline, we encourage researchers and practitioners to provide feedback. We will update this document regularly.
Abstract:In recent years, the increasing interest in stochastic model predictive control (SMPC) schemes has highlighted the limitation arising from their inherent computational demand, which has restricted their applicability to slow-dynamics and high-performing systems. To reduce the computational burden, in this paper we extend the probabilistic scaling approach to obtain low-complexity inner approximations of chance-constrained sets. This approach provides probabilistic guarantees at a lower computational cost than other schemes for which the sample complexity depends on the design space dimension. To design candidate simple approximating sets, which approximate the shape of the probabilistic set, we introduce two possibilities: (i) fixed-complexity polytopes, and (ii) $\ell_p$-norm based sets. Once the candidate approximating set is obtained, it is scaled around its center so as to enforce the expected probabilistic guarantees. The resulting scaled set is then exploited to enforce constraints in the classical SMPC framework. The computational gain obtained with the proposed approach with respect to the scenario approach is demonstrated via simulations, where the objective is the control of a fixed-wing UAV performing a monitoring mission over a sloped vineyard.
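The scaling step described above can be sketched as follows, under simplifying assumptions that are not taken from the abstract: each uncertainty sample yields a single linear constraint a_i^T x <= b_i, the candidate set is the $\ell_p$ ball {x : ||x - center||_p <= gamma}, and a hypothetical discarding level `r` selects the r-th smallest per-sample radius. For a single constraint, the largest admissible radius follows from the dual norm: gamma_i = (b_i - a_i^T center) / ||a_i||_q with 1/p + 1/q = 1. The sample-size rule needed to certify a given violation probability is not stated in the abstract and is deliberately left out.

```python
import numpy as np

def dual_exponent(p):
    """Exponent q of the dual norm, satisfying 1/p + 1/q = 1."""
    if p == 1:
        return np.inf
    if np.isinf(p):
        return 1.0
    return p / (p - 1.0)

def scaling_factors(A, b, center, p=2):
    """Largest l_p-ball radius around `center` allowed by each sampled row.

    Row i of A and entry i of b encode one sampled constraint
    A[i] @ x <= b[i] (an illustrative assumption).
    """
    q = dual_exponent(p)
    slack = b - A @ center                       # distance to each constraint
    dual_norms = np.linalg.norm(A, ord=q, axis=1)
    return slack / dual_norms

def probabilistic_scaling(A, b, center, p=2, r=1):
    """Radius equal to the r-th smallest per-sample factor, floored at 0."""
    gammas = np.sort(scaling_factors(A, b, center, p))
    return max(float(gammas[r - 1]), 0.0)

# Hypothetical sampled constraints in two dimensions.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([1.0, 2.0, 4.0])
center = np.zeros(2)
gamma = probabilistic_scaling(A, b, center, p=2, r=2)
```

With `r=1` the scaled ball satisfies every sampled constraint; larger `r` discards the most restrictive samples and yields a larger set that the sampled constraints may violate.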
Abstract:We provide an insight into the open data resources pertinent to the study of the spread of the Covid-19 pandemic and its control. We identify the variables required to analyze fundamental aspects like seasonal behaviour, regional mortality rates, and effectiveness of government measures. Open data resources, along with data-driven methodologies, provide many opportunities to improve the response of the different administrations to the virus. We describe the present limitations and difficulties encountered in most of the open-data resources. To facilitate access to the main open-data portals and resources, we identify the most relevant institutions, at a world scale, providing Covid-19 information and/or auxiliary variables (demographics, mobility, etc.). We also describe several open resources to access Covid-19 data-sets at a country-wide level (e.g. China, Italy, Spain, France, Germany, U.S.). In an attempt to facilitate a rapid response to the study of the seasonal behaviour of Covid-19, we enumerate the main open resources in terms of weather and climate variables. CONCO-Team: The authors of this paper belong to the CONtrol COvid-19 Team, which is composed of different researchers from universities of Spain, Italy, France, Germany, United Kingdom and Argentina. The main goal of CONCO-Team is to develop data-driven methods for the better understanding and control of the pandemic.