Abstract:The zeitgeist of the digital era has been dominated by an expanding integration of Artificial Intelligence~(AI) in a plethora of applications across various domains. With this expansion, however, questions of the safety and reliability of these methods come have become more relevant than ever. Consequently, a run-time ML model safety system has been developed to ensure the model's operation within the intended context, especially in applications whose environments are greatly variable such as Autonomous Vehicles~(AVs). SafeML is a model-agnostic approach for performing such monitoring, using distance measures based on statistical testing of the training and operational datasets; comparing them to a predetermined threshold, returning a binary value whether the model should be trusted in the context of the observed data or be deemed unreliable. Although a systematic framework exists for this approach, its performance is hindered by: (1) a dependency on a number of design parameters that directly affect the selection of a safety threshold and therefore likely affect its robustness, (2) an inherent assumption of certain distributions for the training and operational sets, as well as (3) a high computational complexity for relatively large sets. This work addresses these limitations by changing the binary decision to a continuous metric. Furthermore, all data distribution assumptions are made obsolete by implementing non-parametric approaches, and the computational speed increased by introducing a new distance measure based on the Empirical Characteristics Functions~(ECF).
Abstract:Machine learning is currently undergoing an explosion in capability, popularity, and sophistication. However, one of the major barriers to widespread acceptance of machine learning (ML) is trustworthiness: most ML models operate as black boxes, their inner workings opaque and mysterious, and it can be difficult to trust their conclusions without understanding how those conclusions are reached. Explainability is therefore a key aspect of improving trustworthiness: the ability to better understand, interpret, and anticipate the behaviour of ML models. To this end, we propose SMILE, a new method that builds on previous approaches by making use of statistical distance measures to improve explainability while remaining applicable to a wide range of input data domains.
Abstract:The use of Unmanned Arial Vehicles (UAVs) offers many advantages across a variety of applications. However, safety assurance is a key barrier to widespread usage, especially given the unpredictable operational and environmental factors experienced by UAVs, which are hard to capture solely at design-time. This paper proposes a new reliability modeling approach called SafeDrones to help address this issue by enabling runtime reliability and risk assessment of UAVs. It is a prototype instantiation of the Executable Digital Dependable Identity (EDDI) concept, which aims to create a model-based solution for real-time, data-driven dependability assurance for multi-robot systems. By providing real-time reliability estimates, SafeDrones allows UAVs to update their missions accordingly in an adaptive manner.
Abstract:Machine Learning~(ML) has provided promising results in recent years across different applications and domains. However, in many cases, qualities such as reliability or even safety need to be ensured. To this end, one important aspect is to determine whether or not ML components are deployed in situations that are appropriate for their application scope. For components whose environments are open and variable, for instance those found in autonomous vehicles, it is therefore important to monitor their operational situation to determine its distance from the ML components' trained scope. If that distance is deemed too great, the application may choose to consider the ML component outcome unreliable and switch to alternatives, e.g. using human operator input instead. SafeML is a model-agnostic approach for performing such monitoring, using distance measures based on statistical testing of the training and operational datasets. Limitations in setting SafeML up properly include the lack of a systematic approach for determining, for a given application, how many operational samples are needed to yield reliable distance information as well as to determine an appropriate distance threshold. In this work, we address these limitations by providing a practical approach and demonstrate its use in a well known traffic sign recognition problem, and on an example using the CARLA open-source automotive simulator.
Abstract:Reliability estimation of Machine Learning (ML) models is becoming a crucial subject. This is particularly the case when such \mbox{models} are deployed in safety-critical applications, as the decisions based on model predictions can result in hazardous situations. In this regard, recent research has proposed methods to achieve safe, \mbox{dependable}, and reliable ML systems. One such method consists of detecting and analyzing distributional shift, and then measuring how such systems respond to these shifts. This was proposed in earlier work in SafeML. This work focuses on the use of SafeML for time series data, and on reliability and robustness estimation of ML-forecasting methods using statistical distance measures. To this end, distance measures based on the Empirical Cumulative Distribution Function (ECDF) proposed in SafeML are explored to measure Statistical-Distance Dissimilarity (SDD) across time series. We then propose SDD-based Reliability Estimate (StaDRe) and SDD-based Robustness (StaDRo) measures. With the help of a clustering technique, the similarity between the statistical properties of data seen during training and the forecasts is identified. The proposed method is capable of providing a link between dataset SDD and Key Performance Indicators (KPIs) of the ML models.
Abstract:Ensuring safety and explainability of machine learning (ML) is a topic of increasing relevance as data-driven applications venture into safety-critical application domains, traditionally committed to high safety standards that are not satisfied with an exclusive testing approach of otherwise inaccessible black-box systems. Especially the interaction between safety and security is a central challenge, as security violations can lead to compromised safety. The contribution of this paper to addressing both safety and security within a single concept of protection applicable during the operation of ML systems is active monitoring of the behaviour and the operational context of the data-driven system based on distance measures of the Empirical Cumulative Distribution Function (ECDF). We investigate abstract datasets (XOR, Spiral, Circle) and current security-specific datasets for intrusion detection (CICIDS2017) of simulated network traffic, using distributional shift detection measures including the Kolmogorov-Smirnov, Kuiper, Anderson-Darling, Wasserstein and mixed Wasserstein-Anderson-Darling measures. Our preliminary findings indicate that the approach can provide a basis for detecting whether the application context of an ML component is valid in the safety-security. Our preliminary code and results are available at https://github.com/ISorokos/SafeML.