Abstract:Search-based software testing (SBST) is a widely adopted technique for testing complex systems with large input spaces, such as Deep Learning-enabled (DL-enabled) systems. Many SBST techniques focus on Pareto-based optimization, where multiple objectives are optimized in parallel to reveal failures. However, it is important to ensure that identified failures are spread throughout the entire failure-inducing area of a search domain and not clustered in a sub-region. This ensures that identified failures are semantically diverse and reveal a wide range of underlying causes. In this paper, we present a theoretical argument explaining why testing based on Pareto optimization is inadequate for covering failure-inducing areas within a search domain. We support our argument with empirical results obtained by applying two widely used types of Pareto-based optimization techniques, namely NSGA-II (an evolutionary algorithm) and MOPSO (a swarm-based algorithm), to two DL-enabled systems: an industrial Automated Valet Parking (AVP) system and a system for classifying handwritten digits. We measure the coverage of failure-revealing test inputs in the input space using a metric that we refer to as the Coverage Inverted Distance quality indicator. Our results show that NSGA-II and MOPSO are not more effective than a na\"ive random search baseline in covering test inputs that reveal failures. The replication package for this study is available in a GitHub repository.
Abstract:An assurance case has become an integral component for the certification of safety-critical systems. While manually defining assurance case patterns can be not avoided, system-specific instantiations of assurance case patterns are both costly and time-consuming. It becomes especially complex to maintain an assurance case for a system when the requirements of the System-Under-Assurance change, or an assurance claim becomes invalid due to, e.g., degradation of a systems component, as common when deploying learning-enabled components. In this paper, we report on our preliminary experience leveraging the tool integration framework Evidential Tool Bus (ETB) for the construction and continuous maintenance of an assurance case from a predefined assurance case pattern. Specifically, we demonstrate the assurance process on an industrial Automated Valet Parking system from the automotive domain. We present the formalization of the provided assurance case pattern in the ETB processable logical specification language of workflows. Our findings show that ETB is able to create and maintain evidence required for the construction of an assurance case.
Abstract:In this paper, we present NSGA-II-SVM (Non-dominated Sorting Genetic Algorithm with Support Vector Machine Guidance), a novel learnable evolutionary and search-based testing algorithm that leverages Support Vector Machine (SVM) classification models to direct the search towards failure-revealing test inputs. Supported by genetic search, NSGA-II-SVM creates iteratively SVM-based models of the test input space, learning which regions in the search space are promising to be explored. A subsequent sampling and repetition of evolutionary search iterations allow to refine and make the model more accurate in the prediction. Our preliminary evaluation of NSGA-II-SVM by testing an Automated Valet Parking system shows that NSGA-II-SVM is more effective in identifying more critical test cases than a state of the art learnable evolutionary testing technique as well as naive random search.