Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Alberto Sangiovanni-Vincentelli

Generating Probabilistic Scenario Programs from Natural Language

May 03, 2024

Karim Elmaaroufi, Devan Shankar, Ana Cismaru, Marcell Vazquez-Chanlatte, Alberto Sangiovanni-Vincentelli, Matei Zaharia, Sanjit A. Seshia

Figure 1 for Generating Probabilistic Scenario Programs from Natural Language

Figure 2 for Generating Probabilistic Scenario Programs from Natural Language

Figure 3 for Generating Probabilistic Scenario Programs from Natural Language

Abstract:For cyber-physical systems (CPS), including robotics and autonomous vehicles, mass deployment has been hindered by fatal errors that occur when operating in rare events. To replicate rare events such as vehicle crashes, many companies have created logging systems and employed crash reconstruction experts to meticulously recreate these valuable events in simulation. However, in these methods, "what if" questions are not easily formulated and answered. We present ScenarioNL, an AI System for creating scenario programs from natural language. Specifically, we generate these programs from police crash reports. Reports normally contain uncertainty about the exact details of the incidents which we represent through a Probabilistic Programming Language (PPL), Scenic. By using Scenic, we can clearly and concisely represent uncertainty and variation over CPS behaviors, properties, and interactions. We demonstrate how commonplace prompting techniques with the best Large Language Models (LLM) are incapable of reasoning about probabilistic scenario programs and generating code for low-resource languages such as Scenic. Our system is comprised of several LLMs chained together with several kinds of prompting strategies, a compiler, and a simulator. We evaluate our system on publicly available autonomous vehicle crash reports in California from the last five years and share insights into how we generate code that is both semantically meaningful and syntactically correct.

* 17 pages, 2 figures

Via

Access Paper or Ask Questions

Beating Backdoor Attack at Its Own Game

Aug 04, 2023

Min Liu, Alberto Sangiovanni-Vincentelli, Xiangyu Yue

Figure 1 for Beating Backdoor Attack at Its Own Game

Figure 2 for Beating Backdoor Attack at Its Own Game

Figure 3 for Beating Backdoor Attack at Its Own Game

Figure 4 for Beating Backdoor Attack at Its Own Game

Abstract:Deep neural networks (DNNs) are vulnerable to backdoor attack, which does not affect the network's performance on clean data but would manipulate the network behavior once a trigger pattern is added. Existing defense methods have greatly reduced attack success rate, but their prediction accuracy on clean data still lags behind a clean model by a large margin. Inspired by the stealthiness and effectiveness of backdoor attack, we propose a simple but highly effective defense framework which injects non-adversarial backdoors targeting poisoned samples. Following the general steps in backdoor attack, we detect a small set of suspected samples and then apply a poisoning strategy to them. The non-adversarial backdoor, once triggered, suppresses the attacker's backdoor on poisoned data, but has limited influence on clean data. The defense can be carried out during data preprocessing, without any modification to the standard end-to-end training pipeline. We conduct extensive experiments on multiple benchmarks with different architectures and representative attacks. Results demonstrate that our method achieves state-of-the-art defense effectiveness with by far the lowest performance drop on clean data. Considering the surprising defense ability displayed by our framework, we call for more attention to utilizing backdoor for backdoor defense. Code is available at https://github.com/damianliumin/non-adversarial_backdoor.

* Accepted to ICCV 2023

Via

Access Paper or Ask Questions

A Grammar for the Representation of Unmanned Aerial Vehicles with 3D Topologies

Feb 27, 2023

Piergiuseppe Mallozzi, Hussein Sibai, Inigo Incer, Sanjit A. Seshia, Alberto Sangiovanni-Vincentelli

Abstract:We propose a context-sensitive grammar for the systematic exploration of the design space of the topology of 3D robots, particularly unmanned aerial vehicles. It defines production rules for adding components to an incomplete design topology modeled over a 3D grid. The rules are local. The grammar is simple, yet capable of modeling most existing UAVs as well as novel ones. It can be easily generalized to other robotic platforms. It can be thought of as a building block for any design exploration and optimization algorithm.

Via

Access Paper or Ask Questions

Contract-Based Specification Refinement and Repair for Mission Planning

Nov 21, 2022

Piergiuseppe Mallozzi, Inigo Incer, Pierluigi Nuzzo, Alberto Sangiovanni-Vincentelli

Abstract:We address the problem of modeling, refining, and repairing formal specifications for robotic missions using assume-guarantee contracts. We show how to model mission specifications at various levels of abstraction and implement them using a library of pre-implemented specifications. Suppose the specification cannot be met using components from the library. In that case, we compute a proxy for the best approximation to the specification that can be generated using elements from the library. Afterward, we propose a systematic way to either 1) search for and refine the `missing part' of the specification that the library cannot meet or 2) repair the current specification such that the existing library can refine it. Our methodology for searching and repairing mission requirements leverages the quotient, separation, composition, and merging operations between contracts.

Via

Access Paper or Ask Questions

Querying Labelled Data with Scenario Programs for Sim-to-Real Validation

Dec 01, 2021

Edward Kim, Jay Shenoy, Sebastian Junges, Daniel Fremont, Alberto Sangiovanni-Vincentelli, Sanjit Seshia

Figure 1 for Querying Labelled Data with Scenario Programs for Sim-to-Real Validation

Figure 2 for Querying Labelled Data with Scenario Programs for Sim-to-Real Validation

Figure 3 for Querying Labelled Data with Scenario Programs for Sim-to-Real Validation

Figure 4 for Querying Labelled Data with Scenario Programs for Sim-to-Real Validation

Abstract:Simulation-based testing of autonomous vehicles (AVs) has become an essential complement to road testing to ensure safety. Consequently, substantial research has focused on searching for failure scenarios in simulation. However, a fundamental question remains: are AV failure scenarios identified in simulation meaningful in reality, i.e., are they reproducible on the real system? Due to the sim-to-real gap arising from discrepancies between simulated and real sensor data, a failure scenario identified in simulation can be either a spurious artifact of the synthetic sensor data or an actual failure that persists with real sensor data. An approach to validate simulated failure scenarios is to identify instances of the scenario in a corpus of real data, and check if the failure persists on the real data. To this end, we propose a formal definition of what it means for a labelled data item to match an abstract scenario, encoded as a scenario program using the SCENIC probabilistic programming language. Using this definition, we develop a querying algorithm which, given a scenario program and a labelled dataset, finds the subset of data matching the scenario. Experiments demonstrate that our algorithm is accurate and efficient on a variety of realistic traffic scenarios, and scales to a reasonable number of agents.

* pre-print

Via

Access Paper or Ask Questions

Class-wise Thresholding for Detecting Out-of-Distribution Data

Nov 24, 2021

Matteo Guarrera, Baihong Jin, Tung-Wei Lin, Maria Zuluaga, Yuxin Chen, Alberto Sangiovanni-Vincentelli

Figure 1 for Class-wise Thresholding for Detecting Out-of-Distribution Data

Figure 2 for Class-wise Thresholding for Detecting Out-of-Distribution Data

Figure 3 for Class-wise Thresholding for Detecting Out-of-Distribution Data

Figure 4 for Class-wise Thresholding for Detecting Out-of-Distribution Data

Abstract:We consider the problem of detecting OoD(Out-of-Distribution) input data when using deep neural networks, and we propose a simple yet effective way to improve the robustness of several popular OoD detection methods against label shift. Our work is motivated by the observation that most existing OoD detection algorithms consider all training/test data as a whole, regardless of which class entry each input activates (inter-class differences). Through extensive experimentation, we have found that such practice leads to a detector whose performance is sensitive and vulnerable to label shift. To address this issue, we propose a class-wise thresholding scheme that can apply to most existing OoD detection algorithms and can maintain similar OoD detection performance even in the presence of label shift in the test distribution.

* 12 pages, 7 figures, 7 tables

Via

Access Paper or Ask Questions

Conditional Synthetic Data Generation for Robust Machine Learning Applications with Limited Pandemic Data

Sep 14, 2021

Hari Prasanna Das, Ryan Tran, Japjot Singh, Xiangyu Yue, Geoff Tison, Alberto Sangiovanni-Vincentelli, Costas J. Spanos

Figure 1 for Conditional Synthetic Data Generation for Robust Machine Learning Applications with Limited Pandemic Data

Figure 2 for Conditional Synthetic Data Generation for Robust Machine Learning Applications with Limited Pandemic Data

Figure 3 for Conditional Synthetic Data Generation for Robust Machine Learning Applications with Limited Pandemic Data

Figure 4 for Conditional Synthetic Data Generation for Robust Machine Learning Applications with Limited Pandemic Data

Abstract:$\textbf{Background:}$ At the onset of a pandemic, such as COVID-19, data with proper labeling/attributes corresponding to the new disease might be unavailable or sparse. Machine Learning (ML) models trained with the available data, which is limited in quantity and poor in diversity, will often be biased and inaccurate. At the same time, ML algorithms designed to fight pandemics must have good performance and be developed in a time-sensitive manner. To tackle the challenges of limited data, and label scarcity in the available data, we propose generating conditional synthetic data, to be used alongside real data for developing robust ML models. $\textbf{Methods:}$ We present a hybrid model consisting of a conditional generative flow and a classifier for conditional synthetic data generation. The classifier decouples the feature representation for the condition, which is fed to the flow to extract the local noise. We generate synthetic data by manipulating the local noise with fixed conditional feature representation. We also propose a semi-supervised approach to generate synthetic samples in the absence of labels for a majority of the available data. $\textbf{Results:}$ We performed conditional synthetic generation for chest computed tomography (CT) scans corresponding to normal, COVID-19, and pneumonia afflicted patients. We show that our method significantly outperforms existing models both on qualitative and quantitative performance, and our semi-supervised approach can efficiently synthesize conditional samples under label scarcity. As an example of downstream use of synthetic data, we show improvement in COVID-19 detection from CT scans with conditional synthetic data augmentation.

Via

Access Paper or Ask Questions

A Hierarchical State-Machine-Based Framework for Platoon Manoeuvre Descriptions

Apr 12, 2021

Corvin Deboeser, Jordan Ivanchev, Thomas Braud, Alois Knoll, David Eckhoff, Alberto Sangiovanni-Vincentelli

Figure 1 for A Hierarchical State-Machine-Based Framework for Platoon Manoeuvre Descriptions

Figure 2 for A Hierarchical State-Machine-Based Framework for Platoon Manoeuvre Descriptions

Figure 3 for A Hierarchical State-Machine-Based Framework for Platoon Manoeuvre Descriptions

Figure 4 for A Hierarchical State-Machine-Based Framework for Platoon Manoeuvre Descriptions

Abstract:This paper introduces the SEAD framework that simplifies the process of designing and describing autonomous vehicle platooning manoeuvres. Although a large body of research has been formulating platooning manoeuvres, it is still challenging to design, describe, read, and understand them. This difficulty largely arises from missing formalisation. To fill this gap, we analysed existing ways of describing manoeuvres, derived the causes of difficulty, and designed a framework that simplifies the manoeuvre design process. Alongside, a Manoeuvre Design Language was developed to structurally describe manoeuvres in a machine-readable format. Unlike state-of-the-art manoeuvre descriptions that require one state machine for every participating vehicle, the SEAD framework allows describing any manoeuvre from the single perspective of the platoon leader. %As a proof of concept, the proposed framework was implemented in the mixed traffic simulation environment BEHAVE for an autonomous highway scenario. Using this framework, we implemented several manoeuvres as they were described in literature. To demonstrate the applicability of the framework, an experiment was performed to evaluate the execution time performance of multiple alternatives of the Join-Middle manoeuvre. This proof-of-concept experiment revealed that the manoeuvre execution time can be reduced by 28 \% through parallelising various steps without considerable secondary effects. We hope that the SEAD framework will pave the way for further research in the area of new manoeuvre design and optimisation by largely simplifying and unifying platooning manoeuvre representation.

Via

Access Paper or Ask Questions

A Customizable Dynamic Scenario Modeling and Data Generation Platform for Autonomous Driving

Nov 30, 2020

Jay Shenoy, Edward Kim, Xiangyu Yue, Taesung Park, Daniel Fremont, Alberto Sangiovanni-Vincentelli, Sanjit Seshia

Figure 1 for A Customizable Dynamic Scenario Modeling and Data Generation Platform for Autonomous Driving

Figure 2 for A Customizable Dynamic Scenario Modeling and Data Generation Platform for Autonomous Driving

Figure 3 for A Customizable Dynamic Scenario Modeling and Data Generation Platform for Autonomous Driving

Abstract:Safely interacting with humans is a significant challenge for autonomous driving. The performance of this interaction depends on machine learning-based modules of an autopilot, such as perception, behavior prediction, and planning. These modules require training datasets with high-quality labels and a diverse range of realistic dynamic behaviors. Consequently, training such modules to handle rare scenarios is difficult because they are, by definition, rarely represented in real-world datasets. Hence, there is a practical need to augment datasets with synthetic data covering these rare scenarios. In this paper, we present a platform to model dynamic and interactive scenarios, generate the scenarios in simulation with different modalities of labeled sensor data, and collect this information for data augmentation. To our knowledge, this is the first integrated platform for these tasks specialized to the autonomous driving domain.

Via

Access Paper or Ask Questions

Augmenting Monte Carlo Dropout Classification Models with Unsupervised Learning Tasks for Detecting and Diagnosing Out-of-Distribution Faults

Sep 10, 2019

Baihong Jin, Yingshui Tan, Yuxin Chen, Alberto Sangiovanni-Vincentelli

Figure 1 for Augmenting Monte Carlo Dropout Classification Models with Unsupervised Learning Tasks for Detecting and Diagnosing Out-of-Distribution Faults

Figure 2 for Augmenting Monte Carlo Dropout Classification Models with Unsupervised Learning Tasks for Detecting and Diagnosing Out-of-Distribution Faults

Figure 3 for Augmenting Monte Carlo Dropout Classification Models with Unsupervised Learning Tasks for Detecting and Diagnosing Out-of-Distribution Faults

Figure 4 for Augmenting Monte Carlo Dropout Classification Models with Unsupervised Learning Tasks for Detecting and Diagnosing Out-of-Distribution Faults

Abstract:The Monte Carlo dropout method has proved to be a scalable and easy-to-use approach for estimating the uncertainty of deep neural network predictions. This approach was recently applied to Fault Detection and Di-agnosis (FDD) applications to improve the classification performance on incipient faults. In this paper, we propose a novel approach of augmenting the classification model with an additional unsupervised learning task. We justify our choice of algorithm design via an information-theoretical analysis. Our experimental results on three datasets from diverse application domains show that the proposed method leads to improved fault detection and diagnosis performance, especially on out-of-distribution examples including both incipient and unknown faults.

Via

Access Paper or Ask Questions