Abstract:Generating adversarial safety-critical scenarios is a pivotal method for testing autonomous driving systems, as it identifies potential weaknesses and enhances system robustness and reliability. However, existing approaches predominantly emphasize unrestricted collision scenarios, prompting non-player character (NPC) vehicles to attack the ego vehicle indiscriminately. These works overlook these scenarios' authenticity, rationality, and relevance, resulting in numerous extreme, contrived, and largely unrealistic collision events involving aggressive NPC vehicles. To rectify this issue, we propose a three-layer relative safety region model, which partitions the area based on danger levels and increases the likelihood of NPC vehicles entering relative boundary regions. This model directs NPC vehicles to engage in adversarial actions within relatively safe boundary regions, thereby augmenting the scenarios' authenticity. We introduce AuthSim, a comprehensive platform for generating authentic and effective safety-critical scenarios by integrating the three-layer relative safety region model with reinforcement learning. To our knowledge, this is the first attempt to address the authenticity and effectiveness of autonomous driving system test scenarios comprehensively. Extensive experiments demonstrate that AuthSim outperforms existing methods in generating effective safety-critical scenarios. Notably, AuthSim achieves a 5.25% improvement in average cut-in distance and a 27.12% enhancement in average collision interval time, while maintaining higher efficiency in generating effective safety-critical scenarios compared to existing methods. This underscores its significant advantage in producing authentic scenarios over current methodologies.
Abstract:Machine learning has been widely used in healthcare applications to approximate complex models, for clinical diagnosis, prognosis, and treatment. As deep learning has the outstanding ability to extract information from time series, its true capabilities on sparse, irregularly sampled, multivariate, and imbalanced physiological data are not yet fully explored. In this paper, we systematically examine the performance of machine learning models for the clinical prediction task based on the EHR, especially physiological time series. We choose Physionet 2019 challenge public dataset to predict Sepsis outcomes in ICU units. Ten baseline machine learning models are compared, including 3 deep learning methods and 7 non-deep learning methods, commonly used in the clinical prediction domain. Nine evaluation metrics with specific clinical implications are used to assess the performance of models. Besides, we sub-sample training dataset sizes and use learning curve fit to investigate the impact of the training dataset size on the performance of the machine learning models. We also propose the general pre-processing method for the physiology time-series data and use Dice Loss to deal with the dataset imbalanced problem. The results show that deep learning indeed outperforms non-deep learning, but with certain conditions: firstly, evaluating with some particular evaluation metrics (AUROC, AUPRC, Sensitivity, and FNR), but not others; secondly, the training dataset size is large enough (with an estimation of a magnitude of thousands).