Abstract: For autonomous vehicles, safe navigation in complex environments depends on handling a broad range of diverse and rare driving scenarios. Simulation- and scenario-based testing have emerged as key approaches to the development and validation of autonomous driving systems. Traditional scenario generation relies on rule-based systems, knowledge-driven models, and data-driven synthesis, often producing limited diversity and unrealistic safety-critical cases. With the emergence of foundation models, a new generation of pre-trained, general-purpose AI models, developers can process heterogeneous inputs (e.g., natural language, sensor data, HD maps, and control actions), enabling the synthesis and interpretation of complex driving scenarios. In this paper, we survey the application of foundation models to scenario generation and scenario analysis in autonomous driving (as of May 2025). Our survey presents a unified taxonomy that includes large language models, vision-language models, multimodal large language models, diffusion models, and world models for the generation and analysis of autonomous driving scenarios. In addition, we review the methodologies, open-source datasets, simulation platforms, and benchmark challenges, and we examine the evaluation metrics tailored explicitly to scenario generation and analysis. Finally, the survey highlights open challenges and research questions and outlines promising future research directions. All reviewed papers are listed in a continuously maintained repository, which contains supplementary materials and is available at https://github.com/TUM-AVS/FM-for-Scenario-Generation-Analysis.
Abstract: This study investigates the impact of drivers' initial contact with an SAE Level 3 Automated Driving System (ADS) under real traffic conditions, focusing on the Mercedes-Benz Drive Pilot in the EQS. It examines Acceptance, Trust, Usability, and User Experience. Although previous studies in simulated environments provided insights into human-automation interaction, real-world experiences can differ significantly. The research was conducted on a segment of German interstate with 30 participants lacking familiarity with Level 3 ADS. Pre- and post-driving questionnaires were used to assess changes in acceptance and confidence. Supplementary metrics included post-driving ratings for usability and user experience. Findings reveal a significant increase in acceptance and trust following the first contact, confirming results from prior simulator studies. Factors such as Performance Expectancy, Effort Expectancy, Facilitating Condition, Self-Efficacy, and Behavioral Intention to use the vehicle were rated higher after initial contact with the ADS. However, inadequate communication from the ADS to the human driver was detected, highlighting the need for improved communication to prevent misuse or confusion about the operating mode. Contrary to prior research, we found no significant impact of general attitudes towards technological innovation on acceptance and trust. However, most participants already had a high affinity for technology. Although overall reception was positive and showed an upward trend after first contact, the ADS was also perceived to be as demanding as manual driving. Future research should focus on a more diverse participant sample and include longer or multiple real-traffic trips to understand behavioral adaptations over time.
Abstract: To evaluate perception components of an automated driving system, it is necessary to define the relevant objects. While the urban domain is popular among perception datasets, relevance is insufficiently specified for this domain. Therefore, this work adopts an existing method to define relevance in the highway domain and expands it to the urban domain. While different conceptualizations and definitions of relevance are present in the literature, there is a lack of methods to validate these definitions. Therefore, this work presents a novel relevance validation method leveraging a motion prediction component. The validation rests on the idea that removing irrelevant objects should not influence a prediction component that reflects human driving behavior. The influence on the prediction is quantified by considering the statistical distribution of prediction performance across a large-scale dataset. The validation procedure is verified using criteria specifically designed to exclude relevant objects. The validation method is successfully applied to the relevance criteria from this work, thus supporting their validity.
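The validation idea in the abstract above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `toy_predict` is a hypothetical stand-in for a motion prediction component, the scene format and the 50 m and 30 m thresholds are invented, and the abstract does not specify the actual predictor, dataset, or statistics used.

```python
FREE_SPEED = 30.0  # m/s, hypothetical free-flow speed of the toy predictor


def toy_predict(objects):
    """Hypothetical predictor: ego follows the slowest object within 50 m."""
    near = [o["speed"] for o in objects if o["gap"] <= 50.0]
    return min(near, default=FREE_SPEED)


def error_increase(scenes, predict, is_relevant):
    """Per-scene increase in absolute prediction error after removing
    every object that the criterion `is_relevant` judges irrelevant."""
    deltas = []
    for objects, true_speed in scenes:
        kept = [o for o in objects if is_relevant(o)]
        full_err = abs(predict(objects) - true_speed)
        kept_err = abs(predict(kept) - true_speed)
        deltas.append(kept_err - full_err)
    return deltas


# Each scene: (surrounding objects, observed ego speed in m/s). Invented data.
scenes = [
    ([{"gap": 20.0, "speed": 22.0}, {"gap": 120.0, "speed": 10.0}], 22.0),
    ([{"gap": 45.0, "speed": 25.0}], 25.0),
    ([{"gap": 200.0, "speed": 15.0}], 30.0),
]

# A criterion that keeps everything the predictor reacts to.
sound = error_increase(scenes, toy_predict, lambda o: o["gap"] <= 50.0)
# A criterion deliberately built to drop a relevant object (gap of 45 m).
broken = error_increase(scenes, toy_predict, lambda o: o["gap"] <= 30.0)
```

With the sound criterion the error distribution is unchanged, while the deliberately broken criterion raises the error in the scene where a relevant object was removed. This mirrors, in toy form, how criteria designed to exclude relevant objects can be used to check that the validation procedure itself is sensitive.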
Abstract: Efficient testing strategies are a core challenge that must be overcome for the release of automated driving. This necessitates clear requirements as well as suitable methods for testing. In this work, the requirements for perception modules are considered with respect to relevance. The concept of relevance currently remains insufficiently defined and specified. In this paper, we propose a novel methodology to overcome this challenge and demonstrate it by application to collision safety in the highway domain. Using this general system and use case specification, a corresponding concept for relevance is derived. Irrelevant objects are thus defined as objects that do not limit the set of safe actions available to the ego vehicle when all uncertainties are taken into account. As an initial step, the use case is decomposed into functional scenarios with respect to collision relevance. For each functional scenario, possible actions of both the ego vehicle and any other dynamic object are formalized as equations. This set of possible actions is constrained by traffic rules, yielding relevance criteria. As a result, we present a conservative estimate of which dynamic objects are relevant for perception and need to be considered for a complete evaluation. The estimate provides requirements that are applicable to offline testing and validation of perception components. A visualization is presented for examples from the highD dataset, showing the plausibility of the results. Finally, an approach for future validation of the presented relevance concept is outlined.
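To make the notion of a conservative relevance criterion concrete, the following is a minimal sketch for a single functional scenario: a lead vehicle in the ego lane. It is not the paper's formalization. The stopping-distance formula, the function names, and all parameter values (reaction time, braking decelerations) are assumptions chosen for illustration, and measurement uncertainty is ignored.

```python
def required_gap(v_ego, v_lead, reaction_time=1.0, ego_brake=4.0, lead_brake=8.0):
    """Gap in metres that ego needs in order to stop behind a lead vehicle
    that brakes at `lead_brake` (m/s^2). Speeds are in m/s."""
    ego_stop = v_ego * reaction_time + v_ego ** 2 / (2.0 * ego_brake)
    lead_stop = v_lead ** 2 / (2.0 * lead_brake)
    return max(0.0, ego_stop - lead_stop)


def is_relevant_lead(gap, v_lead, speed_limit, **params):
    """Conservative check: assume ego may drive as fast as the speed limit
    allows, so the lead is relevant whenever it could restrict any
    rule-compliant ego action."""
    return gap < required_gap(speed_limit, v_lead, **params)


# Lead at 20 m/s under a 30 m/s speed limit (hypothetical values).
near_lead = is_relevant_lead(gap=100.0, v_lead=20.0, speed_limit=30.0)
far_lead = is_relevant_lead(gap=150.0, v_lead=20.0, speed_limit=30.0)
```

Bounding the ego speed by the speed limit plays the role of the traffic-rule constraint: it overestimates the gap ego might need, so the check errs toward labelling objects as relevant rather than wrongly discarding them.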