Abstract:Multimodal large language models (MLLMs) hold the potential to enhance autonomous driving by combining domain-independent world knowledge with context-specific language guidance. Their integration into autonomous driving systems shows promising results in isolated proof-of-concept applications, while their performance is evaluated on selective singular aspects of perception, reasoning, or planning. To leverage their full potential a systematic framework for evaluating MLLMs in the context of autonomous driving is required. This paper proposes a holistic framework for a capability-driven evaluation of MLLMs in autonomous driving. The framework structures scenario understanding along the four core capability dimensions semantic, spatial, temporal, and physical. They are derived from the general requirements of autonomous driving systems, human driver cognition, and language-based reasoning. It further organises the domain into context layers, processing modalities, and downstream tasks such as language-based interaction and decision-making. To illustrate the framework's applicability, two exemplary traffic scenarios are analysed, grounding the proposed dimensions in realistic driving situations. The framework provides a foundation for the structured evaluation of MLLMs' potential for scenario understanding in autonomous driving.
Abstract:During the use of Advanced Driver Assistance Systems (ADAS), drivers can intervene in the active function and take back control due to various reasons. However, the specific reasons for driver-initiated takeovers in naturalistic driving are still not well understood. In order to get more information on the reasons behind these takeovers, a test group study was conducted. There, 17 participants used a predictive longitudinal driving function for their daily commutes and annotated the reasons for their takeovers during active function use. In this paper, the recorded takeovers are analyzed and the different reasons for them are highlighted. The results show that the reasons can be divided into three main categories. The most common category consists of takeovers which aim to adjust the behavior of the ADAS within its Operational Design Domain (ODD) in order to better match the drivers' personal preferences. Other reasons include takeovers due to leaving the ADAS's ODD and corrections of incorrect sensing state information. Using the questionnaire results of the test group study, it was found that the number and frequency of takeovers especially within the ADAS's ODD have a significant negative impact on driver satisfaction. Therefore, the driver satisfaction with the ADAS could be increased by adapting its behavior to the drivers' wishes and thereby lowering the number of takeovers within the ODD. The information contained in the takeover behavior of the drivers could be used as feedback for the ADAS. Finally, it is shown that there are considerable differences in the takeover behavior of different drivers, which shows a need for ADAS individualization.