Abstract:Today's advanced automotive systems are turning into intelligent Cyber-Physical Systems (CPS), bringing computational intelligence to their cyber-physical context. Such systems power advanced driver assistance systems (ADAS) that observe a vehicle's surroundings for their functionality. However, such ADAS have clear limitations in scenarios when the direct line-of-sight to surrounding objects is occluded, like in urban areas. Imagine now automated driving (AD) systems that ideally could benefit from other vehicles' field-of-view in such occluded situations to increase traffic safety if, for example, locations about pedestrians can be shared across vehicles. Current literature suggests vehicle-to-infrastructure (V2I) via roadside units (RSUs) or vehicle-to-vehicle (V2V) communication to address such issues that stream sensor or object data between vehicles. When considering the ongoing revolution in vehicle system architectures towards powerful, centralized processing units with hardware accelerators, foreseeing the onboard presence of large language models (LLMs) to improve the passengers' comfort when using voice assistants becomes a reality. We are suggesting and evaluating a concept to complement the ego vehicle's field-of-view (FOV) with another vehicle's FOV by tapping into their onboard LLM to let the machines have a dialogue about what the other vehicle ``sees''. Our results show that very recent versions of LLMs, such as GPT-4V and GPT-4o, understand a traffic situation to an impressive level of detail, and hence, they can be used even to spot traffic participants. However, better prompts are needed to improve the detection quality and future work is needed towards a standardised message interchange format between vehicles.
Abstract:Today's advanced driver assistance systems (ADAS), like adaptive cruise control or rear collision warning, are finding broader adoption across vehicle classes. Integrating such advanced, multimodal Large Language Models (LLMs) on board a vehicle, which are capable of processing text, images, audio, and other data types, may have the potential to greatly enhance passenger comfort. Yet, an LLM's hallucinations are still a major challenge to be addressed. In this paper, we systematically assessed potential hallucination detection strategies for such LLMs in the context of object detection in vision-based data on the example of pedestrian detection and localization. We evaluate three hallucination detection strategies applied to two state-of-the-art LLMs, the proprietary GPT-4V and the open LLaVA, on two datasets (Waymo/US and PREPER CITY/Sweden). Our results show that these LLMs can describe a traffic situation to an impressive level of detail but are still challenged for further analysis activities such as object localization. We evaluate and extend hallucination detection approaches when applying these LLMs to video sequences in the example of pedestrian detection. Our experiments show that, at the moment, the state-of-the-art proprietary LLM performs much better than the open LLM. Furthermore, consistency enhancement techniques based on voting, such as the Best-of-Three (BO3) method, do not effectively reduce hallucinations in LLMs that tend to exhibit high false negatives in detecting pedestrians. However, extending the hallucination detection by including information from the past helps to improve results.
Abstract:This action research study focuses on the integration of "AI assistants" in two Agile software development meetings: the Daily Scrum and a feature refinement, a planning meeting that is part of an in-house Scaled Agile framework. We discuss the critical drivers of success, and establish a link between the use of AI and team collaboration dynamics. We conclude with a list of lessons learnt during the interventions in an industrial context, and provide a assessment checklist for companies and teams to reflect on their readiness level. This paper is thus a road-map to facilitate the integration of AI tools in Agile setups.
Abstract:Changes and updates in the requirement artifacts, which can be frequent in the automotive domain, are a challenge for SafetyOps. Large Language Models (LLMs), with their impressive natural language understanding and generating capabilities, can play a key role in automatically refining and decomposing requirements after each update. In this study, we propose a prototype of a pipeline of prompts and LLMs that receives an item definition and outputs solutions in the form of safety requirements. This pipeline also performs a review of the requirement dataset and identifies redundant or contradictory requirements. We first identified the necessary characteristics for performing HARA and then defined tests to assess an LLM's capability in meeting these criteria. We used design science with multiple iterations and let experts from different companies evaluate each cycle quantitatively and qualitatively. Finally, the prototype was implemented at a case company and the responsible team evaluated its efficiency.
Abstract:DevOps is a necessity in many industries, including the development of Autonomous Vehicles. In those settings, there are iterative activities that reduce the speed of SafetyOps cycles. One of these activities is "Hazard Analysis & Risk Assessment" (HARA), which is an essential step to start the safety requirements specification. As a potential approach to increase the speed of this step in SafetyOps, we have delved into the capabilities of Large Language Models (LLMs). Our objective is to systematically assess their potential for application in the field of safety engineering. To that end, we propose a framework to support a higher degree of automation of HARA with LLMs. Despite our endeavors to automate as much of the process as possible, expert review remains crucial to ensure the validity and correctness of the analysis results, with necessary modifications made accordingly.
Abstract:Recent Generative Artificial Intelligence (GenAI) trends focus on various applications, including creating stories, illustrations, poems, articles, computer code, music compositions, and videos. Extrinsic hallucinations are a critical limitation of such GenAI, which can lead to significant challenges in achieving and maintaining the trustworthiness of GenAI. In this paper, we propose two new concepts that we believe will aid the research community in addressing limitations associated with the application of GenAI models. First, we propose a definition for the "desirability" of GenAI outputs and three factors which are observed to influence it. Second, drawing inspiration from Martin Fowler's code smells, we propose the concept of "prompt smells" and the adverse effects they are observed to have on the desirability of GenAI outputs. We expect our work will contribute to the ongoing conversation about the desirability of GenAI outputs and help advance the field in a meaningful way.
Abstract:The explosion of content generated by Artificial Intelligence models has initiated a cultural shift in arts, music, and media, where roles are changing, values are shifting, and conventions are challenged. The readily available, vast dataset of the internet has created an environment for AI models to be trained on any content on the web. With AI models shared openly, and used by many, globally, how does this new paradigm shift challenge the status quo in artistic practices? What kind of changes will AI technology bring into music, arts, and new media?
Abstract:Natural Language Generation tools, such as chatbots that can generate human-like conversational text, are becoming more common both for personal and professional use. However, there are concerns about their trustworthiness and ethical implications. The paper addresses the problem of understanding how different users (e.g., linguists, engineers) perceive and adopt these tools and their perception of machine-generated text quality. It also discusses the perceived advantages and limitations of Natural Language Generation tools, as well as users' beliefs on governance strategies. The main findings of this study include the impact of users' field and level of expertise on the perceived trust and adoption of Natural Language Generation tools, the users' assessment of the accuracy, fluency, and potential biases of machine-generated text in comparison to human-written text, and an analysis of the advantages and ethical risks associated with these tools as identified by the participants. Moreover, this paper discusses the potential implications of these findings for enhancing the AI development process. The paper sheds light on how different user characteristics shape their beliefs on the quality and overall trustworthiness of machine-generated text. Furthermore, it examines the benefits and risks of these tools from the perspectives of different users.
Abstract:Complying with the EU AI Act (AIA) guidelines while developing and implementing AI systems will soon be mandatory within the EU. However, practitioners lack actionable instructions to operationalise ethics during AI systems development. A literature review of different ethical guidelines revealed inconsistencies in the principles addressed and the terminology used to describe them. Furthermore, requirements engineering (RE), which is identified to foster trustworthiness in the AI development process from the early stages was observed to be absent in a lot of frameworks that support the development of ethical and trustworthy AI. This incongruous phrasing combined with a lack of concrete development practices makes trustworthy AI development harder. To address this concern, we formulated a comparison table for the terminology used and the coverage of the ethical AI principles in major ethical AI guidelines. We then examined the applicability of ethical AI development frameworks for performing effective RE during the development of trustworthy AI systems. A tertiary review and meta-analysis of literature discussing ethical AI frameworks revealed their limitations when developing trustworthy AI. Based on our findings, we propose recommendations to address such limitations during the development of trustworthy AI.
Abstract:This study explores the benefits and challenges of integrating Artificial Intelligence with Agile software development methodologies, focusing on improving continuous integration and delivery. A systematic literature review and longitudinal meta-analysis of the retrieved studies was conducted to analyse the role of Artificial Intelligence and it's future applications within Agile software development. The review helped identify critical challenges, such as the need for specialised socio-technical expertise. While Artificial Intelligence holds promise for improved software development practices, further research is needed to better understand its impact on processes and practitioners, and to address the indirect challenges associated with its implementation.