Abstract: This research investigates the impact of social robot participation in group conversations and assesses the effectiveness of various addressing policies. The study involved 300 participants, divided into groups of four, interacting with a humanoid robot serving as the moderator. The robot utilized conversation data to determine the most appropriate speaker to address. The findings indicate that the robot's addressing policy significantly influenced conversation dynamics, resulting in more balanced attention to each participant and a reduction in subgroup formation.
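A minimal sketch of one plausible addressing policy of the kind this abstract describes (the study's actual policy is not specified here): the robot addresses the participant who has so far received the least attention, which tends to balance participation. All names, fields, and weights below are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class ParticipantState:
    speaking_time: float = 0.0   # seconds this participant has spoken so far
    times_addressed: int = 0     # how often the robot has addressed them

def choose_addressee(states: dict) -> str:
    """Pick the next participant for the robot to address.

    Illustrative balancing heuristic (not the study's policy): prefer whoever
    has spoken least and been addressed least, so attention stays balanced
    and side conversations (subgroups) are less likely to form.
    """
    def attention_received(s: ParticipantState) -> float:
        return s.speaking_time + 10.0 * s.times_addressed  # invented weighting
    return min(states, key=lambda name: attention_received(states[name]))

# Example with a group of four participants
group = {
    "Alice": ParticipantState(speaking_time=42.0, times_addressed=3),
    "Bob":   ParticipantState(speaking_time=15.0, times_addressed=1),
    "Carol": ParticipantState(speaking_time=60.0, times_addressed=4),
    "Dan":   ParticipantState(speaking_time=12.0, times_addressed=2),
}
print(choose_addressee(group))  # -> "Bob" under this weighting
```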
Abstract: The ROS 2 Navigation Stack (Nav2) has emerged as a widely used software component providing the underlying basis to develop a variety of high-level functionalities. However, when used in outdoor environments such as orchards and vineyards, its functionality is notably limited by the presence of obstacles and/or situations not commonly found in indoor settings. One such example is tall grass and weeds that can be safely traversed by a robot but are perceived as obstacles by LiDAR sensors, forcing the robot to take longer paths to avoid them or to abort navigation altogether. To overcome these limitations, domain-specific extensions must be developed and integrated into the software pipeline. This paper presents a new, lightweight approach to address this challenge and improve outdoor robot navigation. Leveraging the multi-scale nature of the costmaps supporting Nav2, we developed a system that uses a depth camera to perform pixel-level classification on the images and, in real time, injects corrections into the local costmap, thus enabling the robot to traverse areas that would otherwise be avoided by Nav2. Our approach has been implemented and validated on a Clearpath Husky, and we demonstrate that with this extension the robot is able to perform navigation tasks that would otherwise not be practical with the standard components.
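A minimal rclpy sketch, under assumptions, of the kind of costmap correction injection described above: a node receives a binary "traversable vegetation" mask (produced elsewhere by pixel-level classification of the depth-camera image and assumed to be already projected onto the costmap grid), clears the corresponding cells, and republishes the grid. Topic names, the mask representation, and the republish strategy are illustrative assumptions, not the paper's implementation (which would more naturally live inside a custom Nav2 costmap layer).

```python
import numpy as np
import rclpy
from rclpy.node import Node
from nav_msgs.msg import OccupancyGrid

class TraversableVegetationFilter(Node):
    """Clears costmap cells that pixel-level classification marked as traversable."""

    def __init__(self):
        super().__init__("traversable_vegetation_filter")
        self.mask = None  # boolean grid aligned with the local costmap (assumed given)
        # Hypothetical topics: the real system would hook into Nav2's costmap layers.
        self.create_subscription(OccupancyGrid, "/traversable_mask", self.on_mask, 1)
        self.create_subscription(OccupancyGrid, "/local_costmap/costmap", self.on_costmap, 1)
        self.pub = self.create_publisher(OccupancyGrid, "/local_costmap/costmap_filtered", 1)

    def on_mask(self, msg: OccupancyGrid):
        # Cells > 0 mean "looks like an obstacle but is actually traversable grass".
        self.mask = np.array(msg.data, dtype=np.int8) > 0

    def on_costmap(self, msg: OccupancyGrid):
        if self.mask is None or self.mask.size != len(msg.data):
            self.pub.publish(msg)  # nothing to correct yet
            return
        grid = np.array(msg.data, dtype=np.int8)
        grid[self.mask] = 0  # inject the correction: mark traversable cells as free
        msg.data = grid.tolist()
        self.pub.publish(msg)

def main():
    rclpy.init()
    rclpy.spin(TraversableVegetationFilter())

if __name__ == "__main__":
    main()
```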
Abstract: Co-speech gesture generation on artificial agents has gained attention recently, mainly when it is based on data-driven models. However, end-to-end methods often fail to generate co-speech gestures related to semantics with specific forms, i.e., Symbolic and Deictic gestures. In this work, we identify which words in a sentence are contextually related to Symbolic and Deictic gestures. First, we selected 12 gestures recognized by people from the Italian culture that different humanoid robots can reproduce. Then, we implemented two rule-based algorithms to label sentences with Symbolic and Deictic gestures. The rules depend on the semantic similarity scores computed with the RoBERTa model between sentences that heuristically represent gestures and sub-sentences inside a target sentence that artificial agents have to pronounce. We also implemented a baseline algorithm that assigns gestures without computing similarity scores. Finally, to validate the results, we asked 30 participants to label a set of sentences with Deictic and Symbolic gestures through a Graphical User Interface (GUI), and we compared their labels with the ones produced by our algorithms. To this end, we computed Average Precision (AP) and Intersection over Union (IoU) scores, and we evaluated the Average Computational Time (ACT). Our results show that semantic similarity scores are useful for finding Symbolic and Deictic gestures in utterances.
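A minimal sketch, under assumptions, of the similarity-based labeling idea: each gesture is represented by a heuristic sentence, sub-sentences of the target utterance are embedded, and a gesture is assigned wherever cosine similarity exceeds a threshold. The encoder name, gesture sentences, comma-based sub-sentence split, and threshold are illustrative; the paper's rule-based algorithms and RoBERTa setup are more elaborate.

```python
from sentence_transformers import SentenceTransformer, util

# Any RoBERTa-based sentence encoder would do; this model name is an assumption.
model = SentenceTransformer("sentence-transformers/all-distilroberta-v1")

# Heuristic sentences that "represent" two example gestures (illustrative only).
gesture_sentences = {
    "DEICTIC_POINT_SELF": "I am talking about myself.",
    "SYMBOLIC_OK": "Everything is fine, I agree.",
}

def label_gestures(utterance: str, threshold: float = 0.5):
    """Assign gestures to sub-sentences whose embedding is close to a gesture sentence."""
    sub_sentences = [s.strip() for s in utterance.split(",") if s.strip()]
    sub_emb = model.encode(sub_sentences, convert_to_tensor=True)
    labels = []
    for gesture, sentence in gesture_sentences.items():
        g_emb = model.encode(sentence, convert_to_tensor=True)
        scores = util.cos_sim(g_emb, sub_emb)[0]
        for sub, score in zip(sub_sentences, scores):
            if float(score) >= threshold:
                labels.append((sub, gesture, float(score)))
    return labels

print(label_gestures("I think this is about me, and yes I agree with you"))
```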
Abstract: This paper presents a system for diversity-aware autonomous conversation leveraging the capabilities of large language models (LLMs). The system adapts to diverse populations and individuals, considering factors like background, personality, age, gender, and culture. The conversation flow is guided by the structure of the system's pre-established knowledge base, while LLMs are tasked with various functions, including generating diversity-aware sentences. Achieving diversity-awareness involves providing carefully crafted prompts to the models, incorporating comprehensive information about users, conversation history, contextual details, and specific guidelines. To assess the system's performance, we conducted both controlled and real-world experiments, measuring a wide range of performance indicators.
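A minimal sketch, under assumptions, of how a diversity-aware prompt of the kind described might be assembled from a user profile, conversation history, and guidelines and sent to an LLM. The profile fields, guideline wording, and model name are placeholders and do not reflect the paper's actual prompt templates.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def diversity_aware_reply(profile: dict, history: list, topic: str) -> str:
    """Compose a prompt with user information, history, context, and guidelines."""
    system_prompt = (
        "You are a social robot holding a conversation. Adapt tone, vocabulary and "
        f"examples to this user profile: {profile}. Follow these guidelines: be "
        "inclusive, avoid stereotypes, and respect the user's background and preferences."
    )
    messages = [{"role": "system", "content": system_prompt}]
    messages += history  # prior turns, e.g. [{"role": "user", "content": "..."}]
    messages.append({"role": "user", "content": f"Let's talk about {topic}."})
    response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return response.choices[0].message.content

profile = {"age": 72, "language": "Italian", "interests": ["gardening"], "culture": "Italian"}
print(diversity_aware_reply(profile, history=[], topic="spring flowers"))
```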
Abstract: In the realm of autonomous robotics, a critical challenge lies in developing robust solutions for Active Collaborative SLAM, wherein multiple robots must collaboratively explore and map an unknown environment while intelligently coordinating their movements and sensor data acquisitions. To this aim, we present two approaches for coordinating a system consisting of multiple robots to perform Active Collaborative SLAM (AC-SLAM) for environmental exploration. Our two coordination approaches, synchronous and asynchronous, implement a methodology in which a central server prioritizes robot goal assignments. We also present a method to efficiently spread the robots for maximum exploration while keeping SLAM uncertainty low. Both coordination approaches were evaluated through simulation on publicly available datasets, obtaining promising results.
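A minimal sketch, under assumptions, of the flavor of centralized goal prioritization described above: the server scores candidate frontier goals by expected information gain, travel cost, and a spacing term that pushes robots apart, then assigns goals greedily. The utility terms and weights are illustrative, not the paper's exact formulation.

```python
import math

def assign_goals(robots, goals, w_gain=1.0, w_dist=0.5, w_spread=2.0):
    """Greedy centralized assignment of frontier goals to robots.

    robots: {name: (x, y)}; goals: list of {"pos": (x, y), "gain": float}.
    Returns {robot_name: assigned goal position}.
    """
    assignments = {}
    taken = []
    for name, pose in robots.items():
        best, best_score = None, -math.inf
        for g in goals:
            if g["pos"] in taken:
                continue
            dist = math.dist(pose, g["pos"])
            # Spacing term: distance to the closest goal already assigned to another robot.
            spread = min((math.dist(g["pos"], t) for t in taken), default=10.0)
            score = w_gain * g["gain"] - w_dist * dist + w_spread * spread
            if score > best_score:
                best, best_score = g["pos"], score
        if best is not None:
            assignments[name] = best
            taken.append(best)
    return assignments

robots = {"r1": (0.0, 0.0), "r2": (5.0, 0.0)}
goals = [{"pos": (1.0, 1.0), "gain": 3.0}, {"pos": (6.0, 1.0), "gain": 2.5},
         {"pos": (2.0, 4.0), "gain": 4.0}]
print(assign_goals(robots, goals))
```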
Abstract: The article introduces the concept of "diversity-aware" robotics and discusses the need to develop computational models to endow robots with diversity-awareness: that is, robots capable of adapting and re-configuring their behavior to recognize, respect, and value the uniqueness of the person they interact with, promoting inclusion regardless of age, race, gender, cognitive or physical capabilities, etc. Finally, the article discusses possible technical solutions based on Ontologies and Bayesian Networks, starting from previous experience with culturally competent robots.
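As a toy illustration, assuming the pgmpy library, of the kind of Bayesian Network the article points to as a possible technical solution: a tiny network relating one user attribute to a preferred interaction style, with probabilities invented purely for demonstration.

```python
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

# Toy structure: the user's hearing ability influences the preferred speech volume.
model = BayesianNetwork([("HearingImpairment", "PreferredVolume")])
model.add_cpds(
    TabularCPD("HearingImpairment", 2, [[0.8], [0.2]]),  # invented prior: no / yes
    TabularCPD(
        "PreferredVolume", 2,           # low / high
        [[0.9, 0.2],                    # P(low | no), P(low | yes)
         [0.1, 0.8]],                   # P(high | no), P(high | yes)
        evidence=["HearingImpairment"], evidence_card=[2],
    ),
)
model.check_model()
inference = VariableElimination(model)
print(inference.query(["PreferredVolume"], evidence={"HearingImpairment": 1}))
```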
Abstract: This article presents the design and the implementation of CAIR: a cloud system for knowledge-based autonomous interaction devised for Social Robots and other conversational agents. The system is particularly convenient for low-cost robots and devices. Developers are provided with a sustainable solution to manage verbal and non-verbal interaction through a network connection, with about 3,000 topics of conversation ready for "chit-chatting" and a library of pre-cooked plans that only needs to be grounded into the robot's physical capabilities. The system is structured as a set of REST API endpoints so that it can be easily expanded by adding new APIs to improve the capabilities of the clients connected to the cloud. Another key feature of the system is that it has been designed to make the development of its clients straightforward: in this way, multiple devices can be easily endowed with the capability of autonomously interacting with the user, understanding when to perform specific actions, and exploiting all the information provided by cloud services. The article outlines and discusses the results of the experiments performed to assess the system's performance in terms of response time, paving the way for its use both for research and market solutions. Links to repositories with clients for ROS and popular robots such as Pepper and NAO are given.
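A minimal client sketch, under assumptions, of how a low-cost robot might query a CAIR-like cloud dialogue endpoint over REST. The URL, payload fields, and response fields are hypothetical and are not the actual CAIR API; the linked repositories provide the real clients.

```python
import requests

CLOUD_URL = "https://example.org/cair/api/hub"  # hypothetical endpoint, placeholder only

def converse(client_id: str, utterance: str, dialogue_state: dict) -> dict:
    """Send the user's utterance and the current dialogue state, get a reply and a plan."""
    payload = {
        "client_id": client_id,
        "utterance": utterance,
        "state": dialogue_state,   # the client keeps the state, keeping the cloud lightweight
    }
    response = requests.post(CLOUD_URL, json=payload, timeout=5)
    response.raise_for_status()
    return response.json()        # e.g. {"reply": "...", "plan": [...], "state": {...}}

# Example usage (would contact the placeholder endpoint above):
reply = converse("pepper-01", "Tell me something about music", dialogue_state={})
print(reply["reply"])
for action in reply.get("plan", []):
    print("execute on robot:", action)   # ground each plan step in the robot's capabilities
```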
Abstract: Since the demand for renewable solar energy is continuously growing, the need for more frequent, precise, and quick autonomous aerial inspections using Unmanned Aerial Vehicles (UAVs) may become fundamental to reduce costs. However, UAV-based inspection of Photovoltaic (PV) arrays is still an open problem. Companies in the field complain that GPS-based navigation is not adequate to accurately cover PV arrays to acquire images to be analyzed to determine the PV panels' status. Indeed, when instructing UAVs to move along a sequence of waypoints at a low altitude, two sources of error may degrade performance: (i) the difference between the actual UAV position and the one estimated with the GPS, and (ii) the difference between the UAV position returned by the GPS and the position of waypoints extracted from georeferenced images acquired through Google Earth or similar tools. These errors make it impossible to reliably track rows of PV modules without human intervention. The article proposes an approach for inspecting PV arrays with autonomous UAVs equipped with an RGB and a thermal camera, the latter being typically used to detect heat failures on the panels' surface: we introduce a portfolio of techniques to process data from both cameras for autonomous navigation, including an optimization procedure for improving panel detection and an Extended Kalman Filter (EKF) to filter data from RGB and thermal cameras. Experimental tests performed in simulation and in an actual PV plant are reported, confirming the validity of the approach.
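A minimal numpy sketch, under assumptions, of the kind of filtering mentioned above: a Kalman filter (the linear special case of an EKF) estimating the UAV's lateral offset and heading error with respect to the tracked panel row, fusing line detections from the RGB and thermal cameras with different noise levels. The state, models, and noise values are illustrative, not the paper's filter.

```python
import numpy as np

# State: [lateral offset to panel row (m), heading error (rad)].
x = np.zeros(2)
P = np.eye(2) * 1.0

F = np.eye(2)                      # simple constant model between detections (assumed)
Q = np.diag([0.05, 0.01])          # process noise (assumed)
H = np.eye(2)                      # both cameras measure offset and heading directly
R_rgb = np.diag([0.10, 0.02])      # RGB line-detector noise (assumed)
R_thermal = np.diag([0.30, 0.05])  # thermal detector assumed noisier

def predict(x, P):
    x = F @ x
    P = F @ P @ F.T + Q
    return x, P

def update(x, P, z, R):
    y = z - H @ x                          # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    x = x + K @ y
    P = (np.eye(2) - K @ H) @ P
    return x, P

# One cycle: predict, then fuse an RGB and a thermal detection of the panel row.
x, P = predict(x, P)
x, P = update(x, P, np.array([0.42, 0.03]), R_rgb)
x, P = update(x, P, np.array([0.55, 0.01]), R_thermal)
print("estimated offset %.2f m, heading error %.3f rad" % (x[0], x[1]))
```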
Abstract: This article introduces the concept of image "culturization", defined as the process of altering the "brushstroke of cultural features" that makes objects perceived as belonging to a given culture while preserving their functionalities. First, we propose a pipeline for translating objects' images from a source to a target cultural domain based on Generative Adversarial Networks (GANs). Then, we gather data through an online questionnaire to test four hypotheses concerning the preferences of Italian participants towards objects and environments belonging to different cultures. As expected, results depend on individual tastes and preferences; however, they are in line with our conjecture that some people, during the interaction with a robot or another intelligent system, might prefer to be shown images whose cultural domain has been modified to match their cultural background.
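A minimal PyTorch sketch, under assumptions, of the adversarial core that GAN-based image-to-image translation pipelines of this kind build on: a generator maps source-culture images toward the target cultural domain while a discriminator judges whether an image looks like the target domain. The tiny networks, image size, and single loss term are placeholders; the paper's pipeline relies on much richer GAN architectures and additional losses.

```python
import torch
import torch.nn as nn

# Tiny placeholder networks: generator (source -> target culture) and target-domain discriminator.
G = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(16, 3, 3, padding=1), nn.Tanh())
D = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                  nn.Flatten(), nn.Linear(16 * 32 * 32, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

source = torch.rand(4, 3, 64, 64) * 2 - 1   # stand-in batch of source-culture images
target = torch.rand(4, 3, 64, 64) * 2 - 1   # stand-in batch of target-culture images

# Discriminator step: real target images vs. translated ("culturized") images.
fake = G(source).detach()
loss_d = bce(D(target), torch.ones(4, 1)) + bce(D(fake), torch.zeros(4, 1))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: fool the discriminator so translations look like the target culture.
loss_g = bce(D(G(source)), torch.ones(4, 1))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
print(f"D loss {loss_d.item():.3f}, G loss {loss_g.item():.3f}")
```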
Abstract: The article proposes a system for knowledge-based conversation designed for Social Robots and other conversational agents. The proposed system relies on an Ontology for the description of all concepts that may be relevant conversation topics, as well as their mutual relationships. The article focuses on the algorithm for Dialogue Management that selects the most appropriate conversation topic depending on the user's input. Moreover, it discusses strategies to ensure a conversation flow that captures, as coherently as possible, the user's intention to drive the conversation in specific directions while avoiding purely reactive responses to what the user says. To measure the quality of the conversation, the article reports the tests performed with 100 recruited participants, comparing five conversational agents: (i) an agent addressing dialogue flow management based only on the detection of keywords in the speech, (ii) an agent based both on the detection of keywords and the Content Classification feature of Google Cloud Natural Language, (iii) an agent that picks conversation topics randomly, (iv) a human pretending to be a chatbot, and (v) one of the most famous chatbots worldwide: Replika. The subjective perception of the participants is measured both with the SASSI (Subjective Assessment of Speech System Interfaces) tool and with a custom survey for measuring the subjective perception of coherence.
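A minimal sketch, under assumptions, of the keyword-driven flavor of topic selection used by agent (i) in the comparison above: topics and their relations form a small graph standing in for the Ontology, and the next topic is the one whose keywords best match the user's input, with a small bonus for topics related to the current one so the flow stays coherent rather than purely reactive. The topic graph and scoring are illustrative, not the paper's Dialogue Management algorithm.

```python
# Tiny stand-in "ontology": topics with keywords and related topics (illustrative only).
TOPICS = {
    "music":  {"keywords": {"music", "song", "concert"}, "related": {"dance", "movies"}},
    "dance":  {"keywords": {"dance", "ballet", "tango"}, "related": {"music"}},
    "movies": {"keywords": {"movie", "film", "cinema"}, "related": {"music"}},
    "food":   {"keywords": {"food", "pizza", "pasta"}, "related": {"travel"}},
    "travel": {"keywords": {"travel", "trip", "holiday"}, "related": {"food"}},
}

def next_topic(user_input: str, current_topic: str) -> str:
    """Pick the topic whose keywords best match the input, favoring related topics."""
    words = set(user_input.lower().split())
    def score(topic: str) -> float:
        s = len(TOPICS[topic]["keywords"] & words)
        if topic in TOPICS[current_topic]["related"]:
            s += 0.5   # small bonus keeps the flow coherent with the current topic
        return s
    best = max(TOPICS, key=score)
    # Avoid a purely reactive jump: stay on topic if nothing really matches the input.
    return best if score(best) > 0.5 else current_topic

print(next_topic("I saw a great film yesterday", current_topic="music"))  # -> "movies"
```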