Abstract:The fusion of Large Language Models (LLMs) and robotic systems has led to a transformative paradigm in the robotics field, offering unparalleled capabilities not only in the communication domain but also in skills like multimodal input handling, high-level reasoning, and plan generation. Grounding LLMs' knowledge in the empirical world has been considered a crucial pathway to exploit the efficiency of LLMs in robotics. Nevertheless, connecting LLMs' representations to the external world through multimodal approaches or robots' bodies is not enough to enable them to understand the meaning of the language they manipulate. Taking inspiration from humans, this work draws attention to three elements necessary for an agent to grasp and experience the world. The roadmap for grounding LLMs is envisaged in an active bodily system as the reference point for experiencing the environment, a temporally structured experience for a coherent, self-related interaction with the external world, and social skills to acquire a commonly grounded shared experience.
Abstract:Large Language Models (LLMs) have recently been used in robot applications to ground LLM common-sense reasoning in the robot's perception and physical abilities. In humanoid robots, memory also plays a critical role in fostering real-world embodiment and facilitating long-term interactive capabilities, especially in multi-task setups where the robot must remember previous task states, environment states, and executed actions. In this paper, we address incorporating memory processes with LLMs for generating cross-task robot actions while the robot effectively switches between tasks. Our proposed dual-layered architecture features two LLMs, utilizing their complementary skills of reasoning and following instructions, combined with a memory model inspired by human cognition. Our results show a significant performance improvement over a baseline across five robotic tasks, demonstrating the potential of integrating memory with LLMs to combine the robot's action and perception for adaptive task execution.
Abstract:This study presents novel strategies to investigate the mutual influence of trust and group dynamics in child-robot interaction. We implemented a game-like experimental activity with the humanoid robot iCub and designed a questionnaire to assess how the children perceived the interaction. We also aim to verify whether the sensors, setups, and tasks are suitable for studying such aspects. The questionnaire results demonstrate that children perceive iCub as a friend and, typically, in a positive way. Other preliminary results suggest that, generally, children trusted iCub during the activity and, after its mistakes, tried to reassure it with sentences such as: "Don't worry iCub, we forgive you". Furthermore, trust towards the robot in group cognitive activity appears to change according to gender: after two consecutive mistakes by the robot, girls tended to trust iCub more than boys did. Finally, no significant difference emerged between age groups in either the scores computed from the game or the self-reported scales. The tool we propose is suitable for studying trust in human-robot interaction (HRI) across different ages and seems appropriate for understanding the mechanisms of trust in group interactions.
Abstract:The ability to recognize human partners is an important social skill for building personalized and long-term human-robot interactions, especially in scenarios like education, care-giving, and rehabilitation. Faces and voices constitute two important sources of information enabling artificial systems to reliably recognize individuals. Deep learning networks have achieved state-of-the-art results and have proven to be suitable tools for addressing such a task. However, when those networks are applied to different and unprecedented scenarios not included in the training set, they can suffer a drop in performance. For example, on robotic platforms in ever-changing, realistic environments, where new sensory evidence is continually acquired, the performance of those models degrades. One solution is to make robots learn from their first-hand sensory data with self-supervision, which allows them to cope with the inherent variability of the data gathered in realistic and interactive contexts. To this aim, we propose a cognitive architecture integrating low-level perceptual processes with a spatial working memory mechanism. The architecture autonomously organizes the robot's sensory experience into a structured dataset suitable for human recognition. Our results demonstrate the effectiveness of our architecture and show that it is a promising solution in the quest to make robots more autonomous in their learning process.
Abstract:Skilled motor behavior is critical in many human daily-life activities and professions. Designing robots that can effectively teach motor skills is an important challenge in the robotics field. In particular, it is important to understand whether involving a robot that exhibits social behaviors in the training impacts the learning and the experience of the human pupils. In this study, we addressed this question by asking participants to learn a complex task - stabilizing an inverted pendulum - by training with physical assistance provided by a robotic manipulandum, the Wristbot. One group of participants trained using the Wristbot alone, whereas for another group the same physical assistance was attributed to the humanoid robot iCub, which played the role of an expert trainer and also exhibited social behaviors. The results show that participants in both groups effectively acquired the skill by leveraging the physical assistance, as they significantly improved their stabilization performance even after the assistance was removed. Moreover, learning in a context of interaction with a humanoid robot assistant led subjects to increased motivation and a more enjoyable training experience, without negative effects on attention or perceived effort. With the experimental approach presented in this study, it is possible to investigate the relative contributions of haptic and social signals in motor learning mediated by human-robot interaction, with the aim of developing effective robot trainers.