Abstract: Our research project, CHATTERS, aims to design a conversational robot that supports children's digital information search. We want the robot's conversation to foster a responsible trust relationship between child and robot. In this paper we (1) give a preliminary view of an empirical study of children's trust in robots that provide information, which was conducted via video call due to the COVID-19 pandemic; (2) give a preliminary analysis of a co-design workshop we conducted, in which the pandemic may have influenced children's design choices; and (3) describe the upcoming research activities we are developing.
Abstract: Robot-assisted therapy is an emerging form of therapy for autistic children, although designing effective robot behaviors remains a challenge for implementing such therapy. A series of usability tests assessed trends in the effectiveness of two design choices in emotion-learning activities with autistic children: modelling a robot's facial expressions on realistic facial expressions, and adding peripherals that enable child-led control of the activities. Nineteen autistic children interacted with a small humanoid robot and an adult therapist in several emotion-learning activities that featured realistic facial expressions modelled on either a pre-existing database or live facial mirroring, and that used peripherals (tablets or tangible 'squishies') to enable child-led activities. Both types of realistic facial expressions by the robot were less effective than exaggerated expressions, and the mirroring was unintuitive for children. The tablet was usable but required more feedback and lower latency, while the tactile tangibles were engaging aids.
Abstract: This paper describes the initial steps towards the design of a robotic system intended to perform actions autonomously in a naturalistic play environment. At the same time, it aims for social human-robot interaction (HRI), focusing on children. We draw on existing theories of child development and on dimensional models of emotions to explore the design of a dynamic interaction framework for natural child-robot interaction. In this dynamic setting, the social HRI is defined by the ability of the system to take into consideration the socio-emotional state of the user and to plan accordingly by selecting appropriate strategies for execution. The robot needs a temporal planning system that combines features of task-oriented actions with principles of social human-robot interaction. We present initial results of an empirical study evaluating the proposed framework in the context of a collaborative sorting game.
Abstract: In this paper, we propose to use deep 3-dimensional convolutional networks (3D CNNs) to address the challenge of modelling spectro-temporal dynamics for speech emotion recognition (SER). Compared to a hybrid of a Convolutional Neural Network and Long Short-Term Memory (CNN-LSTM), our proposed 3D CNNs simultaneously extract short-term and long-term spectral features with a moderate number of parameters. We evaluated our proposed method and other state-of-the-art methods in a speaker-independent manner using aggregated corpora that give a large and diverse set of speakers. We found that 1) shallow temporal and moderately deep spectral kernels of a homogeneous architecture are optimal for the task; and 2) our 3D CNNs are more effective for spectro-temporal feature learning than the other methods. Finally, we visualised the feature space obtained with our proposed method using t-distributed stochastic neighbour embedding (t-SNE) and could observe distinct clusters of emotions.
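As a rough illustration of the 3D convolution that the abstract above builds on, the sketch below applies a single kernel to a small volume in plain Python. It is not the authors' architecture. The axis layout `[segment][frame][frequency bin]`, the function name `conv3d_valid`, and the kernel extents are all assumptions made for this example; the abstract does not specify them.

```python
def conv3d_valid(volume, kernel):
    """Naive 'valid' 3D cross-correlation over nested lists.

    Assumed (hypothetical) axis order: [segment][frame][frequency bin].
    The output shrinks by (kernel extent - 1) along every axis.
    """
    D, T, F = len(volume), len(volume[0]), len(volume[0][0])
    kd, kt, kf = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for d in range(D - kd + 1):
        plane = []
        for t in range(T - kt + 1):
            row = []
            for f in range(F - kf + 1):
                acc = 0.0
                # Sum over the kernel window spanning all three axes at once.
                for i in range(kd):
                    for j in range(kt):
                        for k in range(kf):
                            acc += volume[d + i][t + j][f + k] * kernel[i][j][k]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def filled(d, t, f, value=1.0):
    """Build a d x t x f nested list filled with a constant (helper for the demo)."""
    return [[[value for _ in range(f)] for _ in range(t)] for _ in range(d)]


# Illustrative sizes only: 4 segments x 5 frames x 6 frequency bins,
# with a kernel that is narrow in time (2 x 2) and wider in frequency (3).
features = conv3d_valid(filled(4, 5, 6), filled(2, 2, 3))
```

Because one kernel spans the segment, frame, and frequency axes together, each output value mixes short-range and longer-range context in a single operation, which is the property the abstract contrasts with a CNN-LSTM pipeline.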
Abstract: One of the challenges in Speech Emotion Recognition (SER) "in the wild" is the large mismatch between training and test data (e.g. speakers and tasks). To improve the generalisation capabilities of the emotion models, we propose to use Multi-Task Learning (MTL) with gender and naturalness as auxiliary tasks in deep neural networks. This method was evaluated in within-corpus and various cross-corpus classification experiments that simulate conditions "in the wild". In comparison to state-of-the-art methods based on Single-Task Learning (STL), we found that our proposed MTL method improved performance significantly. In particular, models using both gender and naturalness achieved larger gains than those using either gender or naturalness alone. This benefit was also found in the high-level representations of the feature space obtained from our proposed method, where discriminative emotional clusters could be observed.
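One common way to combine a primary task with auxiliary tasks is a weighted sum of per-task losses, sketched below in plain Python. This is a minimal sketch under stated assumptions, not the authors' formulation. The function names, the use of cross-entropy for every task, and the `aux_weights` values are hypothetical; the abstract does not state how the task losses are combined.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability assigned to the target class index."""
    return -math.log(softmax(logits)[target])


def multitask_loss(emotion_logits, gender_logits, naturalness_logits,
                   labels, aux_weights=(0.5, 0.5)):
    """Primary emotion loss plus weighted auxiliary losses.

    aux_weights = (gender weight, naturalness weight) is a hypothetical
    hyperparameter chosen for illustration only.
    """
    w_gender, w_naturalness = aux_weights
    main = cross_entropy(emotion_logits, labels["emotion"])
    aux = (w_gender * cross_entropy(gender_logits, labels["gender"])
           + w_naturalness * cross_entropy(naturalness_logits,
                                           labels["naturalness"]))
    return main + aux


# Example with made-up logits from three heads sharing one network trunk.
labels = {"emotion": 2, "gender": 0, "naturalness": 1}
loss = multitask_loss([0.0, 0.0, 0.0, 0.0], [0.0, 0.0], [0.0, 0.0], labels)
```

Setting both auxiliary weights to zero reduces this to the single-task emotion loss, which gives a simple baseline for comparing STL and MTL training in this kind of setup.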