Abstract:The limited size of pain datasets are a challenge in developing robust deep learning models for pain recognition. Transfer learning approaches are often employed in these scenarios. In this study, we investigate whether deep learned feature representation for one type of experimentally induced pain can be transferred to another. Participating in the AI4Pain challenge, our goal is to classify three levels of pain (No-Pain, Low-Pain, High-Pain). The challenge dataset contains data collected from 65 participants undergoing varying intensities of electrical pain. We utilize the video recording from the dataset to investigate the transferability of deep learned heat pain model to electrical pain. In our proposed approach, we leverage an existing heat pain convolutional neural network (CNN) - trained on BioVid dataset - as a feature extractor. The images from the challenge dataset are inputted to the pre-trained heat pain CNN to obtain feature vectors. These feature vectors are used to train two machine learning models: a simple feed-forward neural network and a long short-term memory (LSTM) network. Our approach was tested using the dataset's predefined training, validation, and testing splits. Our models outperformed the baseline of the challenge on both the validation and tests sets, highlighting the potential of models trained on other pain datasets for reliable feature extraction.
Abstract:Individuals with diverse motor abilities often benefit from intensive and specialized rehabilitation therapies aimed at enhancing their functional recovery. Nevertheless, the challenge lies in the restricted availability of neurorehabilitation professionals, hindering the effective delivery of the necessary level of care. Robotic devices hold great potential in reducing the dependence on medical personnel during therapy but, at the same time, they generally lack the crucial human interaction and motivation that traditional in-person sessions provide. To bridge this gap, we introduce an AI-based system aimed at delivering personalized, out-of-hospital assistance during neurorehabilitation training. This system includes a rehabilitation training device, affective signal classification models, training exercises, and a socially interactive agent as the user interface. With the assistance of a professional, the envisioned system is designed to be tailored to accommodate the unique rehabilitation requirements of an individual patient. Conceptually, after a preliminary setup and instruction phase, the patient is equipped to continue their rehabilitation regimen autonomously in the comfort of their home, facilitated by a socially interactive agent functioning as a virtual coaching assistant. Our approach involves the integration of an interactive socially-aware virtual agent into a neurorehabilitation robotic framework, with the primary objective of recreating the social aspects inherent to in-person rehabilitation sessions. We also conducted a feasibility study to test the framework with healthy patients. The results of our preliminary investigation indicate that participants demonstrated a propensity to adapt to the system. Notably, the presence of the interactive agent during the proposed exercises did not act as a source of distraction; instead, it positively impacted users' engagement.
Abstract:Automatic stress detection using heart rate variability (HRV) features has gained significant traction as it utilizes unobtrusive wearable sensors measuring signals like electrocardiogram (ECG) or blood volume pulse (BVP). However, detecting stress through such physiological signals presents a considerable challenge owing to the variations in recorded signals influenced by factors, such as perceived stress intensity and measurement devices. Consequently, stress detection models developed on one dataset may perform poorly on unseen data collected under different conditions. To address this challenge, this study explores the generalizability of machine learning models trained on HRV features for binary stress detection. Our goal extends beyond evaluating generalization performance; we aim to identify the characteristics of datasets that have the most significant influence on generalizability. We leverage four publicly available stress datasets (WESAD, SWELL-KW, ForDigitStress, VerBIO) that vary in at least one of the characteristics such as stress elicitation techniques, stress intensity, and sensor devices. Employing a cross-dataset evaluation approach, we explore which of these characteristics strongly influence model generalizability. Our findings reveal a crucial factor affecting model generalizability: stressor type. Models achieved good performance across datasets when the type of stressor (e.g., social stress in our case) remains consistent. Factors like stress intensity or brand of the measurement device had minimal impact on cross-dataset performance. Based on our findings, we recommend matching the stressor type when deploying HRV-based stress models in new environments. To the best of our knowledge, this is the first study to systematically investigate factors influencing the cross-dataset applicability of HRV-based stress models.
Abstract:In industrial scenarios, there is widespread use of collaborative robots (cobots), and growing interest is directed at evaluating and measuring the impact of some characteristics of the cobot on the human factor. In the present pilot study, the effect that the production rhythm (C1 - Slow, C2 - Fast, C3 - Adapted to the participant's pace) of a cobot has on the Experiential Locus of Control (ELoC) and the emotional state of 31 participants has been examined. The operators' performance, the degree of basic internal Locus of Control, and the attitude towards the robots were also considered. No difference was found regarding the emotional state and the ELoC in the three conditions, but considering the other psychological variables, a more complex situation emerges. Overall, results seem to indicate a need to consider the person's psychological characteristics to offer a differentiated and optimal interaction experience.
Abstract:Collaborative robots (cobots) are widely used in industrial applications, yet extensive research is still needed to enhance human-robot collaborations and operator experience. A potential approach to improve the collaboration experience involves adapting cobot behavior based on natural cues from the operator. Inspired by the literature on human-human interactions, we conducted a wizard-of-oz study to examine whether a gaze towards the cobot can serve as a trigger for initiating joint activities in collaborative sessions. In this study, 37 participants engaged in an assembly task while their gaze behavior was analyzed. We employ a gaze-based attention recognition model to identify when the participants look at the cobot. Our results indicate that in most cases (84.88\%), the joint activity is preceded by a gaze towards the cobot. Furthermore, during the entire assembly cycle, the participants tend to look at the cobot around the time of the joint activity. To the best of our knowledge, this is the first study to analyze the natural gaze behavior of participants working on a joint activity with a robot during a collaborative assembly task.
Abstract:Attention (and distraction) recognition is a key factor in improving human-robot collaboration. We present an assembly scenario where a human operator and a cobot collaborate equally to piece together a gearbox. The setup provides multiple opportunities for the cobot to adapt its behavior depending on the operator's attention, which can improve the collaboration experience and reduce psychological strain. As a first step, we recognize the areas in the workspace that the human operator is paying attention to, and consequently, detect when the operator is distracted. We propose a novel deep-learning approach to develop an attention recognition model. First, we train a convolutional neural network to estimate the gaze direction using a publicly available image dataset. Then, we use transfer learning with a small dataset to map the gaze direction onto pre-defined areas of interest. Models trained using this approach performed very well in leave-one-subject-out evaluation on the small dataset. We performed an additional validation of our models using the video snippets collected from participants working as an operator in the presented assembly scenario. Although the recall for the Distracted class was lower in this case, the models performed well in recognizing the areas the operator paid attention to. To the best of our knowledge, this is the first work that validated an attention recognition model using data from a setting that mimics industrial human-robot collaboration. Our findings highlight the need for validation of attention recognition solutions in such full-fledged, non-guided scenarios.
Abstract:We present a multi-modal stress dataset that uses digital job interviews to induce stress. The dataset provides multi-modal data of 40 participants including audio, video (motion capturing, facial recognition, eye tracking) as well as physiological information (photoplethysmography, electrodermal activity). In addition to that, the dataset contains time-continuous annotations for stress and occurred emotions (e.g. shame, anger, anxiety, surprise). In order to establish a baseline, five different machine learning classifiers (Support Vector Machine, K-Nearest Neighbors, Random Forest, Long-Short-Term Memory Network) have been trained and evaluated on the proposed dataset for a binary stress classification task. The best-performing classifier achieved an accuracy of 88.3% and an F1-score of 87.5%.
Abstract:Stress is prevalent in many aspects of everyday life including work, healthcare, and social interactions. Many works have studied handcrafted features from various bio-signals that are indicators of stress. Recently, deep learning models have also been proposed to detect stress. Typically, stress models are trained and validated on the same dataset, often involving one stressful scenario. However, it is not practical to collect stress data for every scenario. So, it is crucial to study the generalizability of these models and determine to what extent they can be used in other scenarios. In this paper, we explore the generalization capabilities of Electrocardiogram (ECG)-based deep learning models and models based on handcrafted ECG features, i.e., Heart Rate Variability (HRV) features. To this end, we train three HRV models and two deep learning models that use ECG signals as input. We use ECG signals from two popular stress datasets - WESAD and SWELL-KW - differing in terms of stressors and recording devices. First, we evaluate the models using leave-one-subject-out (LOSO) cross-validation using training and validation samples from the same dataset. Next, we perform a cross-dataset validation of the models, that is, LOSO models trained on the WESAD dataset are validated using SWELL-KW samples and vice versa. While deep learning models achieve the best results on the same dataset, models based on HRV features considerably outperform them on data from a different dataset. This trend is observed for all the models on both datasets. Therefore, HRV models are a better choice for stress recognition in applications that are different from the dataset scenario. To the best of our knowledge, this is the first work to compare the cross-dataset generalizability between ECG-based deep learning models and HRV models.
Abstract:In today's world, many patients with cognitive impairments and motor dysfunction seek the attention of experts to perform specific conventional therapies to improve their situation. However, due to a lack of neurorehabilitation professionals, patients suffer from severe effects that worsen their condition. In this paper, we present a technological approach for a novel robotic neurorehabilitation training system. It relies on a combination of a rehabilitation device, signal classification methods, supervised machine learning models for training adaptation, training exercises, and socially interactive agents as a user interface. Together with a professional, the system can be trained towards the patient's specific needs. Furthermore, after a training phase, patients are enabled to train independently at home without the assistance of a physical therapist with a socially interactive agent in the role of a coaching assistant.
Abstract:In this paper, we present a process to investigate the effects of transfer learning for automatic facial expression recognition from emotions to pain. To this end, we first train a VGG16 convolutional neural network to automatically discern between eight categorical emotions. We then fine-tune successively larger parts of this network to learn suitable representations for the task of automatic pain recognition. Subsequently, we apply those fine-tuned representations again to the original task of emotion recognition to further investigate the differences in performance between the models. In the second step, we use Layer-wise Relevance Propagation to analyze predictions of the model that have been predicted correctly previously but are now wrongly classified. Based on this analysis, we rely on the visual inspection of a human observer to generate hypotheses about what has been forgotten by the model. Finally, we test those hypotheses quantitatively utilizing concept embedding analysis methods. Our results show that the network, which was fully fine-tuned for pain recognition, indeed payed less attention to two action units that are relevant for expression recognition but not for pain recognition.