Paul Hungler

CLARE: Cognitive Load Assessment in REaltime with Multimodal Data

Apr 26, 2024

Anubhav Bhatti, Prithila Angkan, Behnam Behinaein, Zunayed Mahmud, Dirk Rodenburg, Heather Braund, P. James Mclellan, Aaron Ruberto, Geoffery Harrison, Daryl Wilson (+4 more)

Abstract: We present a novel multimodal dataset for Cognitive Load Assessment in REaltime (CLARE). The dataset contains physiological and gaze data from 24 participants with self-reported cognitive load scores as ground-truth labels. The dataset consists of four modalities, namely, Electrocardiography (ECG), Electrodermal Activity (EDA), Electroencephalogram (EEG), and Gaze tracking. To map diverse levels of mental load on participants during experiments, each participant completed four nine-minute sessions on a computer-based operator performance and mental workload task (the MATB-II software) with varying levels of complexity in one-minute segments. During the experiment, participants reported their cognitive load every 10 seconds. For the dataset, we also provide benchmark binary classification results with machine learning and deep learning models on two different evaluation schemes, namely, 10-fold and leave-one-subject-out (LOSO) cross-validation. Benchmark results show that for 10-fold evaluation, the convolutional neural network (CNN) based deep learning model achieves the best classification performance with ECG, EDA, and Gaze. In contrast, for LOSO, the best performance is achieved by the deep learning model with ECG, EDA, and EEG.

* 12 pages, 10 figures, 6 tables
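
The abstract above reports results under both 10-fold and leave-one-subject-out (LOSO) cross-validation. As a minimal sketch of how a LOSO split can be formed, assuming each sample is tagged with a participant ID: every participant is held out once, and the model is trained on all remaining participants. The function name `loso_splits` and the participant IDs are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one per subject."""
    # Group sample indices by the subject they belong to.
    by_subject = defaultdict(list)
    for idx, subject in enumerate(subject_ids):
        by_subject[subject].append(idx)

    # Hold out each subject in turn; train on everyone else.
    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        train_idx = sorted(
            i
            for subject, indices in by_subject.items()
            if subject != held_out
            for i in indices
        )
        yield held_out, train_idx, test_idx


# Toy example: three hypothetical participants with two segments each.
subject_ids = ["P01", "P01", "P02", "P02", "P03", "P03"]
for held_out, train_idx, test_idx in loso_splits(subject_ids):
    print(held_out, train_idx, test_idx)
```

Unlike a random 10-fold split, no participant contributes samples to both the training and test sets in any fold, so the score reflects generalization to unseen people.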

EEG-based Cognitive Load Classification using Feature Masked Autoencoding and Emotion Transfer Learning

Aug 01, 2023

Dustin Pulver, Prithila Angkan, Paul Hungler, Ali Etemad

Abstract: Cognitive load, the amount of mental effort required for task completion, plays an important role in performance and decision-making outcomes, making its classification and analysis essential in various sensitive domains. In this paper, we present a new solution for the classification of cognitive load using electroencephalogram (EEG). Our model uses a transformer architecture employing transfer learning between emotions and cognitive load. We pre-train our model using self-supervised masked autoencoding on emotion-related EEG datasets and use transfer learning with both frozen weights and fine-tuning to perform downstream cognitive load classification. To evaluate our method, we carry out a series of experiments utilizing two publicly available EEG-based emotion datasets, namely SEED and SEED-IV, for pre-training, while we use the CL-Drive dataset for downstream cognitive load classification. The results of our experiments show that our proposed approach achieves strong results and outperforms conventional single-stage fully supervised learning. Moreover, we perform detailed ablation and sensitivity studies to evaluate the impact of different aspects of our proposed solution. This research contributes to the growing body of literature in affective computing with a focus on cognitive load, and opens up new avenues for future research in the field of cross-domain transfer learning using self-supervised pre-training.

* This paper has been accepted to the 25th International Conference on Multimodal Interaction (ICMI 2023). 8 pages, 6 figures, 6 tables
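
The abstract above mentions self-supervised masked autoencoding over features. The following is a generic sketch of that idea only, not the authors' architecture: a random subset of a feature vector is hidden, and the reconstruction error is scored on the hidden positions alone. The helper names `mask_features` and `masked_mse`, the zero fill value, and the mask ratio are all assumptions made for illustration.

```python
import random


def mask_features(features, mask_ratio, rng):
    """Zero out a random subset of features; return (corrupted, masked_indices)."""
    n_masked = max(1, round(mask_ratio * len(features)))
    masked_idx = sorted(rng.sample(range(len(features)), n_masked))
    masked_set = set(masked_idx)
    corrupted = [0.0 if i in masked_set else x for i, x in enumerate(features)]
    return corrupted, masked_idx


def masked_mse(original, reconstruction, masked_idx):
    """Mean squared error computed only over the masked positions."""
    return sum(
        (original[i] - reconstruction[i]) ** 2 for i in masked_idx
    ) / len(masked_idx)


# Toy example with a hypothetical 4-dimensional feature vector.
rng = random.Random(0)
features = [0.5, 1.2, -0.3, 2.0]
corrupted, masked_idx = mask_features(features, mask_ratio=0.5, rng=rng)

# A real model would predict the hidden values from `corrupted`;
# here the corrupted input stands in for a trivial "reconstruction".
loss = masked_mse(features, corrupted, masked_idx)
print(masked_idx, loss)
```

In a pre-training setup of this kind, an encoder trained to minimize such a loss on unlabeled data would then be reused, frozen or fine-tuned, for the downstream classifier.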

Multimodal Brain-Computer Interface for In-Vehicle Driver Cognitive Load Measurement: Dataset and Baselines

Apr 09, 2023

Prithila Angkan, Behnam Behinaein, Zunayed Mahmud, Anubhav Bhatti, Dirk Rodenburg, Paul Hungler, Ali Etemad

Abstract: Through this paper, we introduce a novel driver cognitive load assessment dataset, CL-Drive, which contains Electroencephalogram (EEG) signals along with other physiological signals such as Electrocardiography (ECG) and Electrodermal Activity (EDA) as well as eye tracking data. The data was collected from 21 subjects while driving in an immersive vehicle simulator, in various driving conditions, to induce different levels of cognitive load in the subjects. The tasks consisted of 9 complexity levels for 3 minutes each. Each driver reported their subjective cognitive load every 10 seconds throughout the experiment. The dataset contains the subjective cognitive load recorded as ground truth. In this paper, we also provide benchmark classification results for different machine learning and deep learning models for both binary and ternary label distributions. We followed two evaluation criteria, namely 10-fold and leave-one-subject-out (LOSO). We trained our models on both hand-crafted features and raw data.

* 13 pages, 8 figures, 11 tables. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice

Multistream Gaze Estimation with Anatomical Eye Region Isolation by Synthetic to Real Transfer Learning

Jun 18, 2022

Zunayed Mahmud, Paul Hungler, Ali Etemad

Abstract: We propose a novel neural pipeline, MSGazeNet, that learns gaze representations by taking advantage of the eye anatomy information through a multistream framework. Our proposed solution comprises two components: first, a network for isolating anatomical eye regions, and second, a network for multistream gaze estimation. The eye region isolation is performed with a U-Net style network which we train using a synthetic dataset that contains eye region masks for the visible eyeball and the iris region. The synthetic dataset used in this stage is a new dataset consisting of 60,000 eye images, which we create using an eye-gaze simulator, UnityEyes. After training, the eye region isolation network is transferred to the real domain to generate masks for the real-world eye images. In order to successfully make the transfer, we exploit domain randomization in the training process, which allows for the synthetic images to benefit from a larger variance with the help of augmentations that resemble artifacts. The generated eye region masks along with the raw eye images are then used together as a multistream input to our gaze estimation network. We evaluate our framework on three benchmark gaze estimation datasets, MPIIGaze, Eyediap, and UTMultiview, where we set a new state-of-the-art on Eyediap and UTMultiview datasets by obtaining a performance gain of 7.57% and 1.85% respectively, while achieving competitive performance on MPIIGaze. We also study the robustness of our method with respect to the noise in the data and demonstrate that our model is less sensitive to noisy data. Lastly, we perform a variety of experiments including ablation studies to evaluate the contribution of different components and design choices in our solution.

* 14 pages, 10 figures, 12 tables. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice

AttX: Attentive Cross-Connections for Fusion of Wearable Signals in Emotion Recognition

Jun 09, 2022

Anubhav Bhatti, Behnam Behinaein, Paul Hungler, Ali Etemad

Abstract: We propose cross-modal attentive connections, a new dynamic and effective technique for multimodal representation learning from wearable data. Our solution can be integrated into any stage of the pipeline, i.e., after any convolutional layer or block, to create intermediate connections between individual streams responsible for processing each modality. Additionally, our method benefits from two properties. First, it can share information uni-directionally (from one modality to the other) or bi-directionally. Second, it can be integrated into multiple stages at the same time to further allow network gradients to be exchanged at several touch-points. We perform extensive experiments on three public multimodal wearable datasets, WESAD, SWELL-KW, and CASE, and demonstrate that our method can effectively regulate and share information between different modalities to learn better representations. Our experiments further demonstrate that once integrated into simple CNN-based multimodal solutions (2, 3, or 4 modalities), our method can achieve performance superior to or competitive with the state of the art and outperform a variety of baseline uni-modal and classical multimodal methods.

* 13 pages, 8 figures

Gaze Estimation with Eye Region Segmentation and Self-Supervised Multistream Learning

Dec 15, 2021

Zunayed Mahmud, Paul Hungler, Ali Etemad

Abstract: We present a novel multistream network that learns robust eye representations for gaze estimation. We first create a synthetic dataset containing eye region masks detailing the visible eyeball and iris using a simulator. We then perform eye region segmentation with a U-Net type model which we later use to generate eye region masks for real-world eye images. Next, we pretrain an eye image encoder in the real domain with self-supervised contrastive learning to learn generalized eye representations. Finally, this pretrained eye encoder, along with two additional encoders for the visible eyeball region and iris, is used in parallel in our multistream framework to extract salient features for gaze estimation from real-world images. We demonstrate the performance of our method on the EYEDIAP dataset in two different evaluation settings and achieve state-of-the-art results, outperforming all the existing benchmarks on this dataset. We also conduct additional experiments to validate the robustness of our self-supervised network with respect to different amounts of labeled data used for training.

* 5 pages, 1 figure, 3 tables, Accepted in AAAI-22 Workshop on Human-Centric Self-Supervised Learning

A Transformer Architecture for Stress Detection from ECG

Aug 22, 2021

Behnam Behinaein, Anubhav Bhatti, Dirk Rodenburg, Paul Hungler, Ali Etemad

Abstract: Electrocardiogram (ECG) has been widely used for emotion recognition. This paper presents a deep neural network based on convolutional layers and a transformer mechanism to detect stress using ECG signals. We perform leave-one-subject-out experiments on two publicly available datasets, WESAD and SWELL-KW, to evaluate our method. Our experiments show that the proposed model achieves strong results, comparable to or better than the state-of-the-art models for ECG-based stress detection on these two datasets. Moreover, our method is end-to-end, does not require handcrafted features, and can learn robust representations with only a few convolutional blocks and the transformer component.

* Accepted by 2021 International Symposium on Wearable Computers (ISWC)

Attentive Cross-modal Connections for Deep Multimodal Wearable-based Emotion Recognition

Aug 04, 2021

Anubhav Bhatti, Behnam Behinaein, Dirk Rodenburg, Paul Hungler, Ali Etemad

Abstract: Classification of human emotions can play an essential role in the design and improvement of human-machine systems. While individual biological signals such as Electrocardiogram (ECG) and Electrodermal Activity (EDA) have been widely used for emotion recognition with machine learning methods, multimodal approaches generally fuse extracted features or final classification/regression results to boost performance. To enhance multimodal learning, we present a novel attentive cross-modal connection to share information between convolutional neural networks responsible for learning individual modalities. Specifically, these connections improve emotion classification by sharing intermediate representations among EDA and ECG and applying attention weights to the shared information, thus learning more effective multimodal embeddings. We perform experiments on the WESAD dataset to identify the best configuration of the proposed method for emotion classification. Our experiments show that the proposed approach is capable of learning strong multimodal representations and outperforms a number of baseline methods.

* 5 pages, 2 figures. Accepted at 2021 9th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW)

Unsupervised Multi-Modal Representation Learning for Affective Computing with Multi-Corpus Wearable Data

Aug 24, 2020

Kyle Ross, Paul Hungler, Ali Etemad

Abstract: With recent developments in smart technologies, there has been a growing focus on the use of artificial intelligence and machine learning for affective computing to further enhance the user experience through emotion recognition. Typically, machine learning models used for affective computing are trained using manually extracted features from biological signals. Such features may not generalize well for large datasets and may be sub-optimal in capturing the information from the raw input data. One approach to address this issue is to use fully supervised deep learning methods to learn latent representations of the biosignals. However, this method requires human supervision to label the data, which may be unavailable or difficult to obtain. In this work we propose an unsupervised framework to reduce the reliance on human supervision. The proposed framework utilizes two stacked convolutional autoencoders to learn latent representations from wearable electrocardiogram (ECG) and electrodermal activity (EDA) signals. These representations are utilized within a random forest model for binary arousal classification. This approach reduces human supervision and enables the aggregation of datasets allowing for higher generalizability. To validate this framework, an aggregated dataset comprised of the AMIGOS, ASCERTAIN, CLEAS, and MAHNOB-HCI datasets is created. The results of our proposed method are compared with using convolutional neural networks, as well as methods that employ manual extraction of hand-crafted features. The methodology used for fusing the two modalities is also investigated. Lastly, we show that our method outperforms current state-of-the-art results that have performed arousal detection on the same datasets using ECG and EDA biosignals. The results show the widespread applicability of stacked convolutional autoencoders used with machine learning for affective computing.

* 16 pages, 5 figures

Classification of Cognitive Load and Expertise for Adaptive Simulation using Deep Multitask Learning

Jul 31, 2019

Pritam Sarkar, Kyle Ross, Aaron J. Ruberto, Dirk Rodenburg, Paul Hungler, Ali Etemad

Abstract: Simulations are a pedagogical means of enabling a risk-free way for healthcare practitioners to learn, maintain, or enhance their knowledge and skills. Such simulations should provide an optimum amount of cognitive load to the learner and be tailored to their levels of expertise. However, most current simulations are a one-type-fits-all tool used to train different learners regardless of their existing skills, expertise, and ability to handle cognitive load. To address this problem, we propose an end-to-end framework for a trauma simulation that actively classifies a participant's level of cognitive load and expertise for the development of a dynamically adaptive simulation. To facilitate this solution, trauma simulations were developed for the collection of electrocardiogram (ECG) signals of both novice and expert practitioners. A multitask deep neural network was developed to utilize this data and classify high and low cognitive load, as well as expert and novice participants. A leave-one-subject-out (LOSO) validation was used to evaluate the effectiveness of our model, achieving accuracies of 89.4% and 96.6% for classification of cognitive load and expertise, respectively.

* © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
