Roberto Daza

MOSAIC-F: A Framework for Enhancing Students' Oral Presentation Skills through Personalized Feedback

Jun 10, 2025

Alvaro Becerra, Daniel Andres, Pablo Villegas, Roberto Daza, Ruth Cobos

Abstract:In this article, we present a novel multimodal feedback framework called MOSAIC-F, an acronym for a data-driven Framework that integrates Multimodal Learning Analytics (MMLA), Observations, Sensors, Artificial Intelligence (AI), and Collaborative assessments for generating personalized feedback on student learning activities. This framework consists of four key steps. First, peers and professors' assessments are conducted through standardized rubrics (that include both quantitative and qualitative evaluations). Second, multimodal data are collected during learning activities, including video recordings, audio capture, gaze tracking, physiological signals (heart rate, motion data), and behavioral interactions. Third, personalized feedback is generated using AI, synthesizing human-based evaluations and data-based multimodal insights such as posture, speech patterns, stress levels, and cognitive load, among others. Finally, students review their own performance through video recordings and engage in self-assessment and feedback visualization, comparing their own evaluations with peers and professors' assessments, class averages, and AI-generated recommendations. By combining human-based and data-based evaluation techniques, this framework enables more accurate, personalized and actionable feedback. We tested MOSAIC-F in the context of improving oral presentation skills.

* Accepted in LASI Spain 25: Learning Analytics Summer Institute Spain 2025

M2LADS Demo: A System for Generating Multimodal Learning Analytics Dashboards

Feb 21, 2025

Alvaro Becerra, Roberto Daza, Ruth Cobos, Aythami Morales, Julian Fierrez

Abstract:We present a demonstration of a web-based system called M2LADS ("System for Generating Multimodal Learning Analytics Dashboards"), designed to integrate, synchronize, visualize, and analyze multimodal data recorded during computer-based learning sessions with biosensors. This system presents a range of biometric and behavioral data on web-based dashboards, providing detailed insights into various physiological and activity-based metrics. The multimodal data visualized include electroencephalogram (EEG) data for assessing attention and brain activity, heart rate metrics, eye-tracking data to measure visual attention, webcam video recordings, and activity logs of the monitored tasks. M2LADS aims to assist data scientists in two key ways: (1) by providing a comprehensive view of participants' experiences, displaying all data categorized by the activities in which participants are engaged, and (2) by synchronizing all biosignals and videos, facilitating easier data relabeling if any activity information contains errors.

* Published in the Workshop on Innovation and Responsibility in AI-Supported Education (iRAISE25) at AAAI 2025
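
The synchronization step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each signal is a list of samples with sorted timestamps in seconds and aligns a signal to a query time by picking the nearest sample. The function name `nearest_sample` and the sample data are hypothetical; the abstract does not describe how M2LADS actually synchronizes its signals.

```python
from bisect import bisect_left


def nearest_sample(timestamps, values, t):
    """Return the value whose timestamp is closest to t.

    Assumes `timestamps` is sorted in ascending order and has the same
    length as `values`. This is an illustrative alignment rule, not the
    method used by M2LADS.
    """
    if not timestamps or len(timestamps) != len(values):
        raise ValueError("timestamps and values must be non-empty and equal in length")
    i = bisect_left(timestamps, t)
    if i == 0:
        return values[0]
    if i == len(timestamps):
        return values[-1]
    before, after = timestamps[i - 1], timestamps[i]
    # Ties go to the earlier sample.
    return values[i] if (after - t) < (t - before) else values[i - 1]


# Hypothetical heart-rate samples (beats per minute) at 1-second intervals.
hr_times = [0.0, 1.0, 2.0]
hr_values = [60, 62, 65]

# Align heart rate to hypothetical video frame timestamps.
frame_times = [0.1, 1.4, 1.6]
aligned = [nearest_sample(hr_times, hr_values, t) for t in frame_times]
```

Nearest-sample lookup is the simplest choice; interpolation or windowed averaging may suit signals with very different sampling rates.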

SMARTe-VR: Student Monitoring and Adaptive Response Technology for e-learning in Virtual Reality

Jan 19, 2025

Roberto Daza, Lin Shengkai, Aythami Morales, Julian Fierrez, Katashi Nagao

Abstract:This work introduces SMARTe-VR, a platform for student monitoring in an immersive virtual reality environment designed for online education. SMARTe-VR aims to gather data for adaptive learning, focusing on facial biometrics and learning metadata. The platform allows instructors to create tailored learning sessions with video lectures, featuring an interface with an Auto QA system to evaluate understanding, interaction tools (e.g., textbook highlighting and lecture tagging), and real-time feedback. Additionally, we release a dataset containing 5 research challenges with data from 10 users in VR-based TOEIC sessions. This dataset, spanning over 25 hours, includes facial features, learning metadata, 450 responses, question difficulty levels, concept tags, and understanding labels. Alongside the database, we present preliminary experiments using Item Response Theory models, adapted for understanding detection using facial features. Two architectures were explored: a Temporal Convolutional Network for local features and a Multilayer Perceptron for global features.

* Published in the Workshop on Artificial Intelligence for Education (AI4EDU) at AAAI 2025
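
For readers unfamiliar with Item Response Theory, the following minimal sketch shows the standard two-parameter logistic (2PL) model, in which the probability of a correct response depends on the learner's ability `theta`, the item's discrimination `a`, and its difficulty `b`. The abstract does not say which IRT variant the authors adapted or how facial features enter the model, so this is only the textbook formula, not the paper's method.

```python
import math


def irt_2pl(theta: float, a: float, b: float) -> float:
    """Probability of a correct response under the standard 2PL IRT model.

    theta: learner ability
    a:     item discrimination (slope)
    b:     item difficulty (location)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# When ability equals difficulty, the model predicts a 50% chance of success.
p_equal = irt_2pl(theta=0.0, a=1.0, b=0.0)

# A more able learner is more likely to answer the same item correctly.
p_stronger = irt_2pl(theta=1.0, a=1.0, b=0.0)
```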

IMPROVE: Impact of Mobile Phones on Remote Online Virtual Education

Dec 13, 2024

Roberto Daza, Alvaro Becerra, Ruth Cobos, Julian Fierrez, Aythami Morales

Abstract:This work presents the IMPROVE dataset, designed to evaluate the effects of mobile phone usage on learners during online education. The dataset not only assesses academic performance and subjective learner feedback but also captures biometric, behavioral, and physiological signals, providing a comprehensive analysis of the impact of mobile phone use on learning. Multimodal data were collected from 120 learners in three groups with different phone interaction levels. A setup involving 16 sensors was implemented to collect data that have proven to be effective indicators for understanding learner behavior and cognition, including electroencephalography waves, videos, eye tracker, etc. The dataset includes metadata from the processed videos like face bounding boxes, facial landmarks, and Euler angles for head pose estimation. In addition, learner performance data and self-reported forms are included. Phone usage events were labeled, covering both supervisor-triggered and uncontrolled events. A semi-manual re-labeling system, using head pose and eye tracker data, is proposed to improve labeling accuracy. Technical validation confirmed signal quality, with statistical analyses revealing biometric changes during phone use.

* Article under review in the journal Scientific Data. GitHub repository of the dataset at: https://github.com/BiDAlab/IMPROVE

DeepFace-Attention: Multimodal Face Biometrics for Attention Estimation with Application to e-Learning

Aug 14, 2024

Roberto Daza, Luis F. Gomez, Julian Fierrez, Aythami Morales, Ruben Tolosana, Javier Ortega-Garcia

Abstract:This work introduces an innovative method for estimating attention levels (cognitive load) using an ensemble of facial analysis techniques applied to webcam videos. Our method is particularly useful, among others, in e-learning applications, so we trained, evaluated, and compared our approach on the mEBAL2 database, a public multi-modal database acquired in an e-learning environment. mEBAL2 comprises data from 60 users who performed 8 different tasks. These tasks varied in difficulty, leading to changes in their cognitive loads. Our approach adapts state-of-the-art facial analysis technologies to quantify the users' cognitive load in the form of high or low attention. Several behavioral signals and physiological processes related to the cognitive load are used, such as eyeblink, heart rate, facial action units, and head pose, among others. Furthermore, we conduct a study to understand which individual features obtain better results, the most efficient combinations, explore local and global features, and how temporary time intervals affect attention level estimation, among other aspects. We find that global facial features are more appropriate for multimodal systems using score-level fusion, particularly as the temporal window increases. On the other hand, local features are more suitable for fusion through neural network training with score-level fusion approaches. Our method outperforms existing state-of-the-art accuracies using the public mEBAL2 benchmark.

* Article accepted in the IEEE Access journal. Accessible at https://ieeexplore.ieee.org/document/10633208

Visual Attention Analysis in Online Learning

May 31, 2024

Miriam Navarro, Álvaro Becerra, Roberto Daza, Ruth Cobos, Aythami Morales, Julian Fierrez

Abstract:In this paper, we present an approach in the Multimodal Learning Analytics field. Within this approach, we have developed a tool to visualize and analyze eye movement data collected during learning sessions in online courses. The tool is named VAAD (an acronym for Visual Attention Analysis Dashboard). These eye movement data have been gathered using an eye-tracker and subsequently processed and visualized for interpretation. The purpose of the tool is to conduct a descriptive analysis of the data by facilitating its visualization, enabling the identification of differences and learning patterns among various learner populations. Additionally, it integrates a predictive module capable of anticipating learner activities during a learning session. Consequently, VAAD holds the potential to offer valuable insights into online learning behaviors from both descriptive and predictive perspectives.

* Accepted in CEDI 2024 (VII Congreso Español de Informática), A Coruña, Spain

Biometrics and Behavioral Modelling for Detecting Distractions in Online Learning

May 24, 2024

Álvaro Becerra, Javier Irigoyen, Roberto Daza, Ruth Cobos, Aythami Morales, Julian Fierrez, Mutlu Cukurova

Abstract:In this article, we explore computer vision approaches to detect abnormal head pose during e-learning sessions and we introduce a study on the effects of mobile phone usage during these sessions. We utilize behavioral data collected from 120 learners monitored while participating in MOOC learning sessions. Our study focuses on the influence of phone-usage events on behavior and physiological responses, specifically attention, heart rate, and meditation, before, during, and after phone usage. Additionally, we propose an approach for estimating head pose events using images taken by the webcam during the MOOC learning sessions to detect phone-usage events. Our hypothesis suggests that head posture undergoes significant changes when learners interact with a mobile phone, contrasting with the typical behavior seen when learners face a computer during e-learning sessions. We propose an approach designed to detect deviations in head posture from the average observed during a learner's session, operating as a semi-supervised method. This system flags events indicating alterations in head posture for subsequent human review and selection of mobile phone usage occurrences with a sensitivity over 90%.

* Accepted in CEDI 2024 (VII Congreso Español de Informática), A Coruña, Spain
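
The idea of flagging deviations in head posture from a learner's session average can be sketched as follows. This is a minimal illustration assuming a single head-pose angle per frame (for example pitch, in degrees) and a simple threshold of `k` standard deviations from the session mean. The function name, the threshold rule, and the sample values are all assumptions; the abstract does not specify the authors' actual deviation criterion.

```python
from statistics import mean, pstdev


def flag_head_pose_events(angles, k=2.0):
    """Return indices of frames whose angle deviates from the session mean
    by more than k population standard deviations.

    Flagged frames are candidates for human review, not confirmed
    phone-usage events.
    """
    if not angles:
        return []
    mu = mean(angles)
    sigma = pstdev(angles)
    if sigma == 0:
        return []  # No variation in the session, so nothing stands out.
    return [i for i, a in enumerate(angles) if abs(a - mu) > k * sigma]


# Hypothetical session: the learner faces the screen for 20 frames,
# then looks down sharply for 2 frames.
session = [0.0] * 20 + [40.0] * 2
candidates = flag_head_pose_events(session)
```

In this toy session only the last two frames exceed the threshold, so they are the ones passed on for review.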

mEBAL2 Database and Benchmark: Image-based Multispectral Eyeblink Detection

Sep 14, 2023

Roberto Daza, Aythami Morales, Julian Fierrez, Ruben Tolosana, Ruben Vera-Rodriguez

Abstract:This work introduces a new multispectral database and novel approaches for eyeblink detection in RGB and Near-Infrared (NIR) individual images. Our contributed dataset (mEBAL2, multimodal Eye Blink and Attention Level estimation, Version 2) is the largest existing eyeblink database, representing a great opportunity to improve data-driven multispectral approaches for blink detection and related applications (e.g., attention level estimation and presentation attack detection in face biometrics). mEBAL2 includes 21,100 image sequences from 180 different students (more than 2 million labeled images in total) while conducting a number of e-learning tasks of varying difficulty or taking a real course on HTML initiation through the edX MOOC platform. mEBAL2 uses multiple sensors, including two Near-Infrared (NIR) cameras and one RGB camera to capture facial gestures during the execution of the tasks, as well as an Electroencephalogram (EEG) band to get the cognitive activity of the user and blinking events. Furthermore, this work proposes a Convolutional Neural Network architecture as a benchmark for blink detection on mEBAL2 with performances up to 97%. Different training methodologies are implemented using the RGB spectrum, NIR spectrum, and the combination of both to enhance the performance on existing eyeblink detectors. We demonstrate that combining NIR and RGB images during training improves the performance of RGB eyeblink detectors (i.e., detection based only on an RGB image). Finally, the generalization capacity of the proposed eyeblink detectors is validated in wilder and more challenging environments like the HUST-LEBW dataset to show the usefulness of mEBAL2 to train a new generation of data-driven approaches for eyeblink detection.

* This paper is under consideration at Pattern Recognition Letters

M2LADS: A System for Generating MultiModal Learning Analytics Dashboards in Open Education

May 21, 2023

Álvaro Becerra, Roberto Daza, Ruth Cobos, Aythami Morales, Mutlu Cukurova, Julian Fierrez

Abstract:In this article, we present a Web-based System called M2LADS, which supports the integration and visualization of multimodal data recorded in learning sessions in a MOOC in the form of Web-based Dashboards. Based on the edBB platform, the multimodal data gathered contains biometric and behavioral signals including electroencephalogram data to measure learners' cognitive attention, heart rate for affective measures, and visual attention from the video recordings. Additionally, learners' static background data and their learning performance measures are tracked using LOGCE and MOOC tracking logs respectively, and both are included in the Web-based System. M2LADS provides opportunities to capture learners' holistic experience during their interactions with the MOOC, which can in turn be used to improve their learning outcomes through feedback visualizations and interventions, as well as to enhance learning analytics models and improve the open content of the MOOC.

* Accepted in "Workshop on Open Education Resources (OER) of COMPSAC 2023"

MATT: Multimodal Attention Level Estimation for e-learning Platforms

Jan 22, 2023

Roberto Daza, Luis F. Gomez, Aythami Morales, Julian Fierrez, Ruben Tolosana, Ruth Cobos, Javier Ortega-Garcia

Abstract:This work presents a new multimodal system for remote attention level estimation based on multimodal face analysis. Our multimodal approach uses different parameters and signals obtained from the behavior and physiological processes that have been related to modeling cognitive load, such as facial gestures (e.g., blink rate, facial action units) and user actions (e.g., head pose, distance to the camera). The multimodal system uses the following modules based on Convolutional Neural Networks (CNNs): eye blink detection, head pose estimation, facial landmark detection, and facial expression features. First, we individually evaluate the proposed modules in the task of estimating the student's attention level captured during online e-learning sessions. To that end, we trained binary classifiers (high or low attention) based on Support Vector Machines (SVM) for each module. Second, we assess to what extent multimodal score-level fusion improves the attention level estimation. The mEBAL database is used in the experimental framework, a public multi-modal database for attention level estimation obtained in an e-learning environment that contains data from 38 users while conducting several e-learning tasks of variable difficulty (creating changes in student cognitive loads).

* Preprint of the paper presented to the Workshop on Artificial Intelligence for Education (AI4EDU) of AAAI 2023
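
Score-level fusion, as mentioned in the abstract above, combines the output scores of several per-module classifiers into a single decision. The sketch below assumes each module emits a score in [0, 1] (higher meaning higher attention) and fuses them with a weighted average followed by a threshold. The module names, weights, and the 0.5 threshold are hypothetical; the abstract does not state which fusion rule the authors used.

```python
def fuse_scores(scores, weights=None):
    """Weighted average of per-module scores.

    scores:  dict mapping module name to a score in [0, 1]
    weights: optional dict mapping module name to a non-negative weight;
             defaults to equal weights
    """
    if not scores:
        raise ValueError("at least one module score is required")
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * s for name, s in scores.items()) / total


def attention_label(fused_score, threshold=0.5):
    """Map a fused score to a binary label (assumed threshold)."""
    return "high" if fused_score >= threshold else "low"


# Hypothetical per-module scores for one time window.
module_scores = {"blink": 0.2, "head_pose": 0.8}

equal_fusion = fuse_scores(module_scores)
weighted_fusion = fuse_scores(module_scores, {"blink": 1.0, "head_pose": 3.0})
```

Averaging is only one of several common fusion rules; others include the maximum, the product, or a second-stage classifier trained on the score vector.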
