Suleyman Ozdel

Examining the Role of LLM-Driven Interactions on Attention and Cognitive Engagement in Virtual Classrooms

May 12, 2025

Suleyman Ozdel, Can Sarpkaya, Efe Bozkir, Hong Gao, Enkelejda Kasneci

Abstract: Transforming educational technologies through the integration of large language models (LLMs) and virtual reality (VR) offers the potential for immersive and interactive learning experiences. However, the effects of LLMs on user engagement and attention in educational environments remain open questions. In this study, we utilized a fully LLM-driven virtual learning environment, where peers and teachers were LLM-driven, to examine how students behaved in such settings. Specifically, we investigated how peer question-asking behaviors influenced student engagement, attention, cognitive load, and learning outcomes. We found that, in conditions where LLM-driven peer learners asked questions, students exhibited more targeted visual scanpaths, with their attention directed toward the learning content, particularly in complex subjects. Our results suggest that peer questions did not introduce extraneous cognitive load directly, as cognitive load was strongly correlated with increased attention to the learning material. Considering these findings, we provide design recommendations for optimizing VR learning spaces.

* Accepted to EDM 2025 (Eighteenth International Conference on Educational Data Mining)

A Transformer-Based Model for the Prediction of Human Gaze Behavior on Videos

Apr 10, 2024

Suleyman Ozdel, Yao Rong, Berat Mert Albaba, Yen-Ling Kuo, Xi Wang, Enkelejda Kasneci

Abstract: Eye-tracking applications that utilize the human gaze in video understanding tasks have become increasingly important. To effectively automate the process of video analysis based on eye-tracking data, it is important to accurately replicate human gaze behavior. However, this task presents significant challenges due to the inherent complexity and ambiguity of human gaze patterns. In this work, we introduce a novel method for simulating human gaze behavior. Our approach uses a transformer-based reinforcement learning algorithm to train an agent that acts as a human observer, with the primary role of watching videos and simulating human gaze behavior. We employed an eye-tracking dataset gathered from videos generated by the VirtualHome simulator, with a primary focus on activity recognition. Our experimental results demonstrate the effectiveness of our gaze prediction method by highlighting its capability to replicate human gaze behavior and its applicability for downstream tasks where real human gaze is used as input.

* 2024 Symposium on Eye Tracking Research and Applications (ETRA24), Glasgow, United Kingdom

Gaze-Guided Graph Neural Network for Action Anticipation Conditioned on Intention

Apr 10, 2024

Suleyman Ozdel, Yao Rong, Berat Mert Albaba, Yen-Ling Kuo, Xi Wang, Enkelejda Kasneci

Abstract: Humans utilize their gaze to concentrate on essential information while perceiving and interpreting intentions in videos. Incorporating human gaze into computational algorithms can significantly enhance model performance in video understanding tasks. In this work, we address a challenging and innovative task in video understanding: predicting the actions of an agent in a video based on a partial video. We introduce the Gaze-guided Action Anticipation algorithm, which establishes a visual-semantic graph from the video input. Our method utilizes a Graph Neural Network to recognize the agent's intention and predict the action sequence to fulfill this intention. To assess the efficiency of our approach, we collect a dataset containing household activities generated in the VirtualHome environment, accompanied by human gaze data of viewing videos. Our method outperforms state-of-the-art techniques, achieving a 7% improvement in accuracy for 18-class intention recognition. This highlights the efficiency of our method in learning important features from human gaze data.

* 2024 Symposium on Eye Tracking Research and Applications (ETRA24), Glasgow, United Kingdom
