Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

David R. Reich

PoTeC: A German Naturalistic Eye-tracking-while-reading Corpus

Mar 01, 2024

Deborah N. Jakobi, Thomas Kern, David R. Reich, Patrick Haller, Lena A. Jäger

Abstract:The Potsdam Textbook Corpus (PoTeC) is a naturalistic eye-tracking-while-reading corpus containing data from 75 participants reading 12 scientific texts. PoTeC is the first naturalistic eye-tracking-while-reading corpus that contains eye-movements from domain-experts as well as novices in a within-participant manipulation: It is based on a 2x2x2 fully-crossed factorial design which includes the participants' level of study and the participants' discipline of study as between-subject factors and the text domain as a within-subject factor. The participants' reading comprehension was assessed by a series of text comprehension questions and their domain knowledge was tested by text-independent background questions for each of the texts. The materials are annotated for a variety of linguistic features at different levels. We envision PoTeC to be used for a wide range of studies including but not limited to analyses of expert and non-expert reading strategies. The corpus and all the accompanying data at all stages of the preprocessing pipeline and all code used to preprocess the data are made available via GitHub: https://github.com/DiLi-Lab/PoTeC.

Via

Access Paper or Ask Questions

ScanDL: A Diffusion Model for Generating Synthetic Scanpaths on Texts

Oct 24, 2023

Lena S. Bolliger, David R. Reich, Patrick Haller, Deborah N. Jakobi, Paul Prasse, Lena A. Jäger

Figure 1 for ScanDL: A Diffusion Model for Generating Synthetic Scanpaths on Texts

Figure 2 for ScanDL: A Diffusion Model for Generating Synthetic Scanpaths on Texts

Figure 3 for ScanDL: A Diffusion Model for Generating Synthetic Scanpaths on Texts

Figure 4 for ScanDL: A Diffusion Model for Generating Synthetic Scanpaths on Texts

Abstract:Eye movements in reading play a crucial role in psycholinguistic research studying the cognitive mechanisms underlying human language processing. More recently, the tight coupling between eye movements and cognition has also been leveraged for language-related machine learning tasks such as the interpretability, enhancement, and pre-training of language models, as well as the inference of reader- and text-specific properties. However, scarcity of eye movement data and its unavailability at application time poses a major challenge for this line of research. Initially, this problem was tackled by resorting to cognitive models for synthesizing eye movement data. However, for the sole purpose of generating human-like scanpaths, purely data-driven machine-learning-based methods have proven to be more suitable. Following recent advances in adapting diffusion processes to discrete data, we propose ScanDL, a novel discrete sequence-to-sequence diffusion model that generates synthetic scanpaths on texts. By leveraging pre-trained word representations and jointly embedding both the stimulus text and the fixation sequence, our model captures multi-modal interactions between the two inputs. We evaluate ScanDL within- and across-dataset and demonstrate that it significantly outperforms state-of-the-art scanpath generation methods. Finally, we provide an extensive psycholinguistic analysis that underlines the model's ability to exhibit human-like reading behavior. Our implementation is made available at https://github.com/DiLi-Lab/ScanDL.

* EMNLP 2023

Via

Access Paper or Ask Questions

Pre-Trained Language Models Augmented with Synthetic Scanpaths for Natural Language Understanding

Oct 23, 2023

Shuwen Deng, Paul Prasse, David R. Reich, Tobias Scheffer, Lena A. Jäger

Figure 1 for Pre-Trained Language Models Augmented with Synthetic Scanpaths for Natural Language Understanding

Figure 2 for Pre-Trained Language Models Augmented with Synthetic Scanpaths for Natural Language Understanding

Figure 3 for Pre-Trained Language Models Augmented with Synthetic Scanpaths for Natural Language Understanding

Figure 4 for Pre-Trained Language Models Augmented with Synthetic Scanpaths for Natural Language Understanding

Abstract:Human gaze data offer cognitive information that reflects natural language comprehension. Indeed, augmenting language models with human scanpaths has proven beneficial for a range of NLP tasks, including language understanding. However, the applicability of this approach is hampered because the abundance of text corpora is contrasted by a scarcity of gaze data. Although models for the generation of human-like scanpaths during reading have been developed, the potential of synthetic gaze data across NLP tasks remains largely unexplored. We develop a model that integrates synthetic scanpath generation with a scanpath-augmented language model, eliminating the need for human gaze data. Since the model's error gradient can be propagated throughout all parts of the model, the scanpath generator can be fine-tuned to downstream tasks. We find that the proposed model not only outperforms the underlying language model, but achieves a performance that is comparable to a language model augmented with real human gaze data. Our code is publicly available.

* Pre-print for EMNLP 2023

Via

Access Paper or Ask Questions

Eyettention: An Attention-based Dual-Sequence Model for Predicting Human Scanpaths during Reading

Apr 21, 2023

Shuwen Deng, David R. Reich, Paul Prasse, Patrick Haller, Tobias Scheffer, Lena A. Jäger

Abstract:Eye movements during reading offer insights into both the reader's cognitive processes and the characteristics of the text that is being read. Hence, the analysis of scanpaths in reading have attracted increasing attention across fields, ranging from cognitive science over linguistics to computer science. In particular, eye-tracking-while-reading data has been argued to bear the potential to make machine-learning-based language models exhibit a more human-like linguistic behavior. However, one of the main challenges in modeling human scanpaths in reading is their dual-sequence nature: the words are ordered following the grammatical rules of the language, whereas the fixations are chronologically ordered. As humans do not strictly read from left-to-right, but rather skip or refixate words and regress to previous words, the alignment of the linguistic and the temporal sequence is non-trivial. In this paper, we develop Eyettention, the first dual-sequence model that simultaneously processes the sequence of words and the chronological sequence of fixations. The alignment of the two sequences is achieved by a cross-sequence attention mechanism. We show that Eyettention outperforms state-of-the-art models in predicting scanpaths. We provide an extensive within- and across-data set evaluation on different languages. An ablation study and qualitative analysis support an in-depth understanding of the model's behavior.

Via

Access Paper or Ask Questions

Bridging the Gap: Gaze Events as Interpretable Concepts to Explain Deep Neural Sequence Models

Apr 12, 2023

Daniel G. Krakowczyk, Paul Prasse, David R. Reich, Sebastian Lapuschkin, Tobias Scheffer, Lena A. Jäger

Abstract:Recent work in XAI for eye tracking data has evaluated the suitability of feature attribution methods to explain the output of deep neural sequence models for the task of oculomotric biometric identification. These methods provide saliency maps to highlight important input features of a specific eye gaze sequence. However, to date, its localization analysis has been lacking a quantitative approach across entire datasets. In this work, we employ established gaze event detection algorithms for fixations and saccades and quantitatively evaluate the impact of these events by determining their concept influence. Input features that belong to saccades are shown to be substantially more important than features that belong to fixations. By dissecting saccade events into sub-events, we are able to show that gaze samples that are close to the saccadic peak velocity are most influential. We further investigate the effect of event properties like saccadic amplitude or fixational dispersion on the resulting concept influence.

* Preprint for ETRA '23: 2023 Symposium on Eye Tracking Research and Applications

Via

Access Paper or Ask Questions

Detection of ADHD based on Eye Movements during Natural Viewing

Jul 14, 2022

Shuwen Deng, Paul Prasse, David R. Reich, Sabine Dziemian, Maja Stegenwallner-Schütz, Daniel Krakowczyk, Silvia Makowski, Nicolas Langer, Tobias Scheffer, Lena A. Jäger

Figure 1 for Detection of ADHD based on Eye Movements during Natural Viewing

Figure 2 for Detection of ADHD based on Eye Movements during Natural Viewing

Figure 3 for Detection of ADHD based on Eye Movements during Natural Viewing

Figure 4 for Detection of ADHD based on Eye Movements during Natural Viewing

Abstract:Attention-deficit/hyperactivity disorder (ADHD) is a neurodevelopmental disorder that is highly prevalent and requires clinical specialists to diagnose. It is known that an individual's viewing behavior, reflected in their eye movements, is directly related to attentional mechanisms and higher-order cognitive processes. We therefore explore whether ADHD can be detected based on recorded eye movements together with information about the video stimulus in a free-viewing task. To this end, we develop an end-to-end deep learning-based sequence model which we pre-train on a related task for which more data are available. We find that the method is in fact able to detect ADHD and outperforms relevant baselines. We investigate the relevance of the input features in an ablation study. Interestingly, we find that the model's performance is closely related to the content of the video, which provides insights for future experimental designs.

* Pre-print for Proceedings of the European Conference on Machine Learning, 2022

Via

Access Paper or Ask Questions