Abstract:Eye movement prediction is a promising area of research to compensate for the latency introduced by eye-tracking systems in virtual reality devices. In this study, we comprehensively analyze how the complexity of the eye movement prediction task varies across subjects. We use three fundamentally different models within the analysis: the lightweight Long Short-Term Memory network (LSTM), the transformer-based network for multivariate time series representation learning (TST), and the Oculomotor Plant Mathematical Model wrapped in the Kalman Filter framework (OPKF). Each solution is assessed following a sample-to-event evaluation strategy and employing the new event-to-subject metrics. Our results show that the different models maintained similar per-subject trends in prediction performance. We refer to these outcomes as per-subject complexity since some subjects' data pose a more significant challenge for models. Along with the detailed correlation analysis, this report investigates the source of the per-subject complexity and discusses potential solutions to overcome it.
Abstract:This paper performs the crucial work of establishing a baseline for gaze-driven authentication performance to begin answering fundamental research questions using a very large dataset of gaze recordings from 9202 people with a level of eye tracking (ET) signal quality equivalent to modern consumer-facing virtual reality (VR) platforms. The employed dataset is at least an order of magnitude larger than any dataset from previous related work. Binocular estimates of the optical and visual axes of the eyes and a minimum duration for enrollment and verification are required for our model to achieve a false rejection rate (FRR) of below 3% at a false acceptance rate (FAR) of 1 in 50,000. In terms of identification accuracy, which decreases with gallery size, we estimate that our model would fall below chance-level accuracy for gallery sizes of 148,000 or more. Our major findings indicate that gaze authentication can be as accurate as required by the FIDO standard when driven by a state-of-the-art machine learning architecture and a sufficiently large training dataset.
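To make the "FRR at a fixed FAR" operating point concrete, the following is a minimal sketch of how such a figure can be computed from similarity scores. The function name `frr_at_far` and the toy scores are illustrative assumptions and do not come from the paper; the sketch also assumes that a higher score means a better match.

```python
def frr_at_far(genuine, impostor, target_far):
    """Return (threshold, frr) at the lowest threshold whose FAR <= target_far.

    Assumption: scores are similarities, and a comparison is accepted
    when score >= threshold.
    """
    # Candidate thresholds: every observed score, plus one value above the
    # maximum so that "reject everything" (FAR = 0) is always available.
    candidates = sorted(set(genuine) | set(impostor))
    candidates.append(candidates[-1] + 1.0)
    for thr in candidates:
        # FAR: fraction of impostor comparisons that would be accepted.
        far = sum(s >= thr for s in impostor) / len(impostor)
        if far <= target_far:
            # FRR: fraction of genuine comparisons that would be rejected.
            frr = sum(s < thr for s in genuine) / len(genuine)
            return thr, frr


# Toy scores, purely hypothetical.
genuine_scores = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.5, 0.7]
threshold, frr = frr_at_far(genuine_scores, impostor_scores, target_far=0.2)
print(threshold, frr)
```

Because FAR can only decrease as the threshold rises, the first threshold that satisfies the FAR target is also the one with the lowest FRR. Measuring a FAR as small as 1 in 50,000 requires far more impostor comparisons than this toy example contains.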
Abstract:In this work, we tackle the problem of ternary eye movement classification, which aims to separate fixations, saccades and smooth pursuits from the raw eye positional data. The efficient classification of these different types of eye movements helps to better analyze and utilize the eye tracking data. Unlike existing methods that detect eye movements using several predefined threshold values, we propose a hierarchical Hidden Markov Model (HMM) statistical algorithm for detecting fixations, saccades and smooth pursuits. The proposed algorithm leverages different features from the recorded raw eye tracking data with a hierarchical classification strategy, separating one type of eye movement at a time. Experimental results demonstrate the effectiveness and robustness of the proposed method by achieving competitive or better performance compared to the state-of-the-art methods.
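The hierarchical strategy described above can be illustrated with a small sketch: a two-state HMM decoded with the Viterbi algorithm is applied twice, each time splitting off one movement type. This is not the authors' implementation. The feature choices (velocity first, then a dispersion-like feature), the Gaussian emission parameters, the `stay_prob` value, and all function names are hypothetical, and joining the non-saccade samples into one sequence for the second stage is a simplification.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def viterbi_two_state(obs, means, stds, stay_prob=0.95):
    """Most likely 0/1 state path for a two-state HMM with Gaussian emissions."""
    log_stay = math.log(stay_prob)
    log_switch = math.log(1.0 - stay_prob)
    # Uniform prior over the two states.
    score = [math.log(0.5) + gaussian_logpdf(obs[0], means[s], stds[s]) for s in (0, 1)]
    backptr = []
    for x in obs[1:]:
        new_score, ptr = [], []
        for s in (0, 1):
            # Pick the better predecessor for state s.
            stay = score[s] + log_stay
            switch = score[1 - s] + log_switch
            best, prev = (stay, s) if stay >= switch else (switch, 1 - s)
            new_score.append(best + gaussian_logpdf(x, means[s], stds[s]))
            ptr.append(prev)
        score = new_score
        backptr.append(ptr)
    # Backtrack from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def classify_hierarchical(velocity, dispersion):
    """Two-stage labeling; all emission parameters below are hypothetical."""
    # Stage 1: velocity (deg/s) separates saccades from everything else.
    stage1 = viterbi_two_state(velocity, means=(10.0, 300.0), stds=(15.0, 100.0))
    labels = ["saccade" if s == 1 else None for s in stage1]
    # Stage 2: a dispersion-like feature splits the remaining samples.
    rest = [i for i, lab in enumerate(labels) if lab is None]
    if rest:
        stage2 = viterbi_two_state([dispersion[i] for i in rest],
                                   means=(0.2, 1.5), stds=(0.3, 0.5))
        for i, s in zip(rest, stage2):
            labels[i] = "pursuit" if s == 1 else "fixation"
    return labels
```

In practice the emission and transition parameters would be estimated from data rather than fixed by hand, which is the point of replacing predefined thresholds with a statistical model.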
Abstract:The proper classification of the major eye movements (saccades, fixations, and smooth pursuits) remains essential to utilizing eye-tracking data. Separating smooth pursuits from the other behavior types, particularly from fixations, is difficult. To this end, we propose a new offline algorithm, I-VDT-HMM, for ternary classification of eye movements. The algorithm combines the simplicity of two foundational algorithms, I-VT and I-DT, as implemented in I-VDT, with the statistical predictive power of the Viterbi algorithm. We evaluate the fitness across a dataset of eight eye movement records at eight sampling rates gathered from previous research, with a comparison to the current state-of-the-art using the proposed quantitative and qualitative behavioral scores. The proposed algorithm achieves promising results on clean, high-sampling-frequency data and, with slight modifications, could show similar results with lower quality data. However, the statistical aspect of the algorithm comes at the cost of classification time.
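As a rough illustration of the threshold-based stage that I-VDT-HMM builds on, the sketch below combines a velocity threshold (the I-VT idea) with a dispersion threshold (the I-DT idea). It is a simplified stand-in, not the published I-VDT procedure: it uses fixed, non-overlapping windows, and the function name, threshold values, and window length are assumptions chosen for the example. The Viterbi refinement mentioned in the abstract is not included.

```python
import math


def ivdt_sketch(xs, ys, rate_hz, vel_thr=70.0, disp_thr=0.5, window=4):
    """Label each gaze sample as 'saccade', 'fixation', or 'pursuit'.

    xs, ys: gaze positions in degrees; rate_hz: sampling rate.
    All thresholds are illustrative, not recommended values.
    """
    n = len(xs)
    labels = [None] * n
    # Step 1 (velocity threshold): fast samples are saccades.
    for i in range(n):
        j = max(i, 1)  # the first sample reuses the velocity of the first pair
        v = math.hypot(xs[j] - xs[j - 1], ys[j] - ys[j - 1]) * rate_hz
        if v > vel_thr:
            labels[i] = "saccade"
    # Step 2 (dispersion threshold): split the slow samples window by window.
    i = 0
    while i < n:
        if labels[i] == "saccade":
            i += 1
            continue
        # Grow a window of up to `window` consecutive non-saccade samples.
        end = i
        while end < n and labels[end] is None and end - i < window:
            end += 1
        wx, wy = xs[i:end], ys[i:end]
        dispersion = (max(wx) - min(wx)) + (max(wy) - min(wy))
        label = "fixation" if dispersion <= disp_thr else "pursuit"
        for k in range(i, end):
            labels[k] = label
        i = end
    return labels


# Hypothetical 100 Hz trace: still gaze, a fast jump, then slow drift.
xs = [0.00, 0.01, 0.00, 0.01, 5.0, 10.0, 10.3, 10.6, 10.9, 11.2]
ys = [0.0] * len(xs)
print(ivdt_sketch(xs, ys, rate_hz=100))
```

A window that ends up very short, for example a single slow sample between two saccades, has near-zero dispersion and is labeled a fixation here; a fuller implementation would handle such cases explicitly.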
Abstract:We built a custom video-based eye-tracker that saves every video frame as a full resolution image (MJPEG). Images can be processed offline for the detection of ocular features, including the pupil and corneal reflection (First Purkinje Image, P1) position. A comparison of multiple algorithms for detection of pupil and corneal reflection can be performed. The system provides for highly flexible stimulus creation, with mixing of graphic, image, and video stimuli. We can change cameras and infrared illuminators depending on the image qualities and frame rate desired. Using this system, we have detected the position of the Fourth Purkinje image (P4) in the frames. We show that when we estimate gaze by calculating P1-P4, the signal compares well with gaze estimated with a DPI eye-tracker, which natively detects and tracks P1 and P4.
Abstract:We create synthetic biometric databases to study general, fundamental, biometric principles. First, we check the validity of the synthetic database design by comparing it to real data in terms of biometric performance. The real data used for this validity check was from an eye-movement related biometric database. Next, we employ our database to evaluate the impact of variations of temporal persistence of features on biometric performance. We index temporal persistence with the intraclass correlation coefficient (ICC). We find that variations in temporal persistence are very highly correlated with variations in biometric performance. Finally, we use our synthetic database strategy to determine how many features are required to achieve particular levels of performance as the number of subjects in the database increases from 100 to 10,000. An important finding is that the number of features required to achieve various EER values (2%, 0.3%, 0.15%) is essentially constant across the database sizes that we studied. We hypothesize that the insights obtained from our study would be applicable to many biometric modalities where extracted feature properties resemble the properties of the synthetic features we discuss in this work.
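One way to picture a synthetic feature with a chosen temporal persistence is sketched below: each subject gets a stable component plus session-specific noise, mixed so that the expected ICC equals a target value, and the ICC is then estimated from the generated data. The generation scheme, the choice of the one-way random-effects form ICC(1,1), and the function names are assumptions made for illustration; the paper may use a different ICC variant and a different construction.

```python
import math
import random


def icc_oneway(data):
    """One-way random-effects ICC(1,1) for data[subject][session].

    Assumes every subject has the same number of sessions (k >= 2).
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    means = [sum(row) / k for row in data]
    # Between-subject and within-subject mean squares.
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2 for row, m in zip(data, means) for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


def synthetic_feature(n_subjects, n_sessions, target_icc, rng):
    """Draw one feature whose population ICC equals target_icc."""
    data = []
    for _ in range(n_subjects):
        subject_effect = rng.gauss(0, 1)  # stable across sessions
        row = [math.sqrt(target_icc) * subject_effect
               + math.sqrt(1 - target_icc) * rng.gauss(0, 1)  # session noise
               for _ in range(n_sessions)]
        data.append(row)
    return data


rng = random.Random(0)
feature = synthetic_feature(n_subjects=2000, n_sessions=2, target_icc=0.7, rng=rng)
print(round(icc_oneway(feature), 3))
```

With unit-variance components, the subject effect contributes variance `target_icc` and the noise contributes `1 - target_icc`, so the ratio of between-subject variance to total variance is the target ICC by construction; the printed estimate should fall close to it for a sample of this size.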
Abstract:This work presents a study on the extraction and analysis of a set of 101 categories of eye movement features from three types of eye movement events: fixations, saccades, and post-saccadic oscillations. The eye movements were recorded during a reading task. For the categories of features with multiple instances in a recording, we extract corresponding feature subtypes by calculating descriptive statistics on the distributions of these instances. A unified framework of detailed descriptions and mathematical formulas is provided for the extraction of the feature set. The analysis of feature values is performed using a large database of eye movement recordings from a normative population of 298 subjects. We demonstrate the central tendency and overall variability of feature values over the experimental population, and more importantly, we quantify the test-retest reliability (repeatability) of each separate feature. The described methods and analysis can provide valuable tools in fields that study eye movements, such as behavioral studies, attention and cognition research, medical research, biometric recognition, and human-computer interaction.
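The step of turning many instances of one feature category into per-recording subtypes can be sketched as follows. The particular statistics chosen here (mean, median, standard deviation, interquartile range) and the function name are assumptions for illustration; the abstract does not list which descriptive statistics the authors use.

```python
import statistics


def feature_subtypes(instances):
    """Summarize the instances of one feature category within a recording.

    instances: e.g. all fixation durations (ms) in one recording.
    Requires at least two values.
    """
    q1, _, q3 = statistics.quantiles(instances, n=4)  # quartile cut points
    return {
        "mean": statistics.mean(instances),
        "median": statistics.median(instances),
        "sd": statistics.stdev(instances),  # sample standard deviation
        "iqr": q3 - q1,
    }


# Hypothetical fixation durations in milliseconds.
print(feature_subtypes([200, 220, 240, 260, 280]))
```

Each returned value then acts as a separate feature for the recording, so a single category such as fixation duration yields several subtypes.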