Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Reynold Bailey

Using Deep Learning to Increase Eye-Tracking Robustness, Accuracy, and Precision in Virtual Reality

Mar 28, 2024

Kevin Barkevich, Reynold Bailey, Gabriel J. Diaz

Figure 1 for Using Deep Learning to Increase Eye-Tracking Robustness, Accuracy, and Precision in Virtual Reality

Figure 2 for Using Deep Learning to Increase Eye-Tracking Robustness, Accuracy, and Precision in Virtual Reality

Figure 3 for Using Deep Learning to Increase Eye-Tracking Robustness, Accuracy, and Precision in Virtual Reality

Figure 4 for Using Deep Learning to Increase Eye-Tracking Robustness, Accuracy, and Precision in Virtual Reality

Abstract:Algorithms for the estimation of gaze direction from mobile and video-based eye trackers typically involve tracking a feature of the eye that moves through the eye camera image in a way that covaries with the shifting gaze direction, such as the center or boundaries of the pupil. Tracking these features using traditional computer vision techniques can be difficult due to partial occlusion and environmental reflections. Although recent efforts to use machine learning (ML) for pupil tracking have demonstrated superior results when evaluated using standard measures of segmentation performance, little is known of how these networks may affect the quality of the final gaze estimate. This work provides an objective assessment of the impact of several contemporary ML-based methods for eye feature tracking when the subsequent gaze estimate is produced using either feature-based or model-based methods. Metrics include the accuracy and precision of the gaze estimate, as well as drop-out rate.

* 16 pages, 10 figures, accepted to ETRA 2024 Full Papers

Via

Access Paper or Ask Questions

Deep Domain Adaptation: A Sim2Real Neural Approach for Improving Eye-Tracking Systems

Mar 23, 2024

Viet Dung Nguyen, Reynold Bailey, Gabriel J. Diaz, Chengyi Ma, Alexander Fix, Alexander Ororbia

Abstract:Eye image segmentation is a critical step in eye tracking that has great influence over the final gaze estimate. Segmentation models trained using supervised machine learning can excel at this task, their effectiveness is determined by the degree of overlap between the narrow distributions of image properties defined by the target dataset and highly specific training datasets, of which there are few. Attempts to broaden the distribution of existing eye image datasets through the inclusion of synthetic eye images have found that a model trained on synthetic images will often fail to generalize back to real-world eye images. In remedy, we use dimensionality-reduction techniques to measure the overlap between the target eye images and synthetic training data, and to prune the training dataset in a manner that maximizes distribution overlap. We demonstrate that our methods result in robust, improved performance when tackling the discrepancy between simulation and real-world data samples.

* 14 pages, 8 figures, accepted to ETRA 2024

Via

Access Paper or Ask Questions

A Neural Active Inference Model of Perceptual-Motor Learning

Nov 16, 2022

Zhizhuo Yang, Gabriel J. Diaz, Brett R. Fajen, Reynold Bailey, Alexander Ororbia

Figure 1 for A Neural Active Inference Model of Perceptual-Motor Learning

Figure 2 for A Neural Active Inference Model of Perceptual-Motor Learning

Figure 3 for A Neural Active Inference Model of Perceptual-Motor Learning

Figure 4 for A Neural Active Inference Model of Perceptual-Motor Learning

Abstract:The active inference framework (AIF) is a promising new computational framework grounded in contemporary neuroscience that can produce human-like behavior through reward-based learning. In this study, we test the ability for the AIF to capture the role of anticipation in the visual guidance of action in humans through the systematic investigation of a visual-motor task that has been well-explored -- that of intercepting a target moving over a ground plane. Previous research demonstrated that humans performing this task resorted to anticipatory changes in speed intended to compensate for semi-predictable changes in target speed later in the approach. To capture this behavior, our proposed "neural" AIF agent uses artificial neural networks to select actions on the basis of a very short term prediction of the information about the task environment that these actions would reveal along with a long-term estimate of the resulting cumulative expected free energy. Systematic variation revealed that anticipatory behavior emerged only when required by limitations on the agent's movement capabilities, and only when the agent was able to estimate accumulated free energy over sufficiently long durations into the future. In addition, we present a novel formulation of the prior function that maps a multi-dimensional world-state to a uni-dimensional distribution of free-energy. Together, these results demonstrate the use of AIF as a plausible model of anticipatory visually guided behavior in humans.

* 16 pages including references, 6 figures. Submitted to Frontiers in Computational Neuroscience

Via

Access Paper or Ask Questions

RITnet: Real-time Semantic Segmentation of the Eye for Gaze Tracking

Oct 01, 2019

Aayush K. Chaudhary, Rakshit Kothari, Manoj Acharya, Shusil Dangi, Nitinraj Nair, Reynold Bailey, Christopher Kanan, Gabriel Diaz, Jeff B. Pelz

Figure 1 for RITnet: Real-time Semantic Segmentation of the Eye for Gaze Tracking

Figure 2 for RITnet: Real-time Semantic Segmentation of the Eye for Gaze Tracking

Figure 3 for RITnet: Real-time Semantic Segmentation of the Eye for Gaze Tracking

Figure 4 for RITnet: Real-time Semantic Segmentation of the Eye for Gaze Tracking

Abstract:Accurate eye segmentation can improve eye-gaze estimation and support interactive computing based on visual attention; however, existing eye segmentation methods suffer from issues such as person-dependent accuracy, lack of robustness, and an inability to be run in real-time. Here, we present the RITnet model, which is a deep neural network that combines U-Net and DenseNet. RITnet is under 1 MB and achieves 95.3\% accuracy on the 2019 OpenEDS Semantic Segmentation challenge. Using a GeForce GTX 1080 Ti, RITnet tracks at $>$ 300Hz, enabling real-time gaze tracking applications. Pre-trained models and source code are available https://bitbucket.org/eye-ush/ritnet/.

* This model is the winning submission for OpenEDS Semantic Segmentation Challenge for Eye images https://research.fb.com/programs/openeds-challenge/. To appear in ICCVW 2019. ("Pre-trained models and source code are available https://bitbucket.org/eye-ush/ritnet/.")

Via

Access Paper or Ask Questions

Gaze-in-wild: A dataset for studying eye and head coordination in everyday activities

May 09, 2019

Rakshit Kothari, Zhizhuo Yang, Christopher Kanan, Reynold Bailey, Jeff Pelz, Gabriel Diaz

Figure 1 for Gaze-in-wild: A dataset for studying eye and head coordination in everyday activities

Figure 2 for Gaze-in-wild: A dataset for studying eye and head coordination in everyday activities

Figure 3 for Gaze-in-wild: A dataset for studying eye and head coordination in everyday activities

Figure 4 for Gaze-in-wild: A dataset for studying eye and head coordination in everyday activities

Abstract:The interaction between the vestibular and ocular system has primarily been studied in controlled environments. Consequently, off-the shelf tools for categorization of gaze events (e.g. fixations, pursuits, saccade) fail when head movements are allowed. Our approach was to collect a novel, naturalistic, and multimodal dataset of eye+head movements when subjects performed everyday tasks while wearing a mobile eye tracker equipped with an inertial measurement unit and a 3D stereo camera. This Gaze-in-the-Wild dataset (GW) includes eye+head rotational velocities (deg/s), infrared eye images and scene imagery (RGB+D). A portion was labelled by coders into gaze motion events with a mutual agreement of 0.72 sample based Cohen's $\kappa$. This labelled data was used to train and evaluate two machine learning algorithms, Random Forest and a Recurrent Neural Network model, for gaze event classification. Assessment involved the application of established and novel event based performance metrics. Classifiers achieve $\sim$90$\%$ human performance in detecting fixations and saccades but fall short (60$\%$) on detecting pursuit movements. Moreover, pursuit classification is far worse in the absence of head movement information. A subsequent analysis of feature significance in our best-performing model revealed a reliance upon absolute eye and head velocity, indicating that classification does not require spatial alignment of the head and eye tracking coordinate systems. The GW dataset, trained classifiers and evaluation metrics will be made publicly available with the intention of facilitating growth in the emerging area of head-free gaze event classification.

* 23 pages, 11 figures, 10 tables, Dataset can be found at http://www.cis.rit.edu/~rsk3900/

Via

Access Paper or Ask Questions

Differential Privacy for Eye-Tracking Data

Apr 15, 2019

Ao Liu, Lirong Xia, Andrew Duchowski, Reynold Bailey, Kenneth Holmqvist, Eakta Jain

Figure 1 for Differential Privacy for Eye-Tracking Data

Figure 2 for Differential Privacy for Eye-Tracking Data

Figure 3 for Differential Privacy for Eye-Tracking Data

Figure 4 for Differential Privacy for Eye-Tracking Data

Abstract:As large eye-tracking datasets are created, data privacy is a pressing concern for the eye-tracking community. De-identifying data does not guarantee privacy because multiple datasets can be linked for inferences. A common belief is that aggregating individuals' data into composite representations such as heatmaps protects the individual. However, we analytically examine the privacy of (noise-free) heatmaps and show that they do not guarantee privacy. We further propose two noise mechanisms that guarantee privacy and analyze their privacy-utility tradeoff. Analysis reveals that our Gaussian noise mechanism is an elegant solution to preserve privacy for heatmaps. Our results have implications for interdisciplinary research to create differentially private mechanisms for eye tracking.

* 10 pages including appendix

Via

Access Paper or Ask Questions