Abstract:Multiple benchmarks have been developed to assess the alignment between deep neural networks (DNNs) and human vision. In almost all cases these benchmarks are observational in the sense they are composed of behavioural and brain responses to naturalistic images that have not been manipulated to test hypotheses regarding how DNNs or humans perceive and identify objects. Here we introduce the toolbox MindSet: Vision, consisting of a collection of image datasets and related scripts designed to test DNNs on 30 psychological findings. In all experimental conditions, the stimuli are systematically manipulated to test specific hypotheses regarding human visual perception and object recognition. In addition to providing pre-generated datasets of images, we provide code to regenerate these datasets, offering many configurable parameters which greatly extend the dataset versatility for different research contexts, and code to facilitate the testing of DNNs on these image datasets using three different methods (similarity judgments, out-of-distribution classification, and decoder method), accessible at https://github.com/MindSetVision/mindset-vision. We test ResNet-152 on each of these methods as an example of how the toolbox can be used.
Abstract:The following is a dissertation aimed at understanding what the various phenomena in visual search teach us about the nature of human visual representations and processes. I first review some of the major empirical findings in the study of visual search. I next present a theory of visual search in terms of what I believe these findings suggest about the representations and processes underlying ventral visual processing. These principles are instantiated in a computational model called CASPER (Concurrent Attention: Serial and Parallel Evaluation with Relations), originally developed by Hummel, that I have adapted to account for a range of phenomena in visual search. I then describe an extension of the CASPER model to account for our ability to search for visual items defined not simply by the features composing those items but by the spatial relations among those features. Seven experiments (four main experiments and three replications) are described that test CASPER's predictions about relational search. Finally, I evaluate the fit between CASPER's predictions and the empirical findings and show with three additional simulations that CASPER can account for negative acceleration in search functions for relational stimuli if one postulates that the visual system is leveraging an emergent feature that bypasses relational processing.