Katherine R. Storrs

Visual Language Models show widespread visual deficits on neuropsychological tests

Apr 16, 2025

Gene Tangtartharakul, Katherine R. Storrs

Abstract: Visual Language Models (VLMs) show remarkable performance in visual reasoning tasks, successfully tackling college-level challenges that require high-level understanding of images. However, some recent reports of VLMs struggling to reason about elemental visual concepts like orientation, position, continuity, and occlusion suggest a potential gulf between human and VLM vision. Here we use the toolkit of neuropsychology to systematically assess the capabilities of three state-of-the-art VLMs across visual domains. Using 51 tests drawn from six clinical and experimental batteries, we characterise the visual abilities of leading VLMs relative to normative performance in healthy adults. While the models excel in straightforward object recognition tasks, we find widespread deficits in low- and mid-level visual abilities that would be considered clinically significant in humans. These selective deficits, profiled through validated test batteries, suggest that an artificial system can achieve complex object recognition without developing foundational visual concepts that in humans require no explicit training.

* 31 pages, 3 figures, 1 supplementary document with 1 figure and 51 sample images; corrected typo in Fig 1

Predicting Perceived Gloss: Do Weak Labels Suffice?

Mar 26, 2024

Julia Guerrero-Viu, J. Daniel Subias, Ana Serrano, Katherine R. Storrs, Roland W. Fleming, Belen Masia, Diego Gutierrez

Abstract: Estimating perceptual attributes of materials directly from images is a challenging task due to their complex, not fully understood interactions with external factors, such as geometry and lighting. Supervised deep learning models have recently been shown to outperform traditional approaches, but rely on large datasets of human-annotated images for accurate perception predictions. Obtaining reliable annotations is a costly endeavor, aggravated by the limited ability of these models to generalise to different aspects of appearance. In this work, we show how a much smaller set of human annotations ("strong labels") can be effectively augmented with automatically derived "weak labels" in the context of learning a low-dimensional image-computable gloss metric. We evaluate three alternative weak labels for predicting human gloss perception from limited annotated data. Incorporating weak labels enhances our gloss prediction beyond the current state of the art. Moreover, it enables a substantial reduction in human annotation costs without sacrificing accuracy, whether working with rendered images or real photographs.

* Computer Graphics Forum (Eurographics 2024)

Deep Learning for Cognitive Neuroscience

Mar 04, 2019

Katherine R. Storrs, Nikolaus Kriegeskorte

Abstract: Neural network models can now recognise images, understand text, translate languages, and play many human games at human or superhuman levels. These systems are highly abstracted, but are inspired by biological brains and use only biologically plausible computations. In the coming years, neural networks are likely to become less reliant on learning from massive labelled datasets, and more robust and generalisable in their task performance. From their successes and failures, we can learn about the computational requirements of the different tasks at which brains excel. Deep learning also provides the tools for testing cognitive theories. In order to test a theory, we need to realise the proposed information-processing system at scale, so as to be able to assess its feasibility and emergent behaviours. Deep learning allows us to scale up from principles and circuit models to end-to-end trainable models capable of performing complex tasks. There are many levels at which cognitive neuroscientists can use deep learning in their work, from inspiring theories to serving as full computational models. Ongoing advances in deep learning bring us closer to understanding how cognition and perception may be implemented in the brain -- the grand challenge at the core of cognitive neuroscience.

* Chapter to appear in The Cognitive Neurosciences, 6th Edition
