Sissi
Abstract:Multimodal Large Language Models (MLLMs) are increasingly used to interpret visualizations, yet little is known about why they fail. We present the first systematic analysis of barriers to visualization literacy in MLLMs. Using the regenerated Visualization Literacy Assessment Test (reVLAT) benchmark with synthetic data, we open-coded 309 erroneous responses from four state-of-the-art models with a barrier-centric strategy adapted from human visualization literacy research. Our analysis yields a taxonomy of MLLM failures, revealing two machine-specific barriers that extend prior human-participation frameworks. Results show that models perform well on simple charts but struggle with color-intensive, segment-based visualizations, often failing to form consistent comparative reasoning. Our findings inform future evaluation and design of reliable AI-driven visualization assistants.
Abstract:Alzheimer's Disease (AD) causes a continuous decline in memory, thinking, and judgment. Traditional diagnoses are usually based on clinical experience, which is limited by some realistic factors. In this paper, we focus on exploiting deep learning techniques to diagnose AD based on eye-tracking behaviors. Visual attention, as typical eye-tracking behavior, is of great clinical value to detect cognitive abnormalities in AD patients. To better analyze the differences in visual attention between AD patients and normals, we first conduct a 3D comprehensive visual task on a non-invasive eye-tracking system to collect visual attention heatmaps. We then propose a multi-layered comparison convolution neural network (MC-CNN) to distinguish the visual attention differences between AD patients and normals. In MC-CNN, the multi-layered representations of heatmaps are obtained by hierarchical convolution to better encode eye-movement behaviors, which are further integrated into a distance vector to benefit the comprehensive visual task. Extensive experimental results on the collected dataset demonstrate that MC-CNN achieves consistent validity in classifying AD patients and normals with eye-tracking data.