Abstract:Deep neural networks perform well in object recognition, but do they perceive objects like humans? This study investigates the Gestalt principle of closure in convolutional neural networks. We propose a protocol to identify closure and conduct experiments using simple visual stimuli with progressively removed edge sections. We evaluate well-known networks on their ability to classify incomplete polygons. Our findings reveal a performance degradation as the edge removal percentage increases, indicating that current models heavily rely on complete edge information for accurate classification. The data used in our study is available on Github.
Abstract:Assessing the aesthetic quality of artistic images presents unique challenges due to the subjective nature of aesthetics and the complex visual characteristics inherent to artworks. Basic data augmentation techniques commonly applied to natural images in computer vision may not be suitable for art images in aesthetic evaluation tasks, as they can change the composition of the art images. In this paper, we explore the impact of local and global data augmentation techniques on artistic image aesthetic assessment (IAA). We introduce BackFlip, a local data augmentation technique designed specifically for artistic IAA. We evaluate the performance of BackFlip across three artistic image datasets and four neural network architectures, comparing it with the commonly used data augmentation techniques. Then, we analyze the effects of components within the BackFlip pipeline through an ablation study. Our findings demonstrate that local augmentations, such as BackFlip, tend to outperform global augmentations on artistic IAA in most cases, probably because they do not perturb the composition of the art images. These results emphasize the importance of considering both local and global augmentations in future computational aesthetics research.
Abstract:The human brain has an inherent ability to fill in gaps to perceive figures as complete wholes, even when parts are missing or fragmented. This phenomenon is known as Closure in psychology, one of the Gestalt laws of perceptual organization, explaining how the human brain interprets visual stimuli. Given the importance of Closure for human object recognition, we investigate whether neural networks rely on a similar mechanism. Exploring this crucial human visual skill in neural networks has the potential to highlight their comparability to humans. Recent studies have examined the Closure effect in neural networks. However, they typically focus on a limited selection of Convolutional Neural Networks (CNNs) and have not reached a consensus on their capability to perform Closure. To address these gaps, we present a systematic framework for investigating the Closure principle in neural networks. We introduce well-curated datasets designed to test for Closure effects, including both modal and amodal completion. We then conduct experiments on various CNNs employing different measurements. Our comprehensive analysis reveals that VGG16 and DenseNet-121 exhibit the Closure effect, while other CNNs show variable results. We interpret these findings by blending insights from psychology and neural network research, offering a unique perspective that enhances transparency in understanding neural networks. Our code and dataset will be made available on GitHub.