Title: Zoom is what you need: An empirical study of the power of zoom and spatial biases in image classification

Apr 11, 2023

Mohammad Reza Taesiri, Giang Nguyen, Sarra Habchi, Cor-Paul Bezemer, Anh Nguyen

Abstract: Image classifiers are information-discarding machines, by design. Yet, how these models discard information remains mysterious. We hypothesize that one way for image classifiers to reach high accuracy is to first zoom to the most discriminative region in the image and then extract features from there to predict image labels. We study six popular networks ranging from AlexNet to CLIP and find that proper framing of the input image can lead to the correct classification of 98.91% of ImageNet images. Furthermore, we explore the potential and limits of zoom transforms in image classification and uncover positional biases in various datasets, especially a strong center bias in two popular datasets: ImageNet-A and ObjectNet. Finally, leveraging our insights into the potential of zoom, we propose a state-of-the-art test-time augmentation (TTA) technique that improves classification accuracy by forcing models to explicitly perform zoom-in operations before making predictions. Our method is more interpretable, accurate, and faster than MEMO, a state-of-the-art TTA method. Additionally, we propose ImageNet-Hard, a new benchmark where zooming in alone often does not help state-of-the-art models better label images.
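
To make the idea of zoom-based test-time augmentation concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not specify the zoom scales, the anchor layout, or how predictions are combined, so the 3x3 anchor grid, the scale set, the nearest-neighbor resize, and the probability averaging below are all assumptions chosen for illustration. The names `zoom_crop`, `zoom_tta_predict`, and `classify` are hypothetical; `classify` stands for any function that maps an image array to a vector of class probabilities.

```python
import numpy as np


def zoom_crop(image, center, scale, out_size):
    """Crop a window around `center` and resize it to out_size x out_size.

    `center` is (y, x) in relative coordinates in [0, 1]; `scale` >= 1 is the
    zoom factor, so the window covers 1/scale of each image side.
    Resizing uses nearest-neighbor sampling to keep the sketch dependency-free.
    """
    h, w = image.shape[:2]
    crop_h = max(1, int(round(h / scale)))
    crop_w = max(1, int(round(w / scale)))
    cy = int(round(center[0] * (h - 1)))
    cx = int(round(center[1] * (w - 1)))
    # Clamp the window so it stays fully inside the image.
    top = min(max(cy - crop_h // 2, 0), h - crop_h)
    left = min(max(cx - crop_w // 2, 0), w - crop_w)
    crop = image[top:top + crop_h, left:left + crop_w]
    # Nearest-neighbor index maps from output pixels to crop pixels.
    rows = np.arange(out_size) * crop_h // out_size
    cols = np.arange(out_size) * crop_w // out_size
    return crop[rows][:, cols]


def zoom_tta_predict(image, classify, scales=(1.0, 1.5, 2.0), grid=3, out_size=224):
    """Average class probabilities over zoomed views (assumed aggregation rule).

    Views are taken at every combination of zoom scale and anchor point on a
    grid x grid lattice. Returns (predicted_label, mean_probabilities).
    """
    anchors = np.linspace(0.0, 1.0, grid)
    probs = [
        classify(zoom_crop(image, (ay, ax), s, out_size))
        for s in scales
        for ay in anchors
        for ax in anchors
    ]
    mean_probs = np.mean(probs, axis=0)
    return int(np.argmax(mean_probs)), mean_probs
```

With a real model, `classify` would wrap preprocessing and a forward pass; averaging is only one possible aggregation, and a max over views or a learned weighting would fit the same interface.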
