Title: Probing the Mid-level Vision Capabilities of Self-Supervised Learning

Nov 25, 2024

Xuweiyi Chen, Markus Marks, Zezhou Cheng

Abstract: Mid-level vision capabilities, such as generic object localization and 3D geometric understanding, are not only fundamental to human vision but are also crucial for many real-world applications of computer vision. These abilities emerge with minimal supervision during the early stages of human visual development. Despite their significance, current self-supervised learning (SSL) approaches are primarily designed and evaluated for high-level recognition tasks, leaving their mid-level vision capabilities largely unexamined. In this study, we introduce a suite of benchmark protocols to systematically assess mid-level vision capabilities and present a comprehensive, controlled evaluation of 22 prominent SSL models across 8 mid-level vision tasks. Our experiments reveal a weak correlation between mid-level and high-level task performance. We also identify several SSL methods with highly imbalanced performance across mid-level and high-level capabilities, as well as some that excel in both. Additionally, we investigate key factors contributing to mid-level vision performance, such as pretraining objectives and network architectures. Our study provides a holistic and timely view of what SSL models have learned, complementing existing research that primarily focuses on high-level vision tasks. We hope our findings guide future SSL research to benchmark models not only on high-level vision tasks but also on mid-level ones.

* Project Page: https://midvision-probe.cs.virginia.edu/
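
The abstract reports a weak correlation between mid-level and high-level task performance but does not say how that correlation is measured. As a rough illustration only, the sketch below computes a Spearman rank correlation between two lists of per-model scores. The choice of Spearman, the function names, and the score values are all assumptions made for this example; the numbers are placeholders and are not results from the paper.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Placeholder scores for five hypothetical models (NOT from the paper).
high_level = [70.0, 75.0, 80.0, 72.0, 78.0]
mid_level = [55.0, 50.0, 58.0, 60.0, 52.0]

rho = spearman(high_level, mid_level)
print(f"Spearman rho = {rho:.2f}")
```

A value of rho near zero would mean that ranking models by high-level accuracy says little about their ranking on mid-level tasks, which is the kind of relationship the abstract describes.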
