Title: Self-Supervised Learning for Visual Relationship Detection through Masked Bounding Box Reconstruction

Nov 08, 2023

Zacharias Anastasakis, Dimitrios Mallis, Markos Diomataris, George Alexandridis, Stefanos Kollias, Vassilis Pitsikalis


Abstract: We present a novel self-supervised approach for representation learning, particularly for the task of Visual Relationship Detection (VRD). Motivated by the effectiveness of Masked Image Modeling (MIM), we propose Masked Bounding Box Reconstruction (MBBR), a variation of MIM where a percentage of the entities/objects within a scene are masked and subsequently reconstructed based on the unmasked objects. The core idea is that, through object-level masked modeling, the network learns context-aware representations that capture the interaction of objects within a scene and thus are highly predictive of visual object relationships. We extensively evaluate learned representations, both qualitatively and quantitatively, in a few-shot setting and demonstrate the efficacy of MBBR for learning robust visual representations, particularly tailored for VRD. The proposed method is able to surpass state-of-the-art VRD methods on the Predicate Detection (PredDet) evaluation setting, using only a few annotated samples. We make our code available at https://github.com/deeplab-ai/SelfSupervisedVRD.

* Camera Ready paper version of WACV 2024
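
The object-level masking idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). The function names, the 50% mask ratio, the zero-vector mask token, the mean-of-context "predictor" standing in for a trained network, and the mean-squared-error loss are all assumptions chosen for illustration only.

```python
import random


def mask_objects(features, mask_ratio, mask_token, rng):
    """Replace a fraction of per-object feature vectors with a mask token.

    `features` is a list of equal-length float lists, one per detected object.
    Returns the corrupted copy and the sorted indices that were masked.
    (Hypothetical helper; the ratio and token are illustrative assumptions.)
    """
    n = len(features)
    if n < 2:
        raise ValueError("need at least two objects so one can stay visible")
    # Mask at least one object, but always leave at least one visible.
    n_masked = max(1, min(n - 1, round(n * mask_ratio)))
    masked_idx = sorted(rng.sample(range(n), n_masked))
    masked_set = set(masked_idx)
    corrupted = [
        list(mask_token) if i in masked_set else list(f)
        for i, f in enumerate(features)
    ]
    return corrupted, masked_idx


def mean_context_predictor(corrupted, masked_idx):
    """Stand-in for a learned network (assumption): predict every masked
    object as the element-wise mean of the visible objects."""
    masked_set = set(masked_idx)
    visible = [f for i, f in enumerate(corrupted) if i not in masked_set]
    dim = len(visible[0])
    mean = [sum(f[d] for f in visible) / len(visible) for d in range(dim)]
    return [
        list(mean) if i in masked_set else list(f)
        for i, f in enumerate(corrupted)
    ]


def reconstruction_loss(predicted, target, masked_idx):
    """Mean squared error computed over the masked objects only."""
    total, count = 0.0, 0
    for i in masked_idx:
        for p, t in zip(predicted[i], target[i]):
            total += (p - t) ** 2
            count += 1
    return total / count


if __name__ == "__main__":
    rng = random.Random(0)
    # Four toy "objects", each with a 3-dimensional feature vector.
    scene = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9], [1.0, 1.1, 1.2]]
    corrupted, idx = mask_objects(scene, 0.5, [0.0, 0.0, 0.0], rng)
    predicted = mean_context_predictor(corrupted, idx)
    print("masked objects:", idx)
    print("loss:", reconstruction_loss(predicted, scene, idx))
```

In a real training setup the mean predictor would be replaced by a trainable model that attends to the unmasked objects, and the loss would be minimized by gradient descent; the sketch only shows the data flow of masking, predicting, and scoring the masked positions.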
