Weizeng Lu

Selective Multi-Scale Learning for Object Detection

Jun 16, 2022

Junliang Chen, Weizeng Lu, Linlin Shen

Abstract: Pyramidal networks are standard methods for multi-scale object detection. Current research on feature pyramid networks usually adopts layer connections to collect features from certain levels of the feature hierarchy, and does not consider the significant differences among those levels. We propose a better architecture for feature pyramid networks, named selective multi-scale learning (SMSL), to address this issue. SMSL is efficient and general: it can be integrated into both single-stage and two-stage detectors to boost detection performance, with nearly no extra inference cost. RetinaNet combined with SMSL obtains a 1.8% improvement in AP (from 39.1% to 40.9%) on the COCO dataset. When integrated with SMSL, two-stage detectors gain around 1.0% in AP.

* Accepted by ICANN 2021

Online Refinement of Low-level Feature Based Activation Map for Weakly Supervised Object Localization

Oct 12, 2021

Jinheng Xie, Cheng Luo, Xiangping Zhu, Ziqi Jin, Weizeng Lu, Linlin Shen

Abstract: We present a two-stage learning framework for weakly supervised object localization (WSOL). While most previous efforts rely on CAMs (Class Activation Maps) built from high-level features, this paper proposes to localize objects using activation maps built from low-level features. In the first stage, an activation map generator produces activation maps from the low-level feature maps in the classifier, such that rich contextual object information is included in an online manner. In the second stage, we employ an evaluator to evaluate the activation maps predicted by the activation map generator. Based on this, we further propose a weighted entropy loss, attentive erasing, and an area loss to drive the activation map generator to substantially reduce the uncertainty of activations between object and background, and to explore less discriminative regions. Based on the low-level object information preserved in the first stage, the second-stage model gradually generates a well-separated, complete, and compact activation map of the object in the image, which can be easily thresholded for accurate localization. Extensive experiments on the CUB-200-2011 and ImageNet-1K datasets show that our framework surpasses previous methods by a large margin, setting a new state of the art for WSOL.
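The final step mentioned in the abstract, thresholding an activation map to obtain a location, can be illustrated with a minimal sketch. This is not the authors' code: the function name `threshold_to_box`, the relative threshold of 0.5, and the choice to box all above-threshold cells (rather than, say, the largest connected component) are assumptions made for illustration only.

```python
def threshold_to_box(activation_map, rel_threshold=0.5):
    """Return (x_min, y_min, x_max, y_max) around cells >= rel_threshold * peak.

    Hypothetical helper: activation_map is a 2D list of non-negative floats.
    Returns None when the map has no positive activation.
    """
    peak = max(max(row) for row in activation_map)
    if peak <= 0:
        return None
    cutoff = rel_threshold * peak
    # Collect coordinates of every cell that survives the threshold.
    coords = [
        (x, y)
        for y, row in enumerate(activation_map)
        for x, value in enumerate(row)
        if value >= cutoff
    ]
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


# Toy 4x4 activation map with a compact high-activation region.
toy_map = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.8, 0.0],
    [0.0, 0.7, 1.0, 0.1],
    [0.0, 0.0, 0.2, 0.0],
]
box = threshold_to_box(toy_map)
print(box)
```

A well-separated, compact map makes the resulting box insensitive to the exact threshold, which is one way to read the abstract's claim that the map "can be easily thresholded".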

* Accepted to ICCV 2021 (corrected some minor mistakes).

Geometry Constrained Weakly Supervised Object Localization

Jul 19, 2020

Weizeng Lu, Xi Jia, Weicheng Xie, Linlin Shen, Yicong Zhou, Jinming Duan

Abstract: We propose a geometry constrained network, termed GC-Net, for weakly supervised object localization (WSOL). GC-Net consists of three modules: a detector, a generator, and a classifier. The detector predicts the object location defined by a set of coefficients describing a geometric shape (i.e. ellipse or rectangle), which is geometrically constrained by the mask produced by the generator. The classifier takes the resulting masked images as input and performs two complementary classification tasks for the object and background. To make the mask more compact and more complete, we propose a novel multi-task loss function that takes into account the area of the geometric shape, the categorical cross-entropy, and the negative entropy. In contrast to previous approaches, GC-Net is trained end-to-end and predicts the object location without any post-processing (e.g. thresholding) that may require additional tuning. Extensive experiments on the CUB-200-2011 and ILSVRC2012 datasets show that GC-Net outperforms state-of-the-art methods by a large margin. Our source code is available at https://github.com/lwzeng/GC-Net.
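The three terms named in the abstract (shape area, categorical cross-entropy, negative entropy) could be combined roughly as in the sketch below. This is an illustrative reading, not the GC-Net implementation: the assignment of cross-entropy to the object branch and negative entropy to the background branch, the weights `w_area` and `w_bg`, the rectangle parameterization, and all function names are assumptions.

```python
import math


def cross_entropy(probs, label):
    """Categorical cross-entropy for one sample given class probabilities."""
    return -math.log(probs[label])


def negative_entropy(probs):
    """Sum of p * log(p); smallest when the distribution is uniform."""
    return sum(p * math.log(p) for p in probs if p > 0)


def rectangle_area(width, height):
    """Area of a rectangle in normalized image coordinates (assumed in [0, 1])."""
    return width * height


def multi_task_loss(object_probs, label, background_probs,
                    width, height, w_area=1.0, w_bg=1.0):
    """Hypothetical weighted sum of the three terms named in the abstract."""
    return (
        cross_entropy(object_probs, label)               # object image should be classified correctly
        + w_bg * negative_entropy(background_probs)      # background image should be uninformative
        + w_area * rectangle_area(width, height)         # predicted shape should stay compact
    )


# Toy values: confident object prediction, near-uniform background prediction.
loss = multi_task_loss(
    object_probs=[0.8, 0.1, 0.1], label=0,
    background_probs=[0.34, 0.33, 0.33],
    width=0.5, height=0.4,
)
print(loss)
```

Under this reading, the area term discourages an oversized shape, while the background term discourages a shape so small that class evidence remains outside it.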

* This paper (ID 5424) is accepted to ECCV 2020
