Abstract:Remote sensing target detection aims to identify and locate critical targets within remote sensing images, finding extensive applications in agriculture and urban planning. Feature pyramid networks (FPNs) are commonly used to extract multi-scale features. However, existing FPNs often overlook extracting low-level positional information and fine-grained context interaction. To address this, we propose a novel location refined feature pyramid network (LR-FPN) to enhance the extraction of shallow positional information and facilitate fine-grained context interaction. The LR-FPN consists of two primary modules: the shallow position information extraction module (SPIEM) and the contextual interaction module (CIM). Specifically, SPIEM first maximizes the retention of solid location information of the target by simultaneously extracting positional and saliency information from the low-level feature map. Subsequently, CIM injects this robust location information into different layers of the original FPN through spatial and channel interaction, explicitly enhancing the object area. Moreover, in spatial interaction, we introduce a simple local and non-local interaction strategy to learn and retain the saliency information of the object. Lastly, the LR-FPN can be readily integrated into common object detection frameworks to improve performance significantly. Extensive experiments on two large-scale remote sensing datasets (i.e., DOTAV1.0 and HRSC2016) demonstrate that the proposed LR-FPN is superior to state-of-the-art object detection approaches. Our code and models will be publicly available.
Abstract:Object re-identification method is made up of backbone network, feature aggregation, and loss function. However, most backbone networks lack a special mechanism to handle rich scale variations and mine discriminative feature representations. In this paper, we firstly design a hierarchical similarity graph module (HSGM) to reduce the conflict of backbone and re-identification networks. The designed HSGM builds a rich hierarchical graph to mine the mapping relationships between global-local and local-local. Secondly, we divide the feature map along with the spatial and channel directions in each hierarchical graph. The HSGM applies the spatial features and channel features extracted from different locations as nodes, respectively, and utilizes the similarity scores between nodes to construct spatial and channel similarity graphs. During the learning process of HSGM, we utilize a learnable parameter to re-optimize the importance of each position, as well as evaluate the correlation between different nodes. Thirdly, we develop a novel hierarchical similarity graph network (HSGNet) by embedding the HSGM in the backbone network. Furthermore, HSGM can be easily embedded into backbone networks of any depth to improve object re-identification ability. Finally, extensive experiments on three large-scale object datasets demonstrate that the proposed HSGNet is superior to state-of-the-art object re-identification approaches.