Abstract:Single-frame infrared small target (SIRST) detection poses a significant challenge due to the requirement to discern minute targets amidst complex infrared background clutter. Recently, deep learning approaches have shown promising results in this domain. However, these methods heavily rely on extensive manual annotations, which are particularly cumbersome and resource-intensive for infrared small targets owing to their minute sizes. To address this limitation, we introduce a Hybrid Mask Generation (HMG) approach that recovers high-quality masks for each target from only a single-point label for network training. Specifically, our HMG approach consists of a handcrafted Points-to-Mask Generation strategy coupled with a pseudo mask updating strategy to recover and refine pseudo masks from point labels. The Points-to-Mask Generation strategy divides two distinct stages: Points-to-Box conversion, where individual point labels are transformed into bounding boxes, and subsequently, Box-to-Mask prediction, where these bounding boxes are elaborated into precise masks. The mask updating strategy integrates the complementary strengths of handcrafted and deep-learning algorithms to iteratively refine the initial pseudo masks. Experimental results across three datasets demonstrate that our method outperforms the existing methods for infrared small target detection with single-point supervision.
Abstract:Few-Shot Segmentation (FSS) aims to segment the novel class images with a few annotated samples. In this paper, we propose a dense affinity matching (DAM) framework to exploit the support-query interaction by densely capturing both the pixel-to-pixel and pixel-to-patch relations in each support-query pair with the bidirectional 3D convolutions. Different from the existing methods that remove the support background, we design a hysteretic spatial filtering module (HSFM) to filter the background-related query features and retain the foreground-related query features with the assistance of the support background, which is beneficial for eliminating interference objects in the query background. We comprehensively evaluate our DAM on ten benchmarks under cross-category, cross-dataset, and cross-domain FSS tasks. Experimental results demonstrate that DAM performs very competitively under different settings with only 0.68M parameters, especially under cross-domain FSS tasks, showing its effectiveness and efficiency.
Abstract:Few-Shot Segmentation (FSS) is challenging for limited support images and large intra-class appearance discrepancies. Due to the huge difference between support and query samples, most existing approaches focus on extracting high-level representations of the same layers for support-query correlations but neglect the shift issue between different layers and scales. In this paper, we propose a Multi-Context Interaction Network (MCINet) to remedy this issue by fully exploiting and interacting with the multi-scale contextual information contained in the support-query pairs. Specifically, MCINet improves FSS from the perspectives of boosting the query representations by incorporating the low-level structural information from another query branch into the high-level semantic features, enhancing the support-query correlations by exploiting both the same-layer and adjacent-layer features, and refining the predicted results by a multi-scale mask prediction strategy, with which the different scale contents have bidirectionally interacted. Experiments on two benchmarks demonstrate that our approach reaches SOTA performances and outperforms the best competitors with many desirable advantages, especially on the challenging COCO dataset.