Abstract: Few-shot segmentation (FSS) for remote sensing (RS) imagery leverages supporting information from a limited number of annotated samples to segment novel classes in query images. Previous efforts are dedicated to mining segmentation-guiding visual cues from a constrained set of support samples. However, they still struggle with the pronounced intra-class differences in RS images, as sparse visual cues make it challenging to establish robust class-specific representations. In this paper, we propose a holistic semantic embedding (HSE) approach that effectively harnesses general semantic knowledge, i.e., class description (CD) embeddings. Instead of naively combining CD embeddings and visual features for segmentation decoding, we embed this general semantic knowledge during the feature extraction stage. Specifically, in HSE, a spatial dense interaction module lets visual support features interact with CD embeddings along the spatial dimension via self-attention. Furthermore, a global content modulation module efficiently augments the global information of the target category in both support and query features through a transformative fusion of visual features and CD embeddings. These two components holistically synergize general CD embeddings and visual cues, constructing a robust class-specific representation. Through extensive experiments on the standard FSS benchmark, the proposed HSE approach demonstrates superior performance compared to peer work, setting a new state of the art.
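Since the abstract only names the two modules, the following is a minimal PyTorch sketch of one plausible reading of them: self-attention over a CD token prepended to flattened spatial tokens, and a FiLM-style global modulation. The class names, the FiLM choice, and all shapes are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn

class SpatialDenseInteraction(nn.Module):
    """Sketch: let visual support features interact with the class-description
    (CD) embedding along the spatial dimension via self-attention."""
    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, feat: torch.Tensor, cd_emb: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W) support features; cd_emb: (B, C) CD embedding
        B, C, H, W = feat.shape
        tokens = feat.flatten(2).transpose(1, 2)           # (B, H*W, C) spatial tokens
        seq = torch.cat([cd_emb.unsqueeze(1), tokens], 1)  # prepend the CD token
        out, _ = self.attn(seq, seq, seq)                  # joint self-attention
        seq = self.norm(seq + out)                         # residual + norm
        return seq[:, 1:].transpose(1, 2).reshape(B, C, H, W)

class GlobalContentModulation(nn.Module):
    """Sketch: FiLM-style global modulation of support/query features by the
    CD embedding; the abstract only says the features are 'augmented'."""
    def __init__(self, dim: int):
        super().__init__()
        self.to_scale = nn.Linear(dim, dim)
        self.to_shift = nn.Linear(dim, dim)

    def forward(self, feat: torch.Tensor, cd_emb: torch.Tensor) -> torch.Tensor:
        scale = self.to_scale(cd_emb)[:, :, None, None]    # (B, C, 1, 1)
        shift = self.to_shift(cd_emb)[:, :, None, None]
        return feat * (1 + scale) + shift

# usage
sdi, gcm = SpatialDenseInteraction(256), GlobalContentModulation(256)
f, t = torch.randn(2, 256, 32, 32), torch.randn(2, 256)
print(gcm(sdi(f, t), t).shape)  # torch.Size([2, 256, 32, 32])
```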
Abstract: Few-shot learning aims to generalize a recognizer from seen categories to entirely novel scenarios. With only a few support samples, several advanced methods introduce class names as prior knowledge for identifying novel classes. However, it remains unclear how to best harness the mutual advantages of visual and textual knowledge. In this paper, we propose a coherent Bidirectional Knowledge Permeation strategy called BiKop, grounded in a human intuition: a class name description offers a general representation, whereas an image captures the specificity of individuals. BiKop establishes a hierarchical joint general-specific representation through bidirectional knowledge permeation. Furthermore, since the joint representation is biased toward the base set, we disentangle base-class-relevant semantics during training, thereby alleviating the suppression of potential novel-class-relevant information. Experiments on four challenging benchmarks demonstrate the remarkable superiority of BiKop. Our code will be publicly available.
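The abstract does not specify how the bidirectional permeation is realized; one common way to couple a general textual representation with a specific visual one is a pair of cross-attention passes, sketched below. The module name, the cross-attention choice, and the token shapes are all assumptions, not BiKop's released code.

```python
import torch
import torch.nn as nn

class BidirectionalPermeation(nn.Module):
    """Sketch: text -> visual and visual -> text cross-attention, so the
    general class-name representation and the specific image representation
    permeate each other before forming a joint representation."""
    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.txt_to_vis = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.vis_to_txt = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, vis_tokens: torch.Tensor, txt_tokens: torch.Tensor):
        # vis_tokens: (B, Nv, C) patch features; txt_tokens: (B, Nt, C) class-name features
        # general -> specific: inject textual context into the visual tokens
        v, _ = self.txt_to_vis(vis_tokens, txt_tokens, txt_tokens)
        # specific -> general: ground the textual representation in the image
        t, _ = self.vis_to_txt(txt_tokens, vis_tokens, vis_tokens)
        return vis_tokens + v, txt_tokens + t

# usage
bkp = BidirectionalPermeation(256)
vis, txt = torch.randn(2, 196, 256), torch.randn(2, 8, 256)
v_out, t_out = bkp(vis, txt)
print(v_out.shape, t_out.shape)  # torch.Size([2, 196, 256]) torch.Size([2, 8, 256])
```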
Abstract: Noncontact particle manipulation (NPM) technology has significantly extended our analytical capability to the micro and nano scales, which has in turn greatly promoted the development of material science and life science. Although NPM by means of electric, magnetic, and optical fields has achieved great success, from the robotic perspective it remains labor-intensive, since professional human assistance is mandatory during the early preparation stage. It is therefore worthwhile to develop automated noncontact trapping of moving particles, particularly for applications where particle samples are rare, fragile, or contact-sensitive. Taking advantage of the latest dynamic acoustic field modulation technology, and in particular the great scalability of acoustic manipulation from the micro scale to the sub-centimeter scale, we propose in this paper automated noncontact trapping of moving micro-particles with an ultrasonic phased-array system and microscopic vision. The main contribution of this work is that, to the best of our knowledge, we achieve fully automated trapping of moving micro-particles in an acoustic NPM field by a robotic approach for the first time. In short, the particle's motion is observed and predicted by a binocular microscopic vision system, from which the acoustic trapping zone is calculated and generated to capture and stably hold the particle. The hand-eye relationship of the noncontact robotic end-effector is also solved in this work. Experiments demonstrate the effectiveness of the proposed approach.
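To make the observe-predict-trap loop concrete, here is a minimal NumPy sketch of the prediction step: estimate the particle's velocity from recent vision observations and place the trapping zone where the particle is predicted to be. The constant-velocity model, the function name, and the latency parameter are assumptions; the paper's actual predictor is not specified in the abstract.

```python
import numpy as np

def predict_trap_center(track, dt_ahead, fps=30.0):
    """Predict where to generate the acoustic trapping zone.

    track: (N, 3) array of recent 3-D particle positions from the binocular
           microscopic vision system, one row per frame.
    dt_ahead: look-ahead time (s) covering field update and settling latency.
    """
    track = np.asarray(track, dtype=float)
    times = np.arange(len(track)) / fps
    # least-squares constant-velocity fit over the recent window
    vel = np.polyfit(times, track, 1)[0]   # slope per axis -> (3,) velocity
    # extrapolate the current position ahead by the system latency
    return track[-1] + vel * dt_ahead

# usage: 10 observed frames of a particle drifting along +x at 3 units/s
obs = np.column_stack([np.linspace(0.0, 0.9, 10), np.zeros(10), np.zeros(10)])
print(predict_trap_center(obs, dt_ahead=0.1))  # approximately [1.2, 0., 0.]
```

In a full pipeline, the predicted trap center would then be converted through the calibrated hand-eye transform into phased-array coordinates before the transducer phases are solved.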