Abstract:Interactive segmentation algorithms based on click points have garnered significant attention from researchers in recent years. However, existing studies typically use sparse click maps as model inputs to segment specific target objects, which primarily affect local regions and have limited abilities to focus on the whole target object, leading to increased times of clicks. In addition, most existing algorithms can not balance well between high performance and efficiency. To address this issue, we propose a click attention algorithm that expands the influence range of positive clicks based on the similarity between positively-clicked regions and the whole input. We also propose a discriminative affinity loss to reduce the attention coupling between positive and negative click regions to avoid an accuracy decrease caused by mutual interference between positive and negative clicks. Extensive experiments demonstrate that our approach is superior to existing methods and achieves cutting-edge performance in fewer parameters. An interactive demo and all reproducible codes will be released at https://github.com/hahamyt/ClickAttention.
Abstract:In the field of Industrial Informatics, interactive segmentation has gained significant attention for its application in human-computer interaction and data annotation. Existing algorithms, however, face challenges in balancing the segmentation accuracy between large and small targets, often leading to an increased number of user interactions. To tackle this, a novel multi-scale token adaptation algorithm, leveraging token similarity, has been devised to enhance segmentation across varying target sizes. This algorithm utilizes a differentiable top-k tokens selection mechanism, allowing for fewer tokens to be used while maintaining efficient multi-scale token interaction. Furthermore, a contrastive loss is introduced to better discriminate between target and background tokens, improving the correctness and robustness of the tokens similar to the target. Extensive benchmarking shows that the algorithm achieves state-of-the-art (SOTA) performance compared to current methods. An interactive demo and all reproducible codes will be released at https://github.com/hahamyt/mst.