Title: Focus Entirety and Perceive Environment for Arbitrary-Shaped Text Detection

Sep 25, 2024

Xu Han, Junyu Gao, Chuang Yang, Yuan Yuan, Qi Wang

Abstract: Scene text varies widely in font, color, shape, and size, so detecting it accurately and efficiently remains a formidable challenge. Among existing detection approaches, segmentation-based methods have become prominent because of their flexible pixel-level predictions. However, these methods typically model text instances in a bottom-up manner, which is highly susceptible to noise. In addition, each pixel is predicted in isolation, without interaction between pixel features, which also limits detection performance. To alleviate these problems, we propose a multi-information-level arbitrary-shaped text detector consisting of a focus entirety module (FEM) and a perceive environment module (PEM). The FEM extracts instance-level features and adopts a top-down scheme to model texts, reducing the influence of noise. Specifically, it assigns consistent entirety information to pixels within the same instance to improve their cohesion. It also emphasizes scale information, enabling the model to distinguish texts of varying scales effectively. The PEM extracts region-level information and encourages the model to focus on the distribution of positive samples in the vicinity of a pixel, so that the model perceives environment information. It treats the kernel pixels as positive samples and helps the model differentiate text and kernel features. Extensive experiments demonstrate that the FEM efficiently supports the model in handling texts of different scales and confirm that the PEM helps the model perceive pixels more accurately by focusing on pixel vicinities. Comparisons show that the proposed model outperforms existing state-of-the-art approaches on four public datasets.
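
The abstract says the PEM looks at "the distribution of positive samples in the vicinity of a pixel", with kernel pixels treated as positives. One way to picture such a region-level quantity is the fraction of kernel pixels inside a small window around each pixel. The sketch below is only an illustration under that assumption, not the authors' implementation: the function name `local_positive_ratio`, the square window, and the `radius` parameter are all hypothetical choices made for this example.

```python
import numpy as np


def local_positive_ratio(kernel_mask: np.ndarray, radius: int = 1) -> np.ndarray:
    """Fraction of positive (kernel) pixels in the square window around each pixel.

    Hypothetical helper for illustration only; the window shape and size
    are assumptions, not details taken from the paper.
    """
    mask = kernel_mask.astype(np.float64)
    size = 2 * radius + 1
    h, w = mask.shape

    # Zero-pad so that windows near the border count outside pixels as negatives.
    padded = np.pad(mask, radius, mode="constant", constant_values=0)

    # Integral image with a leading row and column of zeros.
    integral = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    integral[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)

    # Sum over each size x size window via four integral-image lookups.
    window_sum = (
        integral[size:size + h, size:size + w]
        - integral[:h, size:size + w]
        - integral[size:size + h, :w]
        + integral[:h, :w]
    )
    return window_sum / (size * size)


# Toy 5x5 kernel mask with a 3x3 block of positives in the centre.
mask = np.zeros((5, 5), dtype=np.uint8)
mask[1:4, 1:4] = 1
ratio = local_positive_ratio(mask, radius=1)
print(ratio)
```

In this toy mask, pixels deep inside the kernel see a window full of positives, while pixels near the kernel boundary or in the background see progressively fewer, which is the kind of neighbourhood signal the abstract describes in words.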
