Title: DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer

Jul 10, 2022

Maoyuan Ye, Jing Zhang, Shanshan Zhao, Juhua Liu, Bo Du, Dacheng Tao

Abstract: Recently, Transformer-based methods, which predict polygon points or Bezier curve control points to localize texts, have become quite popular in scene text detection. However, the point label form they use implies the human reading order, which affects the robustness of the Transformer model. As for the model architecture, the formulation of queries used in the decoder has not been fully explored by previous methods. In this paper, we propose a concise dynamic point scene text detection Transformer network termed DPText-DETR, which directly uses point coordinates as queries and dynamically updates them between decoder layers. We point out a simple yet effective positional point label form to tackle the side effect of the original one. Moreover, an Enhanced Factorized Self-Attention module is designed to explicitly model the circular shape of polygon point sequences beyond non-local attention. Extensive experiments demonstrate the training efficiency, robustness, and state-of-the-art performance on various arbitrary shape scene text benchmarks. Beyond the detector, we observe that existing end-to-end spotters struggle to recognize inverse-like texts. To evaluate their performance objectively and facilitate future research, we propose an Inverse-Text test set containing 500 manually labeled images. The code and Inverse-Text test set will be available at https://github.com/ymy-k/DPText-DETR.
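The idea of using point coordinates as queries and updating them between decoder layers can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an iterative refinement scheme in the style of Deformable DETR, where each layer predicts an offset that is added to the current points in logit space. The function names (`refine_points`, `decode`) and the offset values are hypothetical, and a real decoder would predict the offsets from attention features rather than take them as input.

```python
import math


def inverse_sigmoid(x, eps=1e-5):
    """Map a normalized coordinate in (0, 1) to logit space, clamped for stability."""
    x = min(max(x, eps), 1.0 - eps)
    return math.log(x / (1.0 - x))


def sigmoid(z):
    """Map a logit back to a normalized coordinate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def refine_points(points, offsets):
    """Apply one layer's update: add per-point offsets in logit space.

    points:  list of (x, y) control points, normalized to (0, 1)
    offsets: list of (dx, dy), one per point (assumed to come from a decoder layer)
    """
    return [
        (sigmoid(inverse_sigmoid(x) + dx), sigmoid(inverse_sigmoid(y) + dy))
        for (x, y), (dx, dy) in zip(points, offsets)
    ]


def decode(points, layer_offsets):
    """Run the points through several layers; each layer's output feeds the next."""
    for offsets in layer_offsets:
        points = refine_points(points, offsets)
    return points


# Hypothetical usage: two points refined over two layers.
initial = [(0.2, 0.5), (0.8, 0.5)]
per_layer = [[(0.5, 0.0), (-0.5, 0.0)]] * 2
refined = decode(initial, per_layer)
```

Working in logit space keeps every refined coordinate inside the image bounds regardless of the offset magnitude, which is one common reason for this design.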
