Abstract: Most existing work on traffic signs is dedicated to detecting and recognizing a subset of traffic signs individually, which ignores the global semantic logic among signs and may convey inaccurate traffic instructions. To address this issue, we propose the traffic sign interpretation (TSI) task, which aims to interpret globally, semantically interrelated traffic signs (e.g.,~driving instruction-related texts, symbols, and guide panels) into natural language, providing accurate instruction support for autonomous or assisted driving. We also design a multi-task learning architecture for TSI, which detects and recognizes various traffic signs and interprets them into natural language as a human would. Furthermore, because no public TSI dataset is available, we build a traffic sign interpretation dataset, namely TSI-CN. The dataset consists of real road scene images captured from a driver's perspective on highways and urban roads in China. It contains rich location labels for texts, symbols, and guide panels, together with the corresponding natural language description labels. Experiments on TSI-CN demonstrate that the TSI task is achievable and that the TSI architecture can successfully interpret traffic signs from scenes even when the semantic logic among signs is complex. The TSI-CN dataset and the source code of the TSI architecture will be made publicly available after the revision process.
Abstract: Contour-based instance segmentation methods include one-stage and multi-stage schemes. These approaches achieve remarkable performance, but they must define a large number of points to segment precise masks, which leads to high complexity. To address this issue, we present a single-shot method, called \textbf{VeinMask}, that achieves competitive performance with low design complexity. Concretely, we observe that a leaf locates its coarse margins via major veins and grows minor veins to refine twisty parts, which makes it possible to cover any object accurately. Moreover, major and minor veins share the same growth mode, which avoids modeling them separately and keeps the model simple. Motivated by these advantages, we propose VeinMask, which formulates instance segmentation as a simulation of the vein growth process and predicts the major and minor veins in polar coordinates. In addition, centroidness is introduced for instance segmentation to help suppress low-quality instances. Furthermore, a surroundings cross-correlation sensitive (SCCS) module is designed to enhance feature expression by exploiting the surroundings of each pixel, and a Residual IoU (R-IoU) loss is formulated to effectively supervise the regression of major and minor veins. Experiments demonstrate that VeinMask performs much better than other contour-based methods with low design complexity. In particular, our method outperforms existing one-stage contour-based methods on the COCO dataset with almost half the design complexity.
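
To illustrate what predicting a contour in polar coordinates means, the following minimal sketch converts (angle, length) pairs around an instance center into contour points, and shows how adding rays between coarse ones refines the covered region. It is not the VeinMask implementation: the function names `polar_to_contour` and `polygon_area`, the example angles and lengths, and the simplification that every vein is a ray from a single center are all assumptions made for illustration. How the actual method grows minor veins is not specified in the abstract.

```python
import math


def polar_to_contour(center, angles_deg, lengths):
    """Convert (angle, length) pairs around a center into contour points.

    Hypothetical helper: each pair is treated as a ray from `center`.
    """
    cx, cy = center
    points = []
    # Sort by angle so the contour is traversed in a consistent order.
    for angle, length in sorted(zip(angles_deg, lengths)):
        theta = math.radians(angle)
        points.append((cx + length * math.cos(theta),
                       cy + length * math.sin(theta)))
    return points


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    area = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        area += x0 * y1 - x1 * y0
    return abs(area) / 2.0


# Illustrative values only: four coarse "major" rays, then four "minor"
# rays inserted between them to refine the outline of a round object.
major_angles = [0, 90, 180, 270]
major_lengths = [2.0, 2.0, 2.0, 2.0]
minor_angles = [45, 135, 225, 315]
minor_lengths = [2.0, 2.0, 2.0, 2.0]

coarse = polar_to_contour((0.0, 0.0), major_angles, major_lengths)
refined = polar_to_contour((0.0, 0.0),
                           major_angles + minor_angles,
                           major_lengths + minor_lengths)

print(len(coarse), polygon_area(coarse))
print(len(refined), polygon_area(refined))
```

In this toy case the four coarse rays describe a square inscribed in a circle of radius 2, while the eight rays describe an octagon that covers more of the same circle. This mirrors, in a simplified way, the idea that a few coarse predictions locate the margin and additional ones refine it without requiring a dense set of points everywhere.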