Abstract:Designing an automatic checkout system for retail stores at the human level accuracy is challenging due to similar appearance products and their various poses. This paper addresses the problem by proposing a method with a two-stage pipeline. The first stage detects class-agnostic items, and the second one is dedicated to classify product categories. We also track the objects across video frames to avoid duplicated counting. One major challenge is the domain gap because the models are trained on synthetic data but tested on the real images. To reduce the error gap, we adopt domain generalization methods for the first-stage detector. In addition, model ensemble is used to enhance the robustness of the 2nd-stage classifier. The method is evaluated on the AI City challenge 2022 -- Track 4 and gets the F1 score $40\%$ on the test A set. Code is released at the link https://github.com/cybercore-co-ltd/aicity22-track4.
Abstract:In this paper, we propose a Hierarchical Transformer model for Vietnamese spelling correction problem. The model consists of multiple Transformer encoders and utilizes both character-level and word-level to detect errors and make corrections. In addition, to facilitate future work in Vietnamese spelling correction tasks, we propose a realistic dataset collected from real-life texts for the problem. We compare our method with other methods and publicly available systems. The proposed method outperforms all of the contemporary methods in terms of recall, precision, and f1-score. A demo version is publicly available.