Abstract:Most of the current task-oriented dialogue systems (ToD), despite having interesting results, are designed for a handful of languages like Chinese and English. Therefore, their performance in low-resource languages is still a significant problem due to the absence of a standard dataset and evaluation policy. To address this problem, we proposed ViWOZ, a fully-annotated Vietnamese task-oriented dialogue dataset. ViWOZ is the first multi-turn, multi-domain tasked oriented dataset in Vietnamese, a low-resource language. The dataset consists of a total of 5,000 dialogues, including 60,946 fully annotated utterances. Furthermore, we provide a comprehensive benchmark of both modular and end-to-end models in low-resource language scenarios. With those characteristics, the ViWOZ dataset enables future studies on creating a multilingual task-oriented dialogue system.
Abstract:People detection in top-view, fish-eye images is challenging as people in fish-eye images often appear in arbitrary directions and are distorted differently. Due to this unique radial geometry, axis-aligned people detectors often work poorly on fish-eye frames. Recent works account for this variability by modifying existing anchor-based detectors or relying on complex pre/post-processing. Anchor-based methods spread a set of pre-defined bounding boxes on the input image, most of which are invalid. In addition to being inefficient, this approach could lead to a significant imbalance between the positive and negative anchor boxes. In this work, we propose ARPD, a single-stage anchor-free fully convolutional network to detect arbitrarily rotated people in fish-eye images. Our network uses keypoint estimation to find the center point of each object and regress the object's other properties directly. To capture the various orientation of people in fish-eye cameras, in addition to the center and size, ARPD also predicts the angle of each bounding box. We also propose a periodic loss function that accounts for angle periodicity and relieves the difficulty of learning small-angle oscillations. Experimental results show that our method competes favorably with state-of-the-art algorithms while running significantly faster.