Abstract:Unsupervised learning objectives like language modeling and de-noising constitute a significant part in producing pre-trained models that perform various downstream applications from natural language understanding to conversational tasks. However, despite impressive generative capabilities of recent large language models, their abilities to capture syntactic or semantic structure within text lag behind. We hypothesize that the mismatch between linguistic performance and competence in machines is attributable to insufficient transfer of linguistic structure knowledge to computational systems with currently popular pre-training objectives. We show that punctuation restoration as a learning objective improves in- and out-of-distribution performance on structure-related tasks like named entity recognition, open information extraction, chunking, and part-of-speech tagging. Punctuation restoration is an effective learning objective that can improve structure understanding and yield a more robust structure-aware representations of natural language.
Abstract:Previous work in structured prediction (e.g. NER, information extraction) using single model make use of explicit dataset information, which helps boost in-distribution performance but is orthogonal to robust generalization in real-world situations. To overcome this limitation, we propose the Structured Language Generation Model (SLGM), a framework that reduces sequence-to-sequence problems to classification problems via methodologies in loss calibration and decoding method. Our experimental results show that SLGM is able to maintain performance without explicit dataset information, follow and potentially replace dataset-specific fine-tuning.
Abstract:After the COVID-19 outbreak, it has become important to automatically detect whether people are wearing masks in order to reduce risk of front-line workers. In addition, processing user data locally is a great way to address both privacy and network bandwidth issues. In this paper, we present a light-weighted model for detecting whether people in a particular area wear masks, which can also be deployed on Coral Dev Board, a commercially available development board containing Google Edge TPU. Our approach combines the object detecting network based on MobileNetV2 plus SSD and the quantization scheme for integer-only hardware. As a result, the lighter model in the Edge TPU has a significantly lower latency which is more appropriate for real-time execution while maintaining accuracy comparable to a floating point device.