Abstract: Chinese spelling check is the task of detecting and correcting spelling mistakes in Chinese text. Existing research aims to enhance text representation and uses multi-source information to improve the detection and correction capabilities of models, but pays little attention to improving their ability to distinguish between confusable words. Contrastive learning, which aims to minimize the distance in representation space between similar sample pairs, has recently become a dominant technique in natural language processing. Inspired by contrastive learning, we present a novel framework for Chinese spelling check, which consists of three modules: language representation, spelling check, and reverse contrastive learning. Specifically, we propose a reverse contrastive learning strategy, which explicitly forces the model to minimize the agreement between similar examples, namely phonetically and visually confusable characters. Experimental results show that our framework is model-agnostic and can be combined with existing Chinese spelling check models to yield state-of-the-art performance.
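The reverse contrastive idea described in this abstract can be illustrated with a minimal sketch. The abstract does not give the loss, so everything here is an assumption made for illustration: the function names `cosine` and `reverse_contrastive_loss`, the use of cosine similarity as the measure of agreement, the softplus penalty, and the `temperature` parameter are all hypothetical and are not taken from the paper. The sketch only shows the general shape of an objective that decreases as an anchor character's representation moves away from its confusable characters.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def reverse_contrastive_loss(anchor, confusables, temperature=0.1):
    """Hypothetical penalty on agreement between an anchor and its confusables.

    Each term is softplus(similarity / temperature), which gets smaller as
    the anchor becomes less similar to that confusable representation.
    This is an assumed form, not the loss used in the paper.
    """
    terms = [
        math.log1p(math.exp(cosine(anchor, c) / temperature))
        for c in confusables
    ]
    return sum(terms) / len(terms)


# Toy 2-d representations: one confusable close to the anchor, one far from it.
anchor = [1.0, 0.0]
close_loss = reverse_contrastive_loss(anchor, [[0.9, 0.1]])
far_loss = reverse_contrastive_loss(anchor, [[0.0, 1.0]])
```

In this toy setup, `close_loss` is larger than `far_loss`, so minimizing the penalty would push the anchor away from the confusable representation, which is the behaviour the abstract attributes to reverse contrastive learning.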
Abstract: Chinese features prominently in the Chinese communities of the nations of the Malay Archipelago. In these countries, Chinese has adapted to the local languages and cultures, giving rise to a distinct Chinese variant in each country. In this paper, we conducted a quantitative analysis of Chinese news texts collected from five Malay Archipelago nations, namely Indonesia, Malaysia, Singapore, the Philippines, and Brunei, to identify how they differ from texts written in modern standard Chinese from lexical and syntactic perspectives. The statistical results show that the Chinese variants used in these five nations differ considerably, diverging from their mainland modern Chinese counterpart. We also extracted and classified several characteristic Chinese words used in each nation. These discrepancies reflect how Chinese evolves overseas and demonstrate the profound impact of local societies and cultures on the development of Chinese.