Abstract:For crosslingual conversation and trade, Neural Machine Translation (NMT) is pivotal yet faces persistent challenges with monotony and repetition in generated content. Traditional solutions that rely on penalizing text redundancy or token reoccurrence have shown limited efficacy, particularly for lengthy article and e-commerce descriptions with inherent redundancy, even with the advent of Large Language Models (LLMs). This paper investigates the underlying causes of textual repetition through the lens of information entropy, attributing the phenomenon to the elevated uncertainty within the input text. To address this, a novel algorithm named Contrastive Token Learning with Similarity Decay (CTSD) is introduced, which modulates the suppression of tokens dynamically, informed by varying attention weights and inter-token distances. Furthermore, an e-commerce dataset comprised of title texts of online real items is compiled and released susceptible to hallucination translations to benchmark the algorithm. Extensive evaluations demonstrate that CTSD significantly outperforms existing approaches in precision and generalizability. Additional online A/B testing underscores its practical value, showing marked improvements in user engagement and conversion. Notably, this method has been implemented with full traffic on eight multilingual sites of alibaba.com, the largest B2B e-commerce platform in the world.
Abstract:Existing Neural Machine Translation (NMT) models mainly handle translation in the general domain, while overlooking domains with special writing formulas, such as e-commerce and legal documents. Taking e-commerce as an example, the texts usually include amounts of domain-related words and have more grammar problems, which leads to inferior performances of current NMT methods. To address these problems, we collect two domain-related resources, including a set of term pairs (aligned Chinese-English bilingual terms) and a parallel corpus annotated for the e-commerce domain. Furthermore, we propose a two-step fine-tuning paradigm (named G2ST) with self-contrastive semantic enhancement to transfer one general NMT model to the specialized NMT model for e-commerce. The paradigm can be used for the NMT models based on Large language models (LLMs). Extensive evaluations on real e-commerce titles demonstrate the superior translation quality and robustness of our G2ST approach, as compared with state-of-the-art NMT models such as LLaMA, Qwen, GPT-3.5, and even GPT-4.