Eunjeong L. Park

A Multilingual Neural Machine Translation Model for Biomedical Data

Aug 06, 2020

Alexandre Bérard, Zae Myung Kim, Vassilina Nikoulina, Eunjeong L. Park, Matthias Gallé

Abstract: We release a multilingual neural machine translation model, which can be used to translate text in the biomedical domain. The model can translate from 5 languages (French, German, Italian, Korean and Spanish) into English. It is trained with large amounts of generic and biomedical data, using domain tags. Our benchmarks show that it performs near state-of-the-art both on news (generic domain) and biomedical test sets, and that it outperforms the existing publicly released models. We believe that this release will help the large-scale multilingual analysis of the digital content of the COVID-19 crisis and of its effects on society, economy, and healthcare policies. We also release a test set of biomedical text for Korean-English. It consists of 758 sentences from official guidelines and recent papers, all about COVID-19.

* https://github.com/naver/covid19-nmt
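
The abstract above mentions training with domain tags. A minimal sketch of the general idea, assuming a tag token is simply prepended to each source sentence before translation: the tag strings (`<generic>`, `<medical>`) and the function name `add_domain_tag` are hypothetical and chosen for illustration; the released model may use a different tag format.

```python
# Hypothetical tag vocabulary; the actual tokens used by the released
# model are not stated in the abstract.
DOMAIN_TAGS = {
    "generic": "<generic>",
    "biomedical": "<medical>",
}


def add_domain_tag(sentence: str, domain: str) -> str:
    """Prefix a source sentence with the tag token for its domain.

    During training, each sentence pair would carry the tag of the corpus
    it came from; at inference time, the caller picks the tag that matches
    the text being translated.
    """
    if domain not in DOMAIN_TAGS:
        raise ValueError(f"unknown domain: {domain!r}")
    return f"{DOMAIN_TAGS[domain]} {sentence.strip()}"


if __name__ == "__main__":
    print(add_domain_tag("Le patient présente une fièvre persistante.", "biomedical"))
```

Under this scheme one model serves several domains, and the tag acts as a lightweight switch rather than requiring a separate model per domain.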

Revisiting Round-Trip Translation for Quality Estimation

Apr 29, 2020

Jihyung Moon, Hyunchang Cho, Eunjeong L. Park

Abstract: Quality estimation (QE) is the task of automatically evaluating the quality of translations without human-translated references. Calculating BLEU between the input sentence and its round-trip translation (RTT) was once considered as a metric for QE; however, it was found to be a poor predictor of translation quality. Recently, various pre-trained language models have made breakthroughs in NLP tasks by providing semantically meaningful word and sentence embeddings. In this paper, we apply semantic embeddings to RTT-based QE. Our method achieves the highest correlations with human judgments, compared to previous WMT 2019 quality estimation metric task submissions. While backward translation models can be a drawback when using RTT, we observe that with semantic-level metrics, RTT-based QE is robust to the choice of the backward translation system. Additionally, the proposed method shows consistent performance for both SMT and NMT forward translation systems, implying the method does not penalize a certain type of model.

* To be published in EAMT 2020
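
The round-trip scoring described in the abstract above can be sketched as follows: translate the source forward, translate the result back, and compare the source with the round trip in an embedding space. This is an illustration under stated assumptions, not the authors' implementation. The names `rtt_qe_score`, `toy_embed`, and `cosine` are hypothetical, and `toy_embed` is a bag-of-words stand-in that does not capture semantics; a real setup would plug in a pre-trained sentence encoder and actual forward and backward translation systems.

```python
import math
from collections import Counter
from typing import Callable, Dict

Vector = Dict[str, float]


def toy_embed(sentence: str) -> Vector:
    """Stand-in embedding: lowercase bag-of-words counts.

    Assumption for illustration only; replace with a pre-trained
    sentence encoder to obtain a semantic-level comparison.
    """
    return {tok: float(n) for tok, n in Counter(sentence.lower().split()).items()}


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(x * v.get(k, 0.0) for k, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rtt_qe_score(
    source: str,
    forward: Callable[[str], str],
    backward: Callable[[str], str],
    embed: Callable[[str], Vector] = toy_embed,
) -> float:
    """Score a forward translation without a reference.

    The source is translated forward, then back into the source language,
    and the score is the similarity between the source and the round trip.
    """
    translation = forward(source)
    round_trip = backward(translation)
    return cosine(embed(source), embed(round_trip))
```

Because `forward`, `backward`, and `embed` are passed in as callables, the same scoring function can be reused to compare different backward systems or different encoders.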
