Evgeny Matusov

Findings of the IWSLT 2024 Evaluation Campaign

Nov 07, 2024

Ibrahim Said Ahmad, Antonios Anastasopoulos, Ondřej Bojar, Claudia Borg, Marine Carpuat, Roldano Cattoni, Mauro Cettolo, William Chen, Qianqian Dong, Marcello Federico(+35 more)

Abstract: This paper reports on the shared tasks organized by the 21st IWSLT Conference. The shared tasks address 7 scientific challenges in spoken language translation: simultaneous and offline translation, automatic subtitling and dubbing, speech-to-speech translation, dialect and low-resource speech translation, and Indic languages. The shared tasks attracted 18 teams whose submissions are documented in 26 system papers. The growing interest in spoken language translation is also reflected in the constantly increasing number of shared task organizers and contributors to the overview paper, almost evenly distributed across industry and academia.

* IWSLT 2024; 59 pages

ClimateGPT: Towards AI Synthesizing Interdisciplinary Research on Climate Change

Jan 17, 2024

David Thulke, Yingbo Gao, Petrus Pelser, Rein Brune, Rricha Jalota, Floris Fok, Michael Ramos, Ian van Wyk, Abdallah Nasir, Hayden Goldstein(+16 more)

Abstract: This paper introduces ClimateGPT, a model family of domain-specific large language models that synthesize interdisciplinary research on climate change. We trained two 7B models from scratch on a science-oriented dataset of 300B tokens. For the first model, the 4.2B domain-specific tokens were included during pre-training and the second was adapted to the climate domain after pre-training. Additionally, ClimateGPT-7B, 13B and 70B are continuously pre-trained from Llama 2 on a domain-specific dataset of 4.2B tokens. Each model is instruction fine-tuned on a high-quality and human-generated domain-specific dataset that has been created in close cooperation with climate scientists. To reduce the number of hallucinations, we optimize the model for retrieval augmentation and propose a hierarchical retrieval strategy. To increase the accessibility of our model to non-English speakers, we propose to make use of cascaded machine translation and show that this approach can perform comparably to natively multilingual models while being easier to scale to a large number of languages. Further, to address the intrinsic interdisciplinary aspect of climate change we consider different research perspectives. Therefore, the model can produce in-depth answers focusing on different perspectives in addition to an overall answer. We propose a suite of automatic climate-specific benchmarks to evaluate LLMs. On these benchmarks, ClimateGPT-7B performs on par with the ten times larger Llama-2-70B Chat model while not degrading results on general domain benchmarks. Our human evaluation confirms the trends we saw in our benchmarks. All models were trained and evaluated using renewable energy and are released publicly.

AppTek's Submission to the IWSLT 2022 Isometric Spoken Language Translation Task

May 12, 2022

Patrick Wilken, Evgeny Matusov

Abstract: To participate in the Isometric Spoken Language Translation Task of the IWSLT 2022 evaluation, constrained condition, AppTek developed neural Transformer-based systems for English-to-German with various mechanisms of length control, ranging from source-side and target-side pseudo-tokens to encoding of remaining length in characters that replaces positional encoding. We further increased translation length compliance by sentence-level selection of length-compliant hypotheses from different system variants, as well as rescoring of N-best candidates from a single system. Length-compliant back-translated and forward-translated synthetic data, as well as other parallel data variants derived from the original MuST-C training corpus, were important for a good quality/desired length trade-off. Our experimental results show that length compliance levels above 90% can be reached while minimizing losses in MT quality as measured in BERT and BLEU scores.

* IWSLT 2022

SubER: A Metric for Automatic Evaluation of Subtitle Quality

May 11, 2022

Patrick Wilken, Panayota Georgakopoulou, Evgeny Matusov

Abstract: This paper addresses the problem of evaluating the quality of automatically generated subtitles, which includes not only the quality of the machine-transcribed or translated speech, but also the quality of line segmentation and subtitle timing. We propose SubER, a single novel metric based on edit distance with shifts that takes all of these subtitle properties into account. We compare it to existing metrics for evaluating transcription, translation, and subtitle quality. A careful human evaluation in a post-editing scenario shows that the new metric has a high correlation with the post-editing effort and direct human assessment scores, outperforming baseline metrics considering only the subtitle text, such as WER and BLEU, and existing methods to integrate segmentation and timing features.

* IWSLT 2022

Neural Simultaneous Speech Translation Using Alignment-Based Chunking

May 29, 2020

Patrick Wilken, Tamer Alkhouli, Evgeny Matusov, Pavel Golik

Abstract: In simultaneous machine translation, the objective is to determine when to produce a partial translation given a continuous stream of source words, with a trade-off between latency and quality. We propose a neural machine translation (NMT) model that makes dynamic decisions on when to continue consuming input or generate output words. The model is composed of two main components: one to dynamically decide on ending a source chunk, and another that translates the consumed chunk. We train the components jointly and in a manner consistent with the inference conditions. To generate chunked training data, we propose a method that utilizes word alignment while also preserving enough context. We compare models with bidirectional and unidirectional encoders of different depths, both on real speech and text input. Our results on the IWSLT 2020 English-to-German task outperform a wait-k baseline by 2.6 to 3.7% BLEU absolute.

* IWSLT 2020

Novel Applications of Factored Neural Machine Translation

Oct 09, 2019

Patrick Wilken, Evgeny Matusov

Abstract: In this work, we explore the usefulness of target factors in neural machine translation (NMT) beyond their original purpose of predicting word lemmas and their inflections, as proposed by García-Martínez et al., 2016. For this, we introduce three novel applications of the factored output architecture: In the first one, we use a factor to explicitly predict the word case separately from the target word itself. This allows for information to be shared between different casing variants of a word. In a second task, we use a factor to predict when two consecutive subwords have to be joined, eliminating the need for target subword joining markers. The third task is the prediction of special tokens of the operation sequence NMT model (OSNMT) of Stahlberg et al., 2018. Automatic evaluation on English-to-German and English-to-Turkish tasks showed that integration of such auxiliary prediction tasks into NMT is at least as good as the standard NMT approach. For the OSNMT, we observed a significant improvement in BLEU over the baseline OSNMT implementation due to a reduced output sequence length that resulted from the introduction of the target factors.

Learning from Chunk-based Feedback in Neural Machine Translation

Jun 19, 2018

Pavel Petrushkov, Shahram Khadivi, Evgeny Matusov

Abstract: We empirically investigate learning from partial feedback in neural machine translation (NMT), when partial feedback is collected by asking users to highlight a correct chunk of a translation. We propose a simple and effective way of utilizing such feedback in NMT training. We demonstrate how the common machine translation problem of domain mismatch between training and deployment can be reduced solely based on chunk-level user feedback. We conduct a series of simulation experiments to test the effectiveness of the proposed method. Our results show that chunk-level feedback outperforms sentence-based feedback by up to 2.61% BLEU absolute.

* Accepted at the ACL 2018 Conference, Melbourne, Australia

Can Neural Machine Translation be Improved with User Feedback?

Apr 16, 2018

Julia Kreutzer, Shahram Khadivi, Evgeny Matusov, Stefan Riezler

Abstract: We present the first real-world application of methods for improving neural machine translation (NMT) with human reinforcement, based on explicit and implicit user feedback collected on the eBay e-commerce platform. Previous work has been confined to simulation experiments, whereas in this paper we work with real logged feedback for offline bandit learning of NMT parameters. We conduct a thorough analysis of the available explicit user judgments (five-star ratings of translation quality) and show that they are not reliable enough to yield significant improvements in bandit learning. In contrast, we successfully utilize implicit task-based feedback collected in a cross-lingual search task to improve task-specific and machine translation quality metrics.

* Accepted at NAACL-HLT 2018 (Industry Track)

Neural Machine Translation Leveraging Phrase-based Models in a Hybrid Search

Aug 10, 2017

Leonard Dahlmann, Evgeny Matusov, Pavel Petrushkov, Shahram Khadivi

Abstract: In this paper, we introduce a hybrid search for attention-based neural machine translation (NMT). A target phrase learned with statistical MT models extends a hypothesis in the NMT beam search when the attention of the NMT model focuses on the source words translated by this phrase. Phrases added in this way are scored with the NMT model, but also with SMT features including phrase-level translation probabilities and a target language model. Experimental results on German-to-English news domain and English-to-Russian e-commerce domain translation tasks show that using phrase-based models in NMT search improves MT quality by up to 2.3% BLEU absolute as compared to a strong NMT baseline.

* To appear in Proceedings of EMNLP 2017

Neural and Statistical Methods for Leveraging Meta-information in Machine Translation

Aug 10, 2017

Shahram Khadivi, Patrick Wilken, Leonard Dahlmann, Evgeny Matusov

Abstract: In this paper, we discuss different methods that use meta-information and richer context that may accompany source language input to improve machine translation quality. We focus on category information of input text as meta-information, but the proposed methods can be extended to all textual and non-textual meta-information that might be available for the input text or automatically predicted using the text content. The main novelty of this work is to use state-of-the-art neural network methods to tackle this problem within a statistical machine translation (SMT) framework. We observe translation quality improvements of up to 3% in terms of BLEU score in some text categories.

* To appear in MT Summit 2017
