Baban Gain

Beyond the Sentence: A Survey on Context-Aware Machine Translation with Large Language Models

Jun 09, 2025

Ramakrishna Appicharla, Baban Gain, Santanu Pal, Asif Ekbal

Abstract: Despite the popularity of large language models (LLMs), their application to machine translation is relatively underexplored, especially in context-aware settings. This work presents a literature review of context-aware translation with LLMs. Existing works utilise prompting and fine-tuning approaches, with few focusing on automatic post-editing or on creating translation agents for context-aware machine translation. We observed that the commercial LLMs (such as ChatGPT and Tower LLM) achieved better results than the open-source LLMs (such as Llama and Bloom LLMs), and that prompt-based approaches serve as good baselines to assess the quality of translations. Finally, we present some interesting future directions to explore.

Bridging the Linguistic Divide: A Survey on Leveraging Large Language Models for Machine Translation

Apr 03, 2025

Baban Gain, Dibyanayan Bandyopadhyay, Asif Ekbal

Abstract: The advent of Large Language Models (LLMs) has significantly reshaped the landscape of machine translation (MT), particularly for low-resource languages and domains that lack sufficient parallel corpora, linguistic tools, and computational infrastructure. This survey presents a comprehensive overview of recent progress in leveraging LLMs for MT. We analyze techniques such as few-shot prompting, cross-lingual transfer, and parameter-efficient fine-tuning that enable effective adaptation to under-resourced settings. The paper also explores synthetic data generation strategies using LLMs, including back-translation and lexical augmentation. Additionally, we compare LLM-based translation with traditional encoder-decoder models across diverse language pairs, highlighting the strengths and limitations of each. We discuss persistent challenges such as hallucinations, evaluation inconsistencies, and inherited biases while also evaluating emerging LLM-driven metrics for translation quality. This survey offers practical insights and outlines future directions for building robust, inclusive, and scalable MT systems in the era of large-scale generative models.

Quality Estimation based Feedback Training for Improving Pronoun Translation

Jan 06, 2025

Harshit Dhankhar, Baban Gain, Asif Ekbal, Yogesh Mani Tripathi

Abstract: Pronoun translation is a longstanding challenge in neural machine translation (NMT), often requiring inter-sentential context to ensure linguistic accuracy. To address this, we introduce ProNMT, a novel framework designed to enhance pronoun and overall translation quality in context-aware machine translation systems. ProNMT leverages Quality Estimation (QE) models and a unique Pronoun Generation Likelihood-Based Feedback mechanism to iteratively fine-tune pre-trained NMT models without relying on extensive human annotations. The framework combines QE scores with pronoun-specific rewards to guide training, ensuring improved handling of linguistic nuances. Extensive experiments demonstrate significant gains in pronoun translation accuracy and general translation quality across multiple metrics. ProNMT offers an efficient, scalable, and context-aware approach to improving NMT systems, particularly in translating context-dependent elements like pronouns.

A Case Study on Context-Aware Neural Machine Translation with Multi-Task Learning

Jul 03, 2024

Ramakrishna Appicharla, Baban Gain, Santanu Pal, Asif Ekbal, Pushpak Bhattacharyya

Abstract: In document-level neural machine translation (DocNMT), multi-encoder approaches are common for encoding context and source sentences. Recent studies (Li et al., 2020) have shown that the context encoder generates noise and makes the model robust to the choice of context. This paper further investigates this observation by explicitly modelling context encoding through multi-task learning (MTL) to make the model sensitive to the choice of context. We conduct experiments on a cascade MTL architecture, which consists of one encoder and two decoders. Generation of the source from the context is considered an auxiliary task, and generation of the target from the source is the main task. We experimented with the German-English language pair on the News, TED, and Europarl corpora. Evaluation results show that the proposed MTL approach performs better than concatenation-based and multi-encoder DocNMT models in low-resource settings and is sensitive to the choice of context. However, we observe that the MTL models fail to generate the source from the context. These observations align with previous studies, and this might suggest that the available document-level parallel corpora are not context-aware, and that a robust sentence-level model can outperform the context-aware models.

* Accepted to EAMT 2024 (poster)

Universal Adversarial Framework to Improve Adversarial Robustness for Diabetic Retinopathy Detection

Dec 13, 2023

Samrat Mukherjee, Dibyanayan Bandyopadhyay, Baban Gain, Asif Ekbal

Abstract: Diabetic Retinopathy (DR) is a prevalent illness associated with diabetes which, if left untreated, can result in irreversible blindness. Deep Learning based systems are gradually being introduced as automated support for clinical diagnosis. Since healthcare has always been an extremely important domain demanding error-free performance, adversarial attacks could pose a big threat to the applicability of such systems. In this work, we use Universal Adversarial Perturbations (UAPs) to quantify the vulnerability of Medical Deep Neural Networks (DNNs) for detecting DR. To the best of our knowledge, this is the first attempt to attack the complete fine-grained classification of DR images using various UAPs. Also, as a part of this work, we use UAPs to fine-tune the trained models to defend against adversarial samples. We experiment on several models and observe that the performance of such models against unseen adversarial attacks improves by 3.41 in Cohen-kappa value on average and by up to 31.92 in Cohen-kappa value. The performance degradation on normal data upon ensembling the fine-tuned models was found to be statistically insignificant using a t-test, highlighting the benefits of UAP-based adversarial fine-tuning.
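
For readers unfamiliar with the metric reported above, the following is a minimal sketch of Cohen's kappa, which measures agreement between predicted and true labels after discounting agreement expected by chance. The abstract does not say which variant was used (DR grading work often uses a weighted form, and the reported gains suggest a 0 to 100 scale), so the plain unweighted form here is an assumption, and the function name `cohen_kappa` and the toy labels are illustrative only.

```python
from collections import Counter

def cohen_kappa(y_true, y_pred):
    """Unweighted Cohen's kappa for two equally long label sequences."""
    n = len(y_true)
    # Observed agreement: fraction of positions where the labels match.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    # Undefined (division by zero) when both sequences use a single identical label.
    return (observed - expected) / (1 - expected)

# Toy labels (made up): 0 = no DR, 1 = DR present.
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
kappa = cohen_kappa(y_true, y_pred)  # observed 0.75, chance 0.5
```

Raw accuracy here is 0.75, but half of that agreement is expected by chance, which is why kappa is lower than accuracy.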

Reference Free Domain Adaptation for Translation of Noisy Questions with Question Specific Rewards

Oct 23, 2023

Baban Gain, Ramakrishna Appicharla, Soumya Chennabasavaraj, Nikesh Garera, Asif Ekbal, Muthusamy Chelliah

Abstract: Community Question-Answering (CQA) portals serve as a valuable tool for helping users within an organization. However, making them accessible to non-English-speaking users continues to be a challenge. Translating questions can broaden the community's reach, benefiting individuals with similar inquiries in various languages. Translating questions using Neural Machine Translation (NMT) poses more challenges, especially in noisy environments, where the grammatical correctness of the questions is not monitored. These questions may be phrased as statements by non-native speakers, with incorrect subject-verb order and sometimes even missing question marks. Creating a synthetic parallel corpus from such data is also difficult due to its noisy nature. To address this issue, we propose a training methodology that fine-tunes the NMT system using only source-side data. Our approach balances adequacy and fluency by utilizing a loss function that combines BERTScore and Masked Language Model (MLM) Score. Our method surpasses the conventional Maximum Likelihood Estimation (MLE) based fine-tuning approach, which relies on synthetic target data, by achieving a 1.9 BLEU score improvement. Our model remains robust when we add noise to our baseline, still achieving a 1.1 BLEU improvement and large improvements on the TER and BLEURT metrics. Our proposed methodology is model-agnostic and is only necessary during the training phase. We make the code and datasets publicly available at https://www.iitp.ac.in/~ai-nlp-ml/resources.html#DomainAdapt to facilitate further research.

* Published at: Findings of EMNLP 2023
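
The abstract above describes a loss that balances adequacy (BERTScore) and fluency (MLM score) without giving its exact form. The sketch below shows one common way such a balance could be expressed: a weighted mix of the two scores used as a reward in a REINFORCE-style loss. Everything here is an assumption made for illustration, not the authors' formulation: the function names, the weight `alpha`, the `baseline` term, and the choice of a policy-gradient loss. The two scores are passed in as plain numbers rather than computed by any real scoring library.

```python
def combined_reward(adequacy, fluency, alpha=0.5):
    """Weighted mix of an adequacy score and a fluency score (both assumed in [0, 1])."""
    return alpha * adequacy + (1.0 - alpha) * fluency

def reward_weighted_loss(log_prob, reward, baseline=0.0):
    """REINFORCE-style loss for one sampled translation.

    log_prob is the model's log-probability of the sampled translation.
    Minimising the loss raises log_prob when reward exceeds the baseline
    and lowers it when reward falls below the baseline.
    """
    return -(reward - baseline) * log_prob

# Hypothetical scores for one sampled translation.
reward = combined_reward(adequacy=0.8, fluency=0.6, alpha=0.5)
loss = reward_weighted_loss(log_prob=-2.0, reward=reward, baseline=0.5)
```

Because neither score needs a reference translation, a setup of this kind can be trained on source-side data alone, which matches the motivation stated in the abstract.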

Impact of Visual Context on Noisy Multimodal NMT: An Empirical Study for English to Indian Languages

Aug 30, 2023

Baban Gain, Dibyanayan Bandyopadhyay, Samrat Mukherjee, Chandranath Adak, Asif Ekbal

Abstract: The study investigates the effectiveness of utilizing multimodal information in Neural Machine Translation (NMT). While prior research focused on using multimodal data in low-resource scenarios, this study examines how image features impact translation when added to a large-scale, pre-trained unimodal NMT system. Surprisingly, the study finds that images might be redundant in this context. Additionally, the research introduces synthetic noise to assess whether images help the model deal with textual noise. Multimodal models slightly outperform text-only models in noisy settings, even with random images. The study's experiments translate from English to Hindi, Bengali, and Malayalam, outperforming state-of-the-art benchmarks significantly. Interestingly, the effect of visual context varies with source text noise: no visual context works best for non-noisy translations, cropped image features are optimal for low noise, and full image features work better in high-noise scenarios. This sheds light on the role of visual context, especially in noisy settings, opening up a new research direction for Noisy Neural Machine Translation in multimodal setups. The research emphasizes the importance of combining visual and textual information for improved translation in various environments.

A Case Study on Context Encoding in Multi-Encoder based Document-Level Neural Machine Translation

Aug 11, 2023

Ramakrishna Appicharla, Baban Gain, Santanu Pal, Asif Ekbal

Abstract: Recent studies have shown that multi-encoder models are agnostic to the choice of context, and that the context encoder generates noise which helps improve the models in terms of BLEU score. In this paper, we further explore this idea by evaluating on a context-aware pronoun translation test set, using multi-encoder models trained on three different context settings, viz., the previous two sentences, two random sentences, and a mix of both as context. Specifically, we evaluate the models on the ContraPro test set to study how different contexts affect pronoun translation accuracy. The results show that the model can perform well on the ContraPro test set even when the context is random. We also analyze the source representations to study whether the context encoder generates noise. Our analysis shows that the context encoder provides sufficient information to learn discourse-level information. Additionally, we observe that mixing the selected context (the previous two sentences in this case) and the random context is generally better than the other settings.

* Accepted to MT Summit 2023 (oral)
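
The three context settings named in the abstract above (previous two sentences, two random sentences, and a mix of both) can be illustrated with a short sketch. The helper `build_context` is hypothetical. It assumes random sentences are drawn from the same document and that the mixed setting picks one of the two strategies per training example; the abstract specifies neither detail.

```python
import random

def build_context(doc, idx, mode, rng, k=2):
    """Return up to k context sentences for doc[idx] under one of three settings."""
    if mode == "mixed":
        # Assumption: choose between the two strategies independently per example.
        mode = rng.choice(["previous", "random"])
    if mode == "previous":
        # The k sentences immediately before the current one (fewer at document start).
        return doc[max(0, idx - k):idx]
    if mode == "random":
        # k sentences sampled from the rest of the document, excluding the current one.
        candidates = [s for j, s in enumerate(doc) if j != idx]
        return rng.sample(candidates, min(k, len(candidates)))
    raise ValueError(f"unknown context mode: {mode}")

doc = ["s0", "s1", "s2", "s3", "s4"]
rng = random.Random(0)  # seeded so the sampling is repeatable
previous_ctx = build_context(doc, 3, "previous", rng)
random_ctx = build_context(doc, 3, "random", rng)
mixed_ctx = build_context(doc, 3, "mixed", rng)
```

Comparing a model trained with `previous` context against one trained with `random` context is what lets a study of this kind test whether the context encoder uses the context or merely acts as a noise source.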

IITP at WAT 2021: System description for English-Hindi Multimodal Translation Task

Jul 04, 2021

Baban Gain, Dibyanayan Bandyopadhyay, Asif Ekbal

Abstract: Neural Machine Translation (NMT) is a predominant machine translation technology nowadays because of its end-to-end trainable flexibility. However, NMT still struggles to translate properly in low-resource settings, specifically on distant language pairs. One way to overcome this is to use information from other modalities, if available. The idea is that despite differences in languages, both the source and target language speakers see the same thing and the visual representation of both the source and target is the same, which can positively assist the system. Multimodal information can help the NMT system improve the translation by removing ambiguity in some phrases or words. We participate in the 8th Workshop on Asian Translation (WAT 2021) for the English-Hindi multimodal translation task and achieve 42.47 and 37.50 BLEU points on the Evaluation and Challenge subsets, respectively.

IITP at AILA 2019: System Report for Artificial Intelligence for Legal Assistance Shared Task

May 24, 2021

Baban Gain, Dibyanayan Bandyopadhyay, Arkadipta De, Tanik Saikh, Asif Ekbal

Abstract: In this article, we describe the systems we submitted to the shared task on Artificial Intelligence for Legal Assistance (AILA 2019), an integral event of the Forum for Information Retrieval Evaluation 2019. The outcomes of this track would be helpful for automating the working process of the Indian Judiciary System. The manual working procedures and documentation at every level of the judiciary system (from lower to higher courts) are very complex in nature. The systems produced as a part of this track would assist law practitioners and would be helpful for the general public too. This kind of track also opens a path for research on Natural Language Processing (NLP) in the judicial domain. The track defined two problems: Task 1, identifying relevant prior cases for a given situation, and Task 2, identifying the most relevant statutes for a given situation. We tackled both of them. Our proposed approaches are based on BM25 and Doc2Vec. As per the results declared by the task organizers, we placed 3rd in Task 1 and obtained a modest position in Task 2.
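
As a rough illustration of the BM25 side of such a retrieval system, the sketch below scores a query against a small set of documents with the standard Okapi BM25 formula. It is a generic textbook version and not the authors' submitted system: the parameter values `k1=1.5` and `b=0.75` are common defaults chosen here as an assumption, the whitespace tokenization is deliberately naive, and the toy "prior cases" are invented.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1.0" keeps it positive even for very common terms.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalisation.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Invented toy "prior cases", tokenized on whitespace.
cases = [
    "appeal against conviction for theft of property".split(),
    "dispute over land ownership and property boundary".split(),
    "bail application in a murder case".split(),
]
query = "theft conviction appeal".split()
scores = bm25_scores(query, cases)
ranking = sorted(range(len(cases)), key=lambda i: scores[i], reverse=True)
```

Only the first case shares terms with the query, so it ranks first and the other two score zero. A Doc2Vec-based ranker would instead compare dense document vectors, which can surface relevant cases that share no exact terms with the query.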
