Sheng Liang

From Classification to Generation: Insights into Crosslingual Retrieval Augmented ICL

Nov 14, 2023

Xiaoqian Li, Ercong Nie, Sheng Liang

Abstract: The remarkable ability of Large Language Models (LLMs) to understand and follow instructions has sometimes been limited by their in-context learning (ICL) performance in low-resource languages. To address this, we introduce a novel approach that leverages cross-lingual retrieval-augmented in-context learning (CREA-ICL). By extracting semantically similar prompts from high-resource languages, we aim to improve the zero-shot performance of multilingual pre-trained language models (MPLMs) across diverse tasks. Though our approach yields steady improvements in classification tasks, it faces challenges in generation tasks. Our evaluation offers insights into the performance dynamics of retrieval-augmented in-context learning across both classification and generation domains.

* In The Workshop on Instruction Tuning and Instruction Following, held in conjunction with The Conference on NeurIPS 2023, December 2023. arXiv admin note: text overlap with arXiv:2311.00587

Crosslingual Retrieval Augmented In-context Learning for Bangla

Nov 01, 2023

Xiaoqian Li, Ercong Nie, Sheng Liang

Abstract: The promise of Large Language Models (LLMs) in Natural Language Processing has often been overshadowed by their limited performance in low-resource languages such as Bangla. To address this, our paper presents a pioneering approach that utilizes cross-lingual retrieval-augmented in-context learning. By strategically sourcing semantically similar prompts from a high-resource language, we enable multilingual pretrained language models (MPLMs), especially the generative model BLOOMZ, to successfully boost performance on Bangla tasks. Our extensive evaluation highlights that the cross-lingual retrieval-augmented prompts bring steady improvements to MPLMs over the zero-shot performance.

* In The 1st Bangla Language Processing (BLP) Workshop, held in conjunction with The Conference on Empirical Methods in Natural Language Processing (EMNLP), December 2023

Empirical study of pretrained multilingual language models for zero-shot cross-lingual generation

Oct 15, 2023

Nadezhda Chirkova, Sheng Liang, Vassilina Nikoulina

Abstract: Zero-shot cross-lingual generation assumes finetuning a multilingual pretrained language model (mPLM) on a generation task in one language and then using it to make predictions for this task in other languages. Previous works note a frequent problem of generation in the wrong language and propose approaches to address it, usually using mT5 as a backbone model. In this work, we test alternative mPLMs, such as mBART and NLLB, considering full finetuning and parameter-efficient finetuning with adapters. We find that mBART with adapters performs similarly to mT5 of the same size, and NLLB can be competitive in some cases. We also underline the importance of tuning the learning rate used for finetuning, which helps to alleviate the problem of generation in the wrong language.

Cross-Lingual Retrieval Augmented Prompt for Low-Resource Languages

Dec 19, 2022

Ercong Nie, Sheng Liang, Helmut Schmid, Hinrich Schütze

Abstract: Multilingual Pretrained Language Models (MPLMs) have shown their strong multilinguality in recent empirical cross-lingual transfer studies. In this paper, we propose the Prompts Augmented by Retrieval Crosslingually (PARC) pipeline to improve the zero-shot performance on low-resource languages (LRLs) by augmenting the context with semantically similar sentences retrieved from a high-resource language (HRL) as prompts. PARC improves the zero-shot performance on three downstream tasks (binary sentiment classification, topic categorization and natural language inference) with multilingual parallel test sets across 10 LRLs covering 6 language families in both unlabeled settings (+5.1%) and labeled settings (+16.3%). PARC-labeled also outperforms the finetuning baseline by 3.7%. We find a significant positive correlation between cross-lingual transfer performance on one side, and the similarity between the high- and low-resource languages as well as the amount of low-resource pretraining data on the other side. A robustness analysis suggests that PARC has the potential to achieve even stronger performance with more powerful MPLMs.
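
The retrieve-then-prompt idea described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the character-trigram `embed` function is a toy stand-in for a multilingual sentence encoder, the prompt template is hypothetical, and only the labeled setting is shown.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a multilingual sentence encoder: character-trigram counts."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, pool, k=1):
    """Return the k (sentence, label) pairs from the pool most similar to the query."""
    q = embed(query)
    ranked = sorted(pool, key=lambda item: cosine(q, embed(item[0])), reverse=True)
    return ranked[:k]


def build_prompt(query, retrieved, template="{text} In summary, it was {label}."):
    """Prepend the retrieved labeled examples, then add the query with a masked label."""
    lines = [template.format(text=s, label=l) for s, l in retrieved]
    lines.append(template.format(text=query, label="[MASK]"))
    return "\n".join(lines)


# Hypothetical high-resource (English) pool with sentiment labels.
pool = [
    ("This film is fantastic", "positive"),
    ("The food was terrible", "negative"),
]
# Hypothetical input sentence in another language.
query = "Film ini fantastis"

retrieved = retrieve(query, pool, k=1)
prompt = build_prompt(query, retrieved)
print(prompt)
```

In a realistic setup the prompt would be passed to an MPLM that fills in the masked label; the trigram similarity here only works because the toy sentences share surface strings, which a real cross-lingual encoder would not need.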

Modular and Parameter-Efficient Multimodal Fusion with Prompting

Mar 15, 2022

Sheng Liang, Mengjie Zhao, Hinrich Schütze

Abstract: Recent research has made impressive progress in large-scale multimodal pre-training. In the context of the rapid growth of model size, it is necessary to seek efficient and flexible methods other than finetuning. In this paper, we propose to use prompt vectors to align the modalities. Our method achieves comparable performance to several other multimodal fusion methods in low-resource settings. We further show that our method is modular and parameter-efficient for processing tasks involving two or more data modalities.

* Accepted to Findings of ACL 2022

Locating Language-Specific Information in Contextualized Embeddings

Sep 16, 2021

Sheng Liang, Philipp Dufter, Hinrich Schütze

Abstract: Multilingual pretrained language models (MPLMs) exhibit multilinguality and are well suited for transfer across languages. Most MPLMs are trained in an unsupervised fashion and the relationship between their objective and multilinguality is unclear. More specifically, the question arises whether MPLM representations are language-agnostic or simply interleave well with learned task prediction heads. In this work, we locate language-specific information in MPLMs and identify its dimensionality and the layers where this information occurs. We show that language-specific information is scattered across many dimensions, which can be projected into a linear subspace. Our study contributes to a better understanding of MPLM representations, going beyond treating them as unanalyzable blobs of information.
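
To make the notion of a linear subspace carrying language-specific information concrete, here is a small sketch of one simple possibility. It is not the method of the paper: it assumes just two languages, takes the difference of their mean embeddings as a one-dimensional "language direction", and projects that direction out. The vectors and all function names are made up for illustration.

```python
def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def language_direction(emb_a, emb_b):
    """Unit vector along the difference of the two languages' mean embeddings."""
    diff = [a - b for a, b in zip(mean(emb_a), mean(emb_b))]
    norm = dot(diff, diff) ** 0.5
    return [x / norm for x in diff]


def remove_direction(vec, direction):
    """Project vec onto the orthogonal complement of a unit direction."""
    scale = dot(vec, direction)
    return [x - scale * d for x, d in zip(vec, direction)]


# Made-up 3-dimensional "embeddings" for two languages.
english = [[1.0, 0.2, 3.0], [1.2, -0.1, 3.0]]
german = [[0.9, 0.1, -3.0], [1.1, 0.0, -3.0]]

direction = language_direction(english, german)
cleaned = [remove_direction(v, direction) for v in english + german]
```

After the projection, no vector in `cleaned` has any component along `direction`, so the two languages can no longer be separated along that axis. With more than two languages, the same idea would need a higher-dimensional subspace.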
