Christophe Servan

STL, ILES

Small Language Models are Good Too: An Empirical Study of Zero-Shot Classification

Apr 17, 2024

Pierre Lepagnol, Thomas Gerald, Sahar Ghannay, Christophe Servan, Sophie Rosset

Abstract: This study is part of the debate on the efficiency of large versus small language models for text classification by prompting. We assess the performance of small language models in zero-shot text classification, challenging the prevailing dominance of large models. Across 15 datasets, our investigation benchmarks language models from 77M to 40B parameters using different architectures and scoring functions. Our findings reveal that small models can effectively classify texts, getting on par with or surpassing their larger counterparts. We developed and shared a comprehensive open-source repository that encapsulates our methodologies. This research underscores the notion that bigger isn't always better, suggesting that resource-efficient small models may offer viable solutions for specific data classification challenges.

* LREC-COLING 2024, May 2024, TURIN, Italy

New Semantic Task for the French Spoken Language Understanding MEDIA Benchmark

Mar 28, 2024

Nadège Alavoine, Gaëlle Laperriere, Christophe Servan, Sahar Ghannay, Sophie Rosset

Abstract: Intent classification and slot-filling are essential tasks of Spoken Language Understanding (SLU). In most SLU systems, those tasks are realized by independent modules. For about fifteen years, models achieving both of them jointly and exploiting their mutual enhancement have been proposed. A multilingual module using a joint model was envisioned to create a touristic dialogue system for a European project, HumanE-AI-Net. A combination of multiple datasets, including the MEDIA dataset, was suggested for training this joint model. The MEDIA SLU dataset is a French dataset distributed since 2005 by ELRA, mainly used by the French research community and free for academic research since 2020. Unfortunately, it is annotated only in slots but not intents. An enhanced version of MEDIA annotated with intents has been built to extend its use to more tasks and use cases. This paper presents the semi-automatic methodology used to obtain this enhanced version. In addition, we present the first results of SLU experiments on this enhanced dataset using joint models for intent classification and slot-filling.

* The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), May 2024, Torino, Italy

A Benchmark Evaluation of Clinical Named Entity Recognition in French

Mar 28, 2024

Nesrine Bannour, Christophe Servan, Aurélie Névéol, Xavier Tannier

Abstract: Background: Transformer-based language models have shown strong performance on many Natural Language Processing (NLP) tasks. Masked Language Models (MLMs) attract sustained interest because they can be adapted to different languages and sub-domains through training or fine-tuning on specific corpora while remaining lighter than modern Large Language Models (LLMs). Recently, several MLMs have been released for the biomedical domain in French, and experiments suggest that they outperform standard French counterparts. However, no systematic evaluation comparing all models on the same corpora is available. Objective: This paper presents an evaluation of masked language models for biomedical French on the task of clinical named entity recognition. Material and methods: We evaluate biomedical models CamemBERT-bio and DrBERT and compare them to standard French models CamemBERT, FlauBERT and FrALBERT as well as multilingual mBERT using three publicly available corpora for clinical named entity recognition in French. The evaluation set-up relies on gold-standard corpora as released by the corpus developers. Results: Results suggest that CamemBERT-bio outperforms DrBERT consistently while FlauBERT offers competitive performance and FrALBERT achieves the lowest carbon footprint. Conclusion: This is the first benchmark evaluation of biomedical masked language models for French clinical entity recognition that compares model performance consistently on nested entity recognition using metrics covering performance and environmental impact.

* The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), May 2024, Torino, Italy

mALBERT: Is a Compact Multilingual BERT Model Still Worth It?

Mar 27, 2024

Christophe Servan, Sahar Ghannay, Sophie Rosset

Abstract: Within the current trend of Pretrained Language Models (PLMs), more and more criticism is emerging about the ethical and ecological impact of such models. In this article, considering these critical remarks, we propose to focus on smaller models, such as compact models like ALBERT, which are more ecologically virtuous than these PLMs. However, PLMs enable huge breakthroughs in Natural Language Processing tasks, such as Spoken and Natural Language Understanding, classification, and Question-Answering tasks. PLMs also have the advantage of being multilingual, and, as far as we know, a multilingual version of compact ALBERT models does not exist. Considering these facts, we propose the free release of the first version of a multilingual compact ALBERT model, pre-trained using Wikipedia data, which complies with the ethical aspect of such a language model. We also evaluate the model against classical multilingual PLMs in classical NLP tasks. Finally, this paper proposes a rare study on the impact of subword tokenization on language performance.

* The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, May 2024, Torino, Italy

On the cross-lingual transferability of multilingual prototypical models across NLU tasks

Jul 19, 2022

Oralie Cattan, Christophe Servan, Sophie Rosset

Abstract: Supervised deep learning-based approaches have been applied to task-oriented dialog and have proven to be effective for limited domain and language applications when a sufficient number of training examples are available. In practice, these approaches suffer from the drawbacks of domain-driven design and under-resourced languages. Domain and language models are supposed to grow and change as the problem space evolves. On one hand, research on transfer learning has demonstrated the cross-lingual ability of multilingual Transformers-based models to learn semantically rich representations. On the other, in addition to the above approaches, meta-learning has enabled the development of task and language learning algorithms capable of far generalization. In this context, this article proposes to investigate the cross-lingual transferability of synergistically combining few-shot learning with prototypical neural networks and multilingual Transformers-based models. Experiments in natural language understanding tasks on the MultiATIS++ corpus show that our approach substantially improves the observed transfer learning performances between the low and the high resource languages. More generally, our approach confirms that the meaningful latent space learned in a given language can be generalized to unseen and under-resourced ones using meta-learning.

* Accepted to the ACL workshop METANLP 2021
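
The abstract above relies on prototypical neural networks for few-shot learning. As a rough illustration of that general idea only (not the authors' implementation), the sketch below averages the support embeddings of each class into a prototype and assigns a query to the nearest prototype by squared Euclidean distance. The two-dimensional vectors and the intent labels are made-up toy values; in practice the embeddings would come from an encoder such as a multilingual Transformer.

```python
from collections import defaultdict


def class_prototypes(support):
    """Average the support embeddings of each class into one prototype."""
    grouped = defaultdict(list)
    for label, vector in support:
        grouped[label].append(vector)
    return {
        label: [sum(dim) / len(vectors) for dim in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query, prototypes):
    """Return the label whose prototype is closest to the query embedding."""
    return min(prototypes, key=lambda label: squared_distance(query, prototypes[label]))


# Toy support set: (label, embedding) pairs. Hypothetical values for illustration.
support = [
    ("flight", [1.0, 0.0]),
    ("flight", [0.8, 0.2]),
    ("airfare", [0.0, 1.0]),
    ("airfare", [0.2, 0.8]),
]

prototypes = class_prototypes(support)
prediction = classify([0.7, 0.1], prototypes)
print(prediction)
```

Because the prototypes are recomputed from whatever support set is supplied, the same classifier can be pointed at a new language or label set by swapping in a handful of labeled examples, which is the property few-shot transfer builds on.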

Benchmarking Transformers-based models on French Spoken Language Understanding tasks

Jul 19, 2022

Oralie Cattan, Sahar Ghannay, Christophe Servan, Sophie Rosset

Abstract: In the last five years, the rise of the self-attentional Transformer-based architectures led to state-of-the-art performances over many natural language tasks. Although these approaches are increasingly popular, they require large amounts of data and computational resources. There is still a substantial need for benchmarking methodologies ever upwards on under-resourced languages in data-scarce application conditions. Most pre-trained language models were massively studied using the English language and only a few of them were evaluated on French. In this paper, we propose a unified benchmark, focused on evaluating model quality and ecological impact on two well-known French spoken language understanding tasks. In particular, we benchmark thirteen well-established Transformer-based models on the two available spoken language understanding tasks for French: MEDIA and ATIS-FR. Within this framework, we show that compact models can reach comparable results to bigger ones while their ecological impact is considerably lower. However, this assumption is nuanced and depends on the considered compression method.

* Accepted paper at INTERSPEECH 2022

On the Usability of Transformers-based models for a French Question-Answering task

Jul 19, 2022

Oralie Cattan, Christophe Servan, Sophie Rosset

Abstract: For many tasks, state-of-the-art results have been achieved with Transformer-based architectures, resulting in a paradigmatic shift in practices from the use of task-specific architectures to the fine-tuning of pre-trained language models. The ongoing trend consists in training models with an ever-increasing amount of data and parameters, which requires considerable resources. It leads to a strong search to improve resource efficiency based on algorithmic and hardware improvements evaluated only for English. This raises questions about their usability when applied to small-scale learning problems, for which a limited amount of training data is available, especially for under-resourced language tasks. The lack of appropriately sized corpora is a hindrance to applying data-driven and transfer learning-based approaches with strong instability cases. In this paper, we establish a state of the art of the efforts dedicated to the usability of Transformer-based models and propose to evaluate these improvements on question-answering performance for French, a language with few resources. We address the instability relating to data scarcity by investigating various training strategies with data augmentation, hyperparameter optimization and cross-lingual transfer. We also introduce a new compact model for French, FrALBERT, which proves to be competitive in low-resource settings.

* French compact model paper: FrALBERT, Accepted to RANLP 2021

Using Whole Document Context in Neural Machine Translation

Oct 16, 2019

Valentin Macé, Christophe Servan

Abstract: In Machine Translation, considering the document as a whole can help to resolve ambiguities and inconsistencies. In this paper, we propose a simple yet promising approach to add contextual information in Neural Machine Translation. We present a method to add source context that captures the whole document with accurate boundaries, taking every word into account. We provide this additional information to a Transformer model and study the impact of our method on three language pairs. The proposed approach obtains promising results in the English-German, English-French and French-English document-level translation tasks. We observe interesting cross-sentential behaviors where the model learns to use document-level information to improve translation coherence.

* Accepted paper to IWSLT2019

Qwant Research @DEFT 2019: Document matching and information retrieval using clinical cases

Jul 06, 2019

Estelle Maudet, Oralie Cattan, Maureen de Seyssel, Christophe Servan

Abstract: This paper reports on Qwant Research's contribution to tasks 2 and 3 of the DEFT 2019 challenge, focusing on the analysis of French clinical cases. Task 2 is a task on semantic similarity between clinical cases and discussions. For this task, we propose an approach based on language models and evaluate the impact on the results of different preprocessing and matching techniques. For task 3, we have developed an information extraction system yielding very encouraging results accuracy-wise. We have experimented with two different approaches, one based on the exclusive use of neural networks, the other based on a linguistic analysis.

* DEFT 2019
* Article accepted at the workshop DEfi fouille de Texte (DEFT 2019). Article in French

Image search using multilingual texts: a cross-modal learning approach between image and text

May 14, 2019

Maxime Portaz, Hicham Randrianarivo, Adrien Nivaggioli, Estelle Maudet, Christophe Servan, Sylvain Peyronnet

Abstract: Multilingual (or cross-lingual) embeddings represent several languages in a unique vector space. Using a common embedding space enables shared semantics between words from different languages. In this paper, we propose to embed images and texts into a unique distributional vector space, making it possible to search images by using text queries expressing information needs related to the (visual) content of images, as well as using image similarity. Our framework forces the representation of an image to be similar to the representation of the text that describes it. Moreover, by using multilingual embeddings we ensure that words from two different languages have close descriptors and thus are attached to similar images. We provide experimental evidence of the efficiency of our approach by evaluating it on two datasets: Common Objects in COntext (COCO) [19] and Multi30K [7].
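
To make the retrieval idea in this abstract concrete, here is a minimal sketch of text-to-image search in a shared vector space, ranking images by cosine similarity to a query embedding. It is an illustration under assumptions, not the paper's system: the file names, the three-dimensional vectors and the `search_images` helper are hypothetical, and real embeddings would be produced by trained image and multilingual text encoders.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def search_images(query_vector, image_index, top_k=1):
    """Return the names of the top_k images most similar to the query."""
    ranked = sorted(
        image_index,
        key=lambda name: cosine(query_vector, image_index[name]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy image embeddings in the shared space (hypothetical values).
image_index = {
    "dog.jpg": [0.9, 0.1, 0.0],
    "car.jpg": [0.0, 0.2, 0.9],
}

# Toy multilingual word embeddings: an English and a French word for the
# same concept are assumed to lie close together in the shared space.
text_queries = {
    "dog": [1.0, 0.0, 0.1],
    "chien": [0.95, 0.05, 0.1],
}

results = {word: search_images(vec, image_index) for word, vec in text_queries.items()}
print(results)
```

In this toy setup both queries retrieve the same image because their vectors are nearly identical, which mirrors the abstract's point that close multilingual descriptors end up attached to similar images.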
