Mohammad Gharachorloo

Leveraging ParsBERT and Pretrained mT5 for Persian Abstractive Text Summarization

Dec 21, 2020

Mehrdad Farahani, Mohammad Gharachorloo, Mohammad Manthouri

Abstract: Text summarization is one of the most critical Natural Language Processing (NLP) tasks. More and more research is conducted in this field every day. Pre-trained transformer-based encoder-decoder models have begun to gain popularity for these tasks. This paper proposes two methods to address this task and introduces a novel dataset named pn-summary for Persian abstractive text summarization. The models employed in this paper are mT5 and an encoder-decoder version of the ParsBERT model (i.e., a monolingual BERT model for Persian). These models are fine-tuned on the pn-summary dataset. The current work is the first of its kind and, by achieving promising results, can serve as a baseline for any future work.

* 7 pages, 7 figures, 3 tables, csicc2021 conference

ParsBERT: Transformer-based Model for Persian Language Understanding

May 31, 2020

Mehrdad Farahani, Mohammad Gharachorloo, Marzieh Farahani, Mohammad Manthouri

Abstract: The surge of pre-trained language models has begun a new era in the field of Natural Language Processing (NLP) by allowing us to build powerful language models. Among these models, Transformer-based models such as BERT have become increasingly popular due to their state-of-the-art performance. However, these models are usually focused on English, leaving other languages to multilingual models with limited resources. This paper proposes a monolingual BERT for the Persian language (ParsBERT), which achieves state-of-the-art performance compared to other architectures and multilingual models. Also, since the amount of data available for NLP tasks in Persian is very restricted, a massive dataset is composed for different NLP tasks as well as for pre-training the model. ParsBERT obtains higher scores on all datasets, including both existing and newly composed ones, and improves the state-of-the-art performance by outperforming both multilingual BERT and other prior works in Sentiment Analysis, Text Classification, and Named Entity Recognition tasks.

* 10 pages, 5 figures, 7 tables, table 7 corrected and some refs related to table 7
