Title: Paraphrase Identification with Deep Learning: A Review of Datasets and Methods

Dec 13, 2022

Chao Zhou, Cheng Qiu, Daniel E. Acuna

Abstract: The rapid advancement of AI technology has made text generation tools like GPT-3 and ChatGPT increasingly accessible, scalable, and effective. This can pose a serious threat to the credibility of various forms of media, including scientific literature and news sources, if these technologies are used for plagiarism. Despite the development of automated methods for paraphrase identification, detecting this type of plagiarism remains a challenge due to the disparate nature of the datasets on which these methods are trained. In this study, we review traditional and current approaches to paraphrase identification and propose a refined typology of paraphrases. We also investigate how this typology is represented in popular datasets and how under-representation of certain types of paraphrases impacts detection capabilities. Finally, we outline new directions for future research and datasets in the pursuit of more effective paraphrase detection using AI.

* 36 pages, 2 figures, 6 tables, 173 references
