Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Osama Mohanned Afzal

M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection

Feb 17, 2024

Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Osama Mohanned Afzal, Tarek Mahmoud, Giovanni Puccetti, Thomas Arnold(+4 more)

Figure 1 for M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection

Figure 2 for M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection

Figure 3 for M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection

Figure 4 for M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection

Abstract:The advent of Large Language Models (LLMs) has brought an unprecedented surge in machine-generated text (MGT) across diverse channels. This raises legitimate concerns about its potential misuse and societal implications. The need to identify and differentiate such content from genuine human-generated text is critical in combating disinformation, preserving the integrity of education and scientific fields, and maintaining trust in communication. In this work, we address this problem by introducing a new benchmark involving multilingual, multi-domain and multi-generator for MGT detection -- M4GT-Bench. It is collected for three task formulations: (1) mono-lingual and multi-lingual binary MGT detection; (2) multi-way detection identifies which particular model generates the text; and (3) human-machine mixed text detection, where a word boundary delimiting MGT from human-written content should be determined. Human evaluation for Task 2 shows less than random guess performance, demonstrating the challenges to distinguish unique LLMs. Promising results always occur when training and test data distribute within the same domain or generators.

* 28 pages

Via

Access Paper or Ask Questions