Juan Junqueras

FC-CONAN: An Exhaustively Paired Dataset for Robust Evaluation of Retrieval Systems

Jan 04, 2026

Juan Junqueras, Florian Boudin, May-Myo Zin, Ha-Thanh Nguyen, Wachara Fungwacharakorn, Damián Ariel Furman, Akiko Aizawa, Ken Satoh

Abstract: Hate speech (HS) is a critical issue in online discourse, and one promising strategy to counter it is through the use of counter-narratives (CNs). Datasets linking HS with CNs are essential for advancing counterspeech research. However, even flagship resources like CONAN (Chung et al., 2019) annotate only a sparse subset of all possible HS-CN pairs, limiting evaluation. We introduce FC-CONAN (Fully Connected CONAN), the first dataset created by exhaustively considering all combinations of 45 English HS messages and 129 CNs. A two-stage annotation process involving nine annotators and four validators produces four partitions (Diamond, Gold, Silver, and Bronze) that balance reliability and scale. None of the labeled pairs overlap with CONAN, uncovering hundreds of previously unlabelled positives. FC-CONAN enables more faithful evaluation of counterspeech retrieval systems and facilitates detailed error analysis. The dataset is publicly available.

* Presented at NeLaMKRR@KR, 2025 (arXiv:2511.09575)
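
As a rough illustration of what "exhaustively considering all combinations" implies in scale, the sketch below enumerates every HS-CN pair for 45 messages and 129 counter-narratives. The identifiers (`HS01`, `CN001`, ...) and the helper `all_pairs` are hypothetical placeholders, not the dataset's actual format or the authors' code.

```python
from itertools import product

# Hypothetical placeholder IDs; the real dataset holds message text,
# and its actual identifiers or file layout are not assumed here.
hs_ids = [f"HS{i:02d}" for i in range(1, 46)]    # 45 hate speech messages
cn_ids = [f"CN{j:03d}" for j in range(1, 130)]   # 129 counter-narratives


def all_pairs(hs, cn):
    """Return every (HS, CN) combination, i.e. the Cartesian product."""
    return list(product(hs, cn))


pairs = all_pairs(hs_ids, cn_ids)
print(len(pairs))  # 45 * 129 = 5805 candidate pairs
```

Each of these candidate pairs would need a label for the pairing to be exhaustive, which is what distinguishes this setup from a sparsely annotated resource.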

Mining Reasons For And Against Vaccination From Unstructured Data Using Nichesourcing and AI Data Augmentation

Jun 28, 2024

Damián Ariel Furman, Juan Junqueras, Z. Burçe Gümüslü, Edgar Altszyler, Joaquin Navajas, Ophelia Deroy, Justin Sulik

Abstract: We present Reasons For and Against Vaccination (RFAV), a dataset for predicting reasons for and against vaccination, and scientific authorities used to justify them, annotated through nichesourcing and augmented using GPT4 and GPT3.5-Turbo. We show that it is possible to mine these reasons in unstructured text, under different task definitions, despite the high level of subjectivity involved, and we explore the impact of artificially augmented data using in-context learning with GPT4 and GPT3.5-Turbo. We publish the dataset and the trained models along with the annotation manual used to train annotators and define the task.

* 8 pages + references and appendix
