Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Michael Soprano

In Crowd Veritas: Leveraging Human Intelligence To Fight Misinformation

Jun 10, 2025

Michael Soprano

Abstract:The spread of online misinformation poses serious threats to democratic societies. Traditionally, expert fact-checkers verify the truthfulness of information through investigative processes. However, the volume and immediacy of online content present major scalability challenges. Crowdsourcing offers a promising alternative by leveraging non-expert judgments, but it introduces concerns about bias, accuracy, and interpretability. This thesis investigates how human intelligence can be harnessed to assess the truthfulness of online information, focusing on three areas: misinformation assessment, cognitive biases, and automated fact-checking systems. Through large-scale crowdsourcing experiments and statistical modeling, it identifies key factors influencing human judgments and introduces a model for the joint prediction and explanation of truthfulness. The findings show that non-expert judgments often align with expert assessments, particularly when factors such as timing and experience are considered. By deepening our understanding of human judgment and bias in truthfulness assessment, this thesis contributes to the development of more transparent, trustworthy, and interpretable systems for combating misinformation.

* PhD thesis, University of Udine, defended May 2023, 458 pages

Via

Access Paper or Ask Questions

Collecting Cost-Effective, High-Quality Truthfulness Assessments with LLM Summarized Evidence

Jan 30, 2025

Kevin Roitero, Dustin Wright, Michael Soprano, Isabelle Augenstein, Stefano Mizzaro

Figure 1 for Collecting Cost-Effective, High-Quality Truthfulness Assessments with LLM Summarized Evidence

Figure 2 for Collecting Cost-Effective, High-Quality Truthfulness Assessments with LLM Summarized Evidence

Figure 3 for Collecting Cost-Effective, High-Quality Truthfulness Assessments with LLM Summarized Evidence

Figure 4 for Collecting Cost-Effective, High-Quality Truthfulness Assessments with LLM Summarized Evidence

Abstract:With the degradation of guardrails against mis- and disinformation online, it is more critical than ever to be able to effectively combat it. In this paper, we explore the efficiency and effectiveness of using crowd-sourced truthfulness assessments based on condensed, large language model (LLM) generated summaries of online sources. We compare the use of generated summaries to the use of original web pages in an A/B testing setting, where we employ a large and diverse pool of crowd-workers to perform the truthfulness assessment. We evaluate the quality of assessments, the efficiency with which assessments are performed, and the behavior and engagement of participants. Our results demonstrate that the Summary modality, which relies on summarized evidence, offers no significant change in assessment accuracy over the Standard modality, while significantly increasing the speed with which assessments are performed. Workers using summarized evidence produce a significantly higher number of assessments in the same time frame, reducing the cost needed to acquire truthfulness assessments. Additionally, the Summary modality maximizes both the inter-annotator agreements as well as the reliance on and perceived usefulness of evidence, demonstrating the utility of summarized evidence without sacrificing the quality of assessments.

* 18 pages; 7 figures; 5 tables

Via

Access Paper or Ask Questions

The Many Dimensions of Truthfulness: Crowdsourcing Misinformation Assessments on a Multidimensional Scale

Aug 23, 2021

Michael Soprano, Kevin Roitero, David La Barbera, Davide Ceolin, Damiano Spina, Stefano Mizzaro, Gianluca Demartini

Figure 1 for The Many Dimensions of Truthfulness: Crowdsourcing Misinformation Assessments on a Multidimensional Scale

Figure 2 for The Many Dimensions of Truthfulness: Crowdsourcing Misinformation Assessments on a Multidimensional Scale

Figure 3 for The Many Dimensions of Truthfulness: Crowdsourcing Misinformation Assessments on a Multidimensional Scale

Figure 4 for The Many Dimensions of Truthfulness: Crowdsourcing Misinformation Assessments on a Multidimensional Scale

Abstract:Recent work has demonstrated the viability of using crowdsourcing as a tool for evaluating the truthfulness of public statements. Under certain conditions such as: (1) having a balanced set of workers with different backgrounds and cognitive abilities; (2) using an adequate set of mechanisms to control the quality of the collected data; and (3) using a coarse grained assessment scale, the crowd can provide reliable identification of fake news. However, fake news are a subtle matter: statements can be just biased ("cherrypicked"), imprecise, wrong, etc. and the unidimensional truth scale used in existing work cannot account for such differences. In this paper we propose a multidimensional notion of truthfulness and we ask the crowd workers to assess seven different dimensions of truthfulness selected based on existing literature: Correctness, Neutrality, Comprehensibility, Precision, Completeness, Speaker's Trustworthiness, and Informativeness. We deploy a set of quality control mechanisms to ensure that the thousands of assessments collected on 180 publicly available fact-checked statements distributed over two datasets are of adequate quality, including a custom search engine used by the crowd workers to find web pages supporting their truthfulness assessments. A comprehensive analysis of crowdsourced judgments shows that: (1) the crowdsourced assessments are reliable when compared to an expert-provided gold standard; (2) the proposed dimensions of truthfulness capture independent pieces of information; (3) the crowdsourcing task can be easily learned by the workers; and (4) the resulting assessments provide a useful basis for a more complete estimation of statement truthfulness.

* Information Processing & Management Information Processing & Management, Volume 58, Issue 6, November 2021, 102710
* 33 pages; Paper accepted at Information Processing & Management on July 28, 2021; IP&M Special Issue on Dis/Misinformation Mining from Social Media

Via

Access Paper or Ask Questions

Can the Crowd Judge Truthfulness? A Longitudinal Study on Recent Misinformation about COVID-19

Jul 25, 2021

Kevin Roitero, Michael Soprano, Beatrice Portelli, Massimiliano De Luise, Damiano Spina, Vincenzo Della Mea, Giuseppe Serra, Stefano Mizzaro, Gianluca Demartini

Figure 1 for Can the Crowd Judge Truthfulness? A Longitudinal Study on Recent Misinformation about COVID-19

Figure 2 for Can the Crowd Judge Truthfulness? A Longitudinal Study on Recent Misinformation about COVID-19

Figure 3 for Can the Crowd Judge Truthfulness? A Longitudinal Study on Recent Misinformation about COVID-19

Figure 4 for Can the Crowd Judge Truthfulness? A Longitudinal Study on Recent Misinformation about COVID-19

Abstract:Recently, the misinformation problem has been addressed with a crowdsourcing-based approach: to assess the truthfulness of a statement, instead of relying on a few experts, a crowd of non-expert is exploited. We study whether crowdsourcing is an effective and reliable method to assess truthfulness during a pandemic, targeting statements related to COVID-19, thus addressing (mis)information that is both related to a sensitive and personal issue and very recent as compared to when the judgment is done. In our experiments, crowd workers are asked to assess the truthfulness of statements, and to provide evidence for the assessments. Besides showing that the crowd is able to accurately judge the truthfulness of the statements, we report results on workers behavior, agreement among workers, effect of aggregation functions, of scales transformations, and of workers background and bias. We perform a longitudinal study by re-launching the task multiple times with both novice and experienced workers, deriving important insights on how the behavior and quality change over time. Our results show that: workers are able to detect and objectively categorize online (mis)information related to COVID-19; both crowdsourced and expert judgments can be transformed and aggregated to improve quality; worker background and other signals (e.g., source of information, behavior) impact the quality of the data. The longitudinal study demonstrates that the time-span has a major effect on the quality of the judgments, for both novice and experienced workers. Finally, we provide an extensive failure analysis of the statements misjudged by the crowd-workers.

* 31 pages; Preprint of an article accepted in Personal and Ubiquitous Computing (Special Issue on Intelligent Systems for Tackling Online Harms). arXiv admin note: substantial text overlap with arXiv:2008.05701

Via

Access Paper or Ask Questions

The COVID-19 Infodemic: Can the Crowd Judge Recent Misinformation Objectively?

Aug 13, 2020

Kevin Roitero, Michael Soprano, Beatrice Portelli, Damiano Spina, Vincenzo Della Mea, Giuseppe Serra, Stefano Mizzaro, Gianluca Demartini

Figure 1 for The COVID-19 Infodemic: Can the Crowd Judge Recent Misinformation Objectively?

Figure 2 for The COVID-19 Infodemic: Can the Crowd Judge Recent Misinformation Objectively?

Figure 3 for The COVID-19 Infodemic: Can the Crowd Judge Recent Misinformation Objectively?

Figure 4 for The COVID-19 Infodemic: Can the Crowd Judge Recent Misinformation Objectively?

Abstract:Misinformation is an ever increasing problem that is difficult to solve for the research community and has a negative impact on the society at large. Very recently, the problem has been addressed with a crowdsourcing-based approach to scale up labeling efforts: to assess the truthfulness of a statement, instead of relying on a few experts, a crowd of (non-expert) judges is exploited. We follow the same approach to study whether crowdsourcing is an effective and reliable method to assess statements truthfulness during a pandemic. We specifically target statements related to the COVID-19 health emergency, that is still ongoing at the time of the study and has arguably caused an increase of the amount of misinformation that is spreading online (a phenomenon for which the term "infodemic" has been used). By doing so, we are able to address (mis)information that is both related to a sensitive and personal issue like health and very recent as compared to when the judgment is done: two issues that have not been analyzed in related work. In our experiment, crowd workers are asked to assess the truthfulness of statements, as well as to provide evidence for the assessments as a URL and a text justification. Besides showing that the crowd is able to accurately judge the truthfulness of the statements, we also report results on many different aspects, including: agreement among workers, the effect of different aggregation functions, of scales transformations, and of workers background / bias. We also analyze workers behavior, in terms of queries submitted, URLs found / selected, text justifications, and other behavioral data like clicks and mouse actions collected by means of an ad hoc logger.

* 10 pages; Preprint of the full paper accepted at CIKM 2020

Via

Access Paper or Ask Questions