Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Stefano Cresci

Geo-Semantic-Parsing: AI-powered geoparsing by traversing semantic knowledge graphs

Mar 03, 2025

Leonardo Nizzoli, Marco Avvenuti, Maurizio Tesconi, Stefano Cresci

Abstract:Online social networks convey rich information about geospatial facets of reality. However in most cases, geographic information is not explicit and structured, thus preventing its exploitation in real-time applications. We address this limitation by introducing a novel geoparsing and geotagging technique called Geo-Semantic-Parsing (GSP). GSP identifies location references in free text and extracts the corresponding geographic coordinates. To reach this goal, we employ a semantic annotator to identify relevant portions of the input text and to link them to the corresponding entity in a knowledge graph. Then, we devise and experiment with several efficient strategies for traversing the knowledge graph, thus expanding the available set of information for the geoparsing task. Finally, we exploit all available information for learning a regression model that selects the best entity with which to geotag the input text. We evaluate GSP on a well-known reference dataset including almost 10k event-related tweets, achieving $F1=0.66$. We extensively compare our results with those of 2 baselines and 3 state-of-the-art geoparsing techniques, achieving the best performance. On the same dataset, competitors obtain $F1 \leq 0.55$. We conclude by providing in-depth analyses of our results, showing that the overall superior performance of GSP is mainly due to a large improvement in recall, with respect to existing techniques.

* Decision Support Systems 136:113346, 2020
* Postprint of the article published in the Decision Support Systems journal. Please, cite accordingly

Via

Access Paper or Ask Questions

Contextualized Counterspeech: Strategies for Adaptation, Personalization, and Evaluation

Dec 10, 2024

Lorenzo Cima, Alessio Miaschi, Amaury Trujillo, Marco Avvenuti, Felice Dell'Orletta, Stefano Cresci

Abstract:AI-generated counterspeech offers a promising and scalable strategy to curb online toxicity through direct replies that promote civil discourse. However, current counterspeech is one-size-fits-all, lacking adaptation to the moderation context and the users involved. We propose and evaluate multiple strategies for generating tailored counterspeech that is adapted to the moderation context and personalized for the moderated user. We instruct an LLaMA2-13B model to generate counterspeech, experimenting with various configurations based on different contextual information and fine-tuning strategies. We identify the configurations that generate persuasive counterspeech through a combination of quantitative indicators and human evaluations collected via a pre-registered mixed-design crowdsourcing experiment. Results show that contextualized counterspeech can significantly outperform state-of-the-art generic counterspeech in adequacy and persuasiveness, without compromising other characteristics. Our findings also reveal a poor correlation between quantitative indicators and human evaluations, suggesting that these methods assess different aspects and highlighting the need for nuanced evaluation methodologies. The effectiveness of contextualized AI-generated counterspeech and the divergence between human and algorithmic evaluations underscore the importance of increased human-AI collaboration in content moderation.

Via

Access Paper or Ask Questions

Human and LLM Biases in Hate Speech Annotations: A Socio-Demographic Analysis of Annotators and Targets

Oct 10, 2024

Tommaso Giorgi, Lorenzo Cima, Tiziano Fagni, Marco Avvenuti, Stefano Cresci

Figure 1 for Human and LLM Biases in Hate Speech Annotations: A Socio-Demographic Analysis of Annotators and Targets

Figure 2 for Human and LLM Biases in Hate Speech Annotations: A Socio-Demographic Analysis of Annotators and Targets

Figure 3 for Human and LLM Biases in Hate Speech Annotations: A Socio-Demographic Analysis of Annotators and Targets

Figure 4 for Human and LLM Biases in Hate Speech Annotations: A Socio-Demographic Analysis of Annotators and Targets

Abstract:The rise of online platforms exacerbated the spread of hate speech, demanding scalable and effective detection. However, the accuracy of hate speech detection systems heavily relies on human-labeled data, which is inherently susceptible to biases. While previous work has examined the issue, the interplay between the characteristics of the annotator and those of the target of the hate are still unexplored. We fill this gap by leveraging an extensive dataset with rich socio-demographic information of both annotators and targets, uncovering how human biases manifest in relation to the target's attributes. Our analysis surfaces the presence of widespread biases, which we quantitatively describe and characterize based on their intensity and prevalence, revealing marked differences. Furthermore, we compare human biases with those exhibited by persona-based LLMs. Our findings indicate that while persona-based LLMs do exhibit biases, these differ significantly from those of human annotators. Overall, our work offers new and nuanced results on human biases in hate speech annotations, as well as fresh insights into the design of AI-driven hate speech detection systems.

Via

Access Paper or Ask Questions

Detection and Characterization of Coordinated Online Behavior: A Survey

Aug 02, 2024

Lorenzo Mannocci, Michele Mazza, Anna Monreale, Maurizio Tesconi, Stefano Cresci

Figure 1 for Detection and Characterization of Coordinated Online Behavior: A Survey

Figure 2 for Detection and Characterization of Coordinated Online Behavior: A Survey

Figure 3 for Detection and Characterization of Coordinated Online Behavior: A Survey

Figure 4 for Detection and Characterization of Coordinated Online Behavior: A Survey

Abstract:Coordination is a fundamental aspect of life. The advent of social media has made it integral also to online human interactions, such as those that characterize thriving online communities and social movements. At the same time, coordination is also core to effective disinformation, manipulation, and hate campaigns. This survey collects, categorizes, and critically discusses the body of work produced as a result of the growing interest on coordinated online behavior. We reconcile industry and academic definitions, propose a comprehensive framework to study coordinated online behavior, and review and critically discuss the existing detection and characterization methods. Our analysis identifies open challenges and promising directions of research, serving as a guide for scholars, practitioners, and policymakers in understanding and addressing the complexities inherent to online coordination.

Via

Access Paper or Ask Questions

The DSA Transparency Database: Auditing Self-reported Moderation Actions by Social Media

Dec 16, 2023

Amaury Trujillo, Tiziano Fagni, Stefano Cresci

Abstract:Since September 2023, the Digital Services Act (DSA) obliges large online platforms to submit detailed data on each moderation action they take within the European Union (EU) to the DSA Transparency Database. From its inception, this centralized database has sparked scholarly interest as an unprecedented and potentially unique trove of data on real-world online moderation. Here, we thoroughly analyze all 195.61M records submitted by the eight largest social media platforms in the EU during the first 60 days of the database. Specifically, we conduct a platform-wise comparative study of their: volume of moderation actions, grounds for decision, types of applied restrictions, types of moderated content, timeliness in undertaking and submitting moderation actions, and use of automation. Furthermore, we systematically cross-check the contents of the database with the platforms' own transparency reports. Our analyses reveal that (i) the platforms adhered only in part to the philosophy and structure of the database, (ii) the structure of the database is partially inadequate for the platforms' reporting needs, (iii) the platforms exhibited substantial differences in their moderation actions, (iv) a remarkable fraction of the database data is inconsistent, (v) the platform X (formerly Twitter) presents the most inconsistencies. Our findings have far-reaching implications for policymakers and scholars across diverse disciplines. They offer guidance for future regulations that cater to the reporting needs of online platforms in general, but also highlight opportunities to improve and refine the database itself.

Via

Access Paper or Ask Questions

Demystifying Misconceptions in Social Bots Research

Mar 30, 2023

Stefano Cresci, Roberto Di Pietro, Angelo Spognardi, Maurizio Tesconi, Marinella Petrocchi

Abstract:The science of social bots seeks knowledge and solutions to one of the most debated forms of online misinformation. Yet, social bots research is plagued by widespread biases, hyped results, and misconceptions that set the stage for ambiguities, unrealistic expectations, and seemingly irreconcilable findings. Overcoming such issues is instrumental towards ensuring reliable solutions and reaffirming the validity of the scientific method. In this contribution we revise some recent results in social bots research, highlighting and correcting factual errors as well as methodological and conceptual issues. More importantly, we demystify common misconceptions, addressing fundamental points on how social bots research is discussed. Our analysis surfaces the need to discuss misinformation research in a rigorous, unbiased, and responsible way. This article bolsters such effort by identifying and refuting common fallacious arguments used by both proponents and opponents of social bots research as well as providing indications on the correct methodologies and sound directions for future research in the field.

Via

Access Paper or Ask Questions

Temporal Dynamics of Coordinated Online Behavior: Stability, Archetypes, and Influence

Jan 17, 2023

Serena Tardelli, Leonardo Nizzoli, Maurizio Tesconi, Mauro Conti, Preslav Nakov, Giovanni Da San Martino, Stefano Cresci

Figure 1 for Temporal Dynamics of Coordinated Online Behavior: Stability, Archetypes, and Influence

Figure 2 for Temporal Dynamics of Coordinated Online Behavior: Stability, Archetypes, and Influence

Figure 3 for Temporal Dynamics of Coordinated Online Behavior: Stability, Archetypes, and Influence

Figure 4 for Temporal Dynamics of Coordinated Online Behavior: Stability, Archetypes, and Influence

Abstract:Large-scale online campaigns, malicious or otherwise, require a significant degree of coordination among participants, which sparked interest in the study of coordinated online behavior. State-of-the-art methods for detecting coordinated behavior perform static analyses, disregarding the temporal dynamics of coordination. Here, we carry out the first dynamic analysis of coordinated behavior. To reach our goal we build a multiplex temporal network and we perform dynamic community detection to identify groups of users that exhibited coordinated behaviors in time. Thanks to our novel approach we find that: (i) coordinated communities feature variable degrees of temporal instability; (ii) dynamic analyses are needed to account for such instability, and results of static analyses can be unreliable and scarcely representative of unstable communities; (iii) some users exhibit distinct archetypal behaviors that have important practical implications; (iv) content and network characteristics contribute to explaining why users leave and join coordinated communities. Our results demonstrate the advantages of dynamic analyses and open up new directions of research on the unfolding of online debates, on the strategies of coordinated communities, and on the patterns of online influence.

Via

Access Paper or Ask Questions

MulBot: Unsupervised Bot Detection Based on Multivariate Time Series

Sep 21, 2022

Lorenzo Mannocci, Stefano Cresci, Anna Monreale, Athina Vakali, Maurizio Tesconi

Figure 1 for MulBot: Unsupervised Bot Detection Based on Multivariate Time Series

Figure 2 for MulBot: Unsupervised Bot Detection Based on Multivariate Time Series

Figure 3 for MulBot: Unsupervised Bot Detection Based on Multivariate Time Series

Figure 4 for MulBot: Unsupervised Bot Detection Based on Multivariate Time Series

Abstract:Online social networks are actively involved in the removal of malicious social bots due to their role in the spread of low quality information. However, most of the existing bot detectors are supervised classifiers incapable of capturing the evolving behavior of sophisticated bots. Here we propose MulBot, an unsupervised bot detector based on multivariate time series (MTS). For the first time, we exploit multidimensional temporal features extracted from user timelines. We manage the multidimensionality with an LSTM autoencoder, which projects the MTS in a suitable latent space. Then, we perform a clustering step on this encoded representation to identify dense groups of very similar users -- a known sign of automation. Finally, we perform a binary classification task achieving f1-score $= 0.99$, outperforming state-of-the-art methods (f1-score $\le 0.97$). Not only does MulBot achieve excellent results in the binary classification task, but we also demonstrate its strengths in a novel and practically-relevant task: detecting and separating different botnets. In this multi-class classification task we achieve f1-score $= 0.96$. We conclude by estimating the importance of the different features used in our model and by evaluating MulBot's capability to generalize to new unseen bots, thus proposing a solution to the generalization deficiencies of supervised bot detectors.

Via

Access Paper or Ask Questions

Personalized Interventions for Online Moderation

May 19, 2022

Stefano Cresci, Amaury Trujillo, Tiziano Fagni

Figure 1 for Personalized Interventions for Online Moderation

Figure 2 for Personalized Interventions for Online Moderation

Abstract:Current online moderation follows a one-size-fits-all approach, where each intervention is applied in the same way to all users. This naive approach is challenged by established socio-behavioral theories and by recent empirical results that showed the limited effectiveness of such interventions. We propose a paradigm-shift in online moderation by moving towards a personalized and user-centered approach. Our multidisciplinary vision combines state-of-the-art theories and practices in diverse fields such as computer science, sociology and psychology, to design personalized moderation interventions (PMIs). In outlining the path leading to the next-generation of moderation interventions, we also discuss the most prominent challenges introduced by such a disruptive change.

* The 33rd ACM Conference on Hypertext and Social Media (HT '22)

Via

Access Paper or Ask Questions

Fine-Grained Prediction of Political Leaning on Social Media with Unsupervised Deep Learning

Feb 23, 2022

Tiziano Fagni, Stefano Cresci

Figure 1 for Fine-Grained Prediction of Political Leaning on Social Media with Unsupervised Deep Learning

Figure 2 for Fine-Grained Prediction of Political Leaning on Social Media with Unsupervised Deep Learning

Figure 3 for Fine-Grained Prediction of Political Leaning on Social Media with Unsupervised Deep Learning

Figure 4 for Fine-Grained Prediction of Political Leaning on Social Media with Unsupervised Deep Learning

Abstract:Predicting the political leaning of social media users is an increasingly popular task, given its usefulness for electoral forecasts, opinion dynamics models and for studying the political dimension of polarization and disinformation. Here, we propose a novel unsupervised technique for learning fine-grained political leaning from the textual content of social media posts. Our technique leverages a deep neural network for learning latent political ideologies in a representation learning task. Then, users are projected in a low-dimensional ideology space where they are subsequently clustered. The political leaning of a user is automatically derived from the cluster to which the user is assigned. We evaluated our technique in two challenging classification tasks and we compared it to baselines and other state-of-the-art approaches. Our technique obtains the best results among all unsupervised techniques, with micro F1 = 0.426 in the 8-class task and micro F1 = 0.772 in the 3-class task. Other than being interesting on their own, our results also pave the way for the development of new and better unsupervised approaches for the detection of fine-grained political leaning.

* Journal of Artificial Intelligence Research 73:633-672, 2022

Via

Access Paper or Ask Questions