Ahmad Diab

Classifying Conspiratorial Narratives At Scale: False Alarms and Erroneous Connections

Mar 29, 2024

Ahmad Diab, Rr. Nefriana, Yu-Ru Lin

Abstract: Online discussions frequently involve conspiracy theories, which can contribute to the proliferation of belief in them. However, not all discussions surrounding conspiracy theories promote them, as some are intended to debunk them. Existing research has relied on simple proxies or focused on a constrained set of signals to identify conspiracy theories, which limits our understanding of conspiratorial discussions across different topics and online communities. This work establishes a general scheme for classifying discussions related to conspiracy theories based on authors' perspectives on the conspiracy belief, which can be expressed explicitly through narrative elements, such as the agent, action, or objective, or implicitly through references to known theories, such as chemtrails or the New World Order. We leverage human-labeled ground truth to train a BERT-based model for classifying online conspiracy theories (CTs), which we then compare to the Generative Pre-trained Transformer (GPT) for detecting online conspiratorial content. Despite GPT's known strengths in expressiveness and contextual understanding, our study revealed significant flaws in its logical reasoning, while also demonstrating comparable strengths from our classifiers. We present the first large-scale classification study using posts from the most active conspiracy-related Reddit forums and find that only one-third of the posts are classified as positive. This research sheds light on the potential applications of large language models in tasks demanding nuanced contextual comprehension.

* 12 pages, 6 tables, 1 figure, conference ICWSM_24

A Weakly Supervised Classifier and Dataset of White Supremacist Language

Jun 27, 2023

Michael Miller Yoder, Ahmad Diab, David West Brown, Kathleen M. Carley

Abstract: We present a dataset and classifier for detecting the language of white supremacist extremism, a growing issue in online hate speech. Our weakly supervised classifier is trained on large datasets of text from explicitly white supremacist domains paired with neutral and anti-racist data from similar domains. We demonstrate that this approach improves generalization performance to new domains. Incorporating anti-racist texts as counterexamples to white supremacist language mitigates bias.

* ACL 2023 short

Domain-robust VQA with diverse datasets and methods but no target labels

Mar 29, 2021

Mingda Zhang, Tristan Maidment, Ahmad Diab, Adriana Kovashka, Rebecca Hwa

Abstract: The observation that computer vision methods overfit to dataset specifics has inspired diverse attempts to make object recognition models robust to domain shifts. However, similar work on domain-robust visual question answering methods is very limited. Domain adaptation for VQA differs from adaptation for object recognition due to additional complexity: VQA models handle multimodal inputs, methods contain multiple steps with diverse modules resulting in complex optimization, and answer spaces in different datasets are vastly different. To tackle these challenges, we first quantify domain shifts between popular VQA datasets, in both visual and textual space. To disentangle shifts between datasets arising from different modalities, we also construct synthetic shifts in the image and question domains separately. Second, we test the robustness of different families of VQA methods (classic two-stream, transformer, and neuro-symbolic methods) to these shifts. Third, we test the applicability of existing domain adaptation methods and devise a new one to bridge VQA domain gaps, adjusted to specific VQA models. To emulate the setting of real-world generalization, we focus on unsupervised domain adaptation and the open-ended classification task formulation.

* To appear in CVPR 2021

Darknet and Deepnet Mining for Proactive Cybersecurity Threat Intelligence

Jul 28, 2016

Eric Nunes, Ahmad Diab, Andrew Gunn, Ericsson Marin, Vineet Mishra, Vivin Paliath, John Robertson, Jana Shakarian, Amanda Thart, Paulo Shakarian

Abstract: In this paper, we present an operational system for cyber threat intelligence gathering from various social platforms on the Internet, particularly sites on the darknet and deepnet. We focus our attention on collecting information from hacker forum discussions and marketplaces offering products and services focusing on malicious hacking. We have developed an operational system for obtaining information from these sites for the purposes of identifying emerging cyber threats. Currently, this system collects on average 305 high-quality cyber threat warnings each week. These threat warnings include information on newly developed malware and exploits that have not yet been deployed in a cyber-attack. This provides a significant service to cyber-defenders. The system is significantly augmented through the use of various data mining and machine learning techniques. With the use of machine learning models, we are able to recall 92% of products in marketplaces and 80% of discussions on forums relating to malicious hacking with high precision. We perform preliminary analysis on the data collected, demonstrating its application to aid a security expert for better threat analysis.

* 6-page paper accepted for presentation at IEEE Intelligence and Security Informatics 2016, Tucson, Arizona, USA, September 27-30, 2016
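
The recall and precision figures quoted in the abstract above are standard binary-classification metrics. As a minimal sketch of how such numbers are computed, the Python below uses made-up toy labels (1 = relevant to malicious hacking, 0 = not relevant); the function name `precision_recall` and the data are illustrative assumptions and do not come from the paper or its system.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 marks the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # relevant items correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # irrelevant items wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # relevant items that were missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy, hypothetical labels: four relevant listings and two irrelevant ones.
toy_true = [1, 1, 1, 1, 0, 0]
toy_pred = [1, 1, 1, 0, 1, 0]
toy_precision, toy_recall = precision_recall(toy_true, toy_pred)
print(f"precision={toy_precision:.2f} recall={toy_recall:.2f}")
```

Recall measures the share of truly relevant items the model retrieves, while precision measures the share of flagged items that are truly relevant, so a recall figure is only informative when reported alongside precision.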

Product Offerings in Malicious Hacker Markets

Jul 26, 2016

Ericsson Marin, Ahmad Diab, Paulo Shakarian

Abstract: Marketplaces specializing in malicious hacking products, including malware and exploits, have recently become more prominent on the darkweb and deepweb. We scrape 17 such sites and collect information about such products in a unified database schema. Using a combination of manual labeling and unsupervised clustering, we examine a corpus of products in order to understand their various categories and how they become specialized with respect to vendor and marketplace. This initial study presents how we effectively applied unsupervised techniques to this data as well as the types of insights we gained on various categories of malicious hacking products.

* 3 pages, 1 figure, 3 tables. Accepted for publication in IEEE Intelligence and Security Informatics (ISI2016)
