Abstract: In this paper, we describe our system used in the shared task for fine-grained propaganda analysis at the sentence level. Despite the challenging nature of the task, our pretrained BERT model (team YMJA), fine-tuned on the training dataset provided by the shared task, scored 0.62 F1 on the test set and ranked third among the 25 teams that participated in the contest. We present a set of illustrative experiments to better understand the performance of our BERT model on this shared task. Further, we look beyond the given dataset for false-positive cases that our system is likely to produce. We show that, despite its high performance on the given test set, our system tends to classify opinion pieces as propaganda and cannot distinguish quotations of propaganda speech from actual use of propaganda techniques.
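The abstract does not describe the implementation; a minimal sketch of sentence-level fine-tuning, assuming the Hugging Face transformers library and standard hyperparameters (the shared-task system's actual setup may differ), could look like the following.

# Minimal sketch: fine-tuning BERT as a binary sentence classifier
# (propaganda vs. non-propaganda). The model checkpoint, learning rate and
# the use of the `transformers` API are assumptions, not the authors' exact setup.
import torch
from transformers import BertTokenizerFast, BertForSequenceClassification

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

sentences = ["They want to destroy everything we stand for!"]
labels = torch.tensor([1])  # 1 = propaganda, 0 = non-propaganda

batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
outputs = model(**batch, labels=labels)  # returns cross-entropy loss and logits
outputs.loss.backward()
optimizer.step()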
Abstract: We present a corpus that encompasses the complete history of conversations between contributors to Wikipedia, one of the largest online collaborative communities. By recording the intermediate states of conversations---including not only comments and replies, but also their modifications, deletions and restorations---this data offers an unprecedented view of online conversation. This level of detail supports new research questions pertaining to the process (and challenges) of large-scale online collaboration. We illustrate the corpus's potential with two case studies that highlight new perspectives on earlier work. First, we explore how a person's conversational behavior depends on how they relate to the discussion's venue. Second, we show that community moderation of toxic behavior happens at a higher rate than previously estimated. Finally, the reconstruction framework is designed to be language-agnostic, and we show that it can extract high-quality conversational data in both Chinese and English.
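The abstract does not spell out the data schema; one hypothetical way to represent a single intermediate conversation state is sketched below. The field names and types are illustrative assumptions, not the corpus's actual format.

# Hypothetical sketch of one record in a conversation-history corpus; the
# schema is an assumption for illustration, not the dataset's real layout.
from dataclasses import dataclass
from typing import Optional

@dataclass
class ConversationAction:
    conversation_id: str      # talk-page discussion this action belongs to
    action_id: str            # unique id of this intermediate state
    action_type: str          # "addition", "modification", "deletion" or "restoration"
    parent_id: Optional[str]  # comment being replied to or modified, if any
    user: str                 # contributor performing the action
    timestamp: float          # when the action occurred
    text: str                 # comment text after the action is applied

history = [
    ConversationAction("conv1", "a1", "addition", None, "EditorA", 1.0,
                       "I think we should merge these pages."),
    ConversationAction("conv1", "a2", "deletion", "a1", "EditorB", 2.0, ""),
    ConversationAction("conv1", "a3", "restoration", "a1", "EditorA", 3.0,
                       "I think we should merge these pages."),
]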
Abstract: Federated learning enables thousands of participants to construct a deep learning model without sharing their private training data with each other. For example, multiple smartphones can jointly train a next-word predictor for keyboards without revealing what individual users type. We demonstrate that any participant in federated learning can introduce hidden backdoor functionality into the joint global model, e.g., to ensure that an image classifier assigns an attacker-chosen label to images with certain features, or that a word predictor completes certain sentences with an attacker-chosen word. We design and evaluate a new model-poisoning methodology based on model replacement. An attacker selected in a single round of federated learning can cause the global model to immediately reach 100% accuracy on the backdoor task. We evaluate the attack under different assumptions for the standard federated-learning tasks and show that it greatly outperforms data poisoning. Our generic constrain-and-scale technique also evades anomaly-detection-based defenses by incorporating the evasion into the attacker's loss function during training.
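As a rough illustration of the model-replacement idea in a plain FedAvg setting (the variable names and aggregation rule below are assumptions for exposition, not the paper's exact formulation):

# Sketch of model replacement: the attacker scales up its update so that,
# after server-side averaging, the global model is approximately replaced by
# the attacker's backdoored model. Illustrative only.
import torch

def replace_global_model(global_weights, backdoored_weights, n_participants, server_lr=1.0):
    """Return the model the attacker submits in one round.

    With FedAvg, the server computes
        G_{t+1} = G_t + (server_lr / n_participants) * sum_i (L_i - G_t).
    Scaling the attacker's update by n_participants / server_lr makes the
    aggregate approximately equal to the backdoored model, assuming the
    benign updates roughly cancel out near convergence.
    """
    scale = n_participants / server_lr
    return {
        name: global_weights[name] + scale * (backdoored_weights[name] - global_weights[name])
        for name in global_weights
    }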
Abstract: One of the main challenges online social systems face is the prevalence of antisocial behavior, such as harassment and personal attacks. In this work, we introduce the task of predicting from the very start of a conversation whether it will get out of hand. As opposed to detecting undesirable behavior after the fact, this task aims to enable early, actionable prediction at a time when the conversation might still be salvaged. To this end, we develop a framework for capturing pragmatic devices---such as politeness strategies and rhetorical prompts---used to start a conversation, and analyze their relation to its future trajectory. Applying this framework in a controlled setting, we demonstrate the feasibility of detecting early warning signs of antisocial behavior in online discussions.
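A toy sketch of the general approach (turning pragmatic devices in a conversation's first comment into features for a standard classifier) might look as follows; the feature set and the logistic-regression choice are illustrative assumptions, not the authors' exact framework.

# Toy sketch: counts of pragmatic devices in the first comment as features
# for predicting later derailment. Features, labels and classifier choice
# are assumptions for illustration.
import re
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

def pragmatic_features(first_comment: str) -> dict:
    text = first_comment.lower()
    return {
        "gratitude": len(re.findall(r"\bthank", text)),      # politeness strategy
        "please": len(re.findall(r"\bplease\b", text)),      # politeness strategy
        "second_person": len(re.findall(r"\byou\b", text)),  # direct address
        "question": first_comment.count("?"),                # rhetorical-prompt proxy
    }

comments = ["Thanks for your edit, could you please add a source?",
            "Why do you keep reverting my changes?"]
labels = [0, 1]  # 1 = conversation later derailed (toy labels)

vectorizer = DictVectorizer()
X = vectorizer.fit_transform(pragmatic_features(c) for c in comments)
classifier = LogisticRegression().fit(X, labels)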