Abstract:In this paper, we present results of an auditing study performed over YouTube aimed at investigating how fast a user can get into a misinformation filter bubble, but also what it takes to "burst the bubble", i.e., revert the bubble enclosure. We employ a sock puppet audit methodology, in which pre-programmed agents (acting as YouTube users) delve into misinformation filter bubbles by watching misinformation promoting content. Then they try to burst the bubbles and reach more balanced recommendations by watching misinformation debunking content. We record search results, home page results, and recommendations for the watched videos. Overall, we recorded 17,405 unique videos, out of which we manually annotated 2,914 for the presence of misinformation. The labeled data was used to train a machine learning model classifying videos into three classes (promoting, debunking, neutral) with the accuracy of 0.82. We use the trained model to classify the remaining videos that would not be feasible to annotate manually. Using both the manually and automatically annotated data, we observe the misinformation bubble dynamics for a range of audited topics. Our key finding is that even though filter bubbles do not appear in some situations, when they do, it is possible to burst them by watching misinformation debunking content (albeit it manifests differently from topic to topic). We also observe a sudden decrease of misinformation filter bubble effect when misinformation debunking videos are watched after misinformation promoting videos, suggesting a strong contextuality of recommendations. Finally, when comparing our results with a previous similar study, we do not observe significant improvements in the overall quantity of recommended misinformation content.
Abstract:False information has a significant negative influence on individuals as well as on the whole society. Especially in the current COVID-19 era, we witness an unprecedented growth of medical misinformation. To help tackle this problem with machine learning approaches, we are publishing a feature-rich dataset of approx. 317k medical news articles/blogs and 3.5k fact-checked claims. It also contains 573 manually and more than 51k automatically labelled mappings between claims and articles. Mappings consist of claim presence, i.e., whether a claim is contained in a given article, and article stance towards the claim. We provide several baselines for these two tasks and evaluate them on the manually labelled part of the dataset. The dataset enables a number of additional tasks related to medical misinformation, such as misinformation characterisation studies or studies of misinformation diffusion between sources.
Abstract:The negative effects of misinformation filter bubbles in adaptive systems have been known to researchers for some time. Several studies investigated, most prominently on YouTube, how fast a user can get into a misinformation filter bubble simply by selecting wrong choices from the items offered. Yet, no studies so far have investigated what it takes to burst the bubble, i.e., revert the bubble enclosure. We present a study in which pre-programmed agents (acting as YouTube users) delve into misinformation filter bubbles by watching misinformation promoting content (for various topics). Then, by watching misinformation debunking content, the agents try to burst the bubbles and reach more balanced recommendation mixes. We recorded the search results and recommendations, which the agents encountered, and analyzed them for the presence of misinformation. Our key finding is that bursting of a filter bubble is possible, albeit it manifests differently from topic to topic. Moreover, we observe that filter bubbles do not truly appear in some situations. We also draw a direct comparison with a previous study. Sadly, we did not find much improvements in misinformation occurrences, despite recent pledges by YouTube.
Abstract:The online spreading of fake news is a major issue threatening entire societies. Much of this spreading is enabled by new media formats, namely social networks and online media sites. Researchers and practitioners have been trying to answer this by characterizing the fake news and devising automated methods for detecting them. The detection methods had so far only limited success, mostly due to the complexity of the news content and context and lack of properly annotated datasets. One possible way to boost the efficiency of automated misinformation detection methods, is to imitate the detection work of humans. It is also important to understand the news consumption behavior of online users. In this paper, we present an eye-tracking study, in which we let 44 lay participants to casually read through a social media feed containing posts with news articles, some of which were fake. In a second run, we asked the participants to decide on the truthfulness of these articles. We also describe a follow-up qualitative study with a similar scenario but this time with 7 expert fake news annotators. We present the description of both studies, characteristics of the resulting dataset (which we hereby publish) and several findings.