Abstract:Climate change is one of the most pressing challenges of the 21st century, sparking widespread discourse across social media platforms. Activists, policymakers, and researchers seek to understand public sentiment and narratives while access to social media data has become increasingly restricted in the post-API era. In this study, we analyze a dataset of climate change-related tweets from X (formerly Twitter) shared in 2019, containing 730k tweets along with the shared images. Our approach integrates statistical analysis, image classification, object detection, and sentiment analysis to explore visual narratives in climate discourse. Additionally, we introduce a graphical user interface (GUI) to facilitate interactive data exploration. Our findings reveal key themes in climate communication, highlight sentiment divergence between images and text, and underscore the strengths and limitations of foundation models in analyzing social media imagery. By releasing our code and tools, we aim to support future research on the intersection of climate change, social media, and computer vision.
Abstract:Visual framing analysis is a key method in social sciences for determining common themes and concepts in a given discourse. To reduce manual effort, image clustering can significantly speed up the annotation process. In this work, we phrase the clustering task as a Minimum Cost Multicut Problem [MP]. Solutions to the MP have been shown to provide clusterings that maximize the posterior probability, solely from provided local, pairwise probabilities of two images belonging to the same cluster. We discuss the efficacy of numerous embedding spaces to detect visual frames and show its superiority over other clustering methods. To this end, we employ the climate change dataset \textit{ClimateTV} which contains images commonly used for visual frame analysis. For broad visual frames, DINOv2 is a suitable embedding space, while ConvNeXt V2 returns a larger number of clusters which contain fine-grain differences, i.e. speech and protest. Our insights into embedding space differences in combination with the optimal clustering - by definition - advances automated visual frame detection. Our code can be found at https://github.com/KathPra/MP4VisualFrameDetection.