Abstract:An increasing number of individuals are willing to post states and opinions in social media, which has become a valuable data resource for studying human health. Furthermore, social media has been a crucial research point for healthcare now. This paper outlines the methods in our participation in the #SMM4H 2023 Shared Tasks, including data preprocessing, continual pre-training and fine-tuned optimization strategies. Especially for the Named Entity Recognition (NER) task, we utilize the model architecture named W2NER that effectively enhances the model generalization ability. Our method achieved first place in the Task 3. This paper has been peer-reviewed and accepted for presentation at the #SMM4H 2023 Workshop.
Abstract:Public opinion is a crucial factor in shaping political decision-making. Nowadays, social media has become an essential platform for individuals to engage in political discussions and express their political views, presenting researchers with an invaluable resource for analyzing public opinion. In this paper, we focus on the 2020 US presidential election and create a large-scale dataset from Twitter. To detect political opinions in tweets, we build a user-tweet bipartite graph based on users' posting and retweeting behaviors and convert the task into a Graph Neural Network (GNN)-based node classification problem. Then, we introduce a novel skip aggregation mechanism that makes tweet nodes aggregate information from second-order neighbors, which are also tweet nodes due to the graph's bipartite nature, effectively leveraging user behavioral information. The experimental results show that our proposed model significantly outperforms several competitive baselines. Further analyses demonstrate the significance of user behavioral information and the effectiveness of skip aggregation.
Abstract:Given the development and abundance of social media, studying the stance of social media users is a challenging and pressing issue. Social media users express their stance by posting tweets and retweeting. Therefore, the homogeneous relationship between users and the heterogeneous relationship between users and tweets are relevant for the stance detection task. Recently, graph neural networks (GNNs) have developed rapidly and have been applied to social media research. In this paper, we crawl a large-scale dataset of the 2020 US presidential election and automatically label all users by manually tagged hashtags. Subsequently, we propose a bipartite graph neural network model, DoubleH, which aims to better utilize homogeneous and heterogeneous information in user stance detection tasks. Specifically, we first construct a bipartite graph based on posting and retweeting relations for two kinds of nodes, including users and tweets. We then iteratively update the node's representation by extracting and separately processing heterogeneous and homogeneous information in the node's neighbors. Finally, the representations of user nodes are used for user stance classification. Experimental results show that DoubleH outperforms the state-of-the-art methods on popular benchmarks. Further analysis illustrates the model's utilization of information and demonstrates stability and efficiency at different numbers of layers.