Abstract:Community Question-Answering (CQA) forums have revolutionized how people seek information, especially those related to their healthcare needs, placing their trust in the collective wisdom of the public. However, there can be several answers in response to a single query, which makes it hard to grasp the key information related to the specific health concern. Typically, CQA forums feature a single top-voted answer as a representative summary for each query. However, a single answer overlooks the alternative solutions and other information frequently offered in other responses. Our research focuses on aspect-based summarization of health answers to address this limitation. Summarization of responses under different aspects such as suggestions, information, personal experiences, and questions can enhance the usability of the platforms. We formalize a multi-stage annotation guideline and contribute a unique dataset comprising aspect-based human-written health answer summaries. We build an automated multi-faceted answer summarization pipeline with this dataset based on task-specific fine-tuning of several state-of-the-art models. The pipeline leverages question similarity to retrieve relevant answer sentences, subsequently classifying them into the appropriate aspect type. Following this, we employ several recent abstractive summarization models to generate aspect-based summaries. Finally, we present a comprehensive human analysis and find that our summaries rank high in capturing relevant content and a wide range of solutions.
Abstract:While exposure to diverse viewpoints may reduce polarization, it can also have a backfire effect and exacerbate polarization when the discussion is adversarial. Here, we examine the question whether intergroup interactions around important events affect polarization between majority and minority groups in social networks. We compile data on the religious identity of nearly 700,000 Indian Twitter users engaging in COVID-19-related discourse during 2020. We introduce a new measure for an individual's group conformity based on contextualized embeddings of tweet text, which helps us assess polarization between religious groups. We then use a meta-learning framework to examine heterogeneous treatment effects of intergroup interactions on an individual's group conformity in the light of communal, political, and socio-economic events. We find that for political and social events, intergroup interactions reduce polarization. This decline is weaker for individuals at the extreme who already exhibit high conformity to their group. In contrast, during communal events, intergroup interactions can increase group conformity. Finally, we decompose the differential effects across religious groups in terms of emotions and topics of discussion. The results show that the dynamics of religious polarization are sensitive to the context and have important implications for understanding the role of intergroup interactions.
Abstract:Demographic inference from text has received a surge of attention in the field of natural language processing in the last decade. In this paper, we use personal names to infer religion in South Asia - where religion is a salient social division, and yet, disaggregated data on it remains scarce. Existing work predicts religion using dictionary based method, and therefore, can not classify unseen names. We use character based models which learn character patterns and, therefore, can classify unseen names as well with high accuracy. These models are also much faster and can easily be scaled to large data sets. We improve our classifier by combining the name of an individual with that of their parent/spouse and achieve remarkably high accuracy. Finally, we trace the classification decisions of a convolutional neural network model using layer-wise relevance propagation which can explain the predictions of complex non-linear classifiers and circumvent their purported black box nature. We show how character patterns learned by the classifier are rooted in the linguistic origins of names.