Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Hemank Lamba

NLP for Social Good: A Survey of Challenges, Opportunities, and Responsible Deployment

May 28, 2025

Antonia Karamolegkou, Angana Borah, Eunjung Cho, Sagnik Ray Choudhury, Martina Galletti, Rajarshi Ghosh, Pranav Gupta, Oana Ignat, Priyanka Kargupta, Neema Kotonya(+22 more)

Abstract:Recent advancements in large language models (LLMs) have unlocked unprecedented possibilities across a range of applications. However, as a community, we believe that the field of Natural Language Processing (NLP) has a growing need to approach deployment with greater intentionality and responsibility. In alignment with the broader vision of AI for Social Good (Toma\v{s}ev et al., 2020), this paper examines the role of NLP in addressing pressing societal challenges. Through a cross-disciplinary analysis of social goals and emerging risks, we highlight promising research directions and outline challenges that must be addressed to ensure responsible and equitable progress in NLP4SG research.

Via

Access Paper or Ask Questions

CEHA: A Dataset of Conflict Events in the Horn of Africa

Dec 18, 2024

Rui Bai, Di Lu, Shihao Ran, Elizabeth Olson, Hemank Lamba, Aoife Cahill, Joel Tetreault, Alex Jaimes

Figure 1 for CEHA: A Dataset of Conflict Events in the Horn of Africa

Figure 2 for CEHA: A Dataset of Conflict Events in the Horn of Africa

Figure 3 for CEHA: A Dataset of Conflict Events in the Horn of Africa

Figure 4 for CEHA: A Dataset of Conflict Events in the Horn of Africa

Abstract:Natural Language Processing (NLP) of news articles can play an important role in understanding the dynamics and causes of violent conflict. Despite the availability of datasets categorizing various conflict events, the existing labels often do not cover all of the fine-grained violent conflict event types relevant to areas like the Horn of Africa. In this paper, we introduce a new benchmark dataset Conflict Events in the Horn of Africa region (CEHA) and propose a new task for identifying violent conflict events using online resources with this dataset. The dataset consists of 500 English event descriptions regarding conflict events in the Horn of Africa region with fine-grained event-type definitions that emphasize the cause of the conflict. This dataset categorizes the key types of conflict risk according to specific areas required by stakeholders in the Humanitarian-Peace-Development Nexus. Additionally, we conduct extensive experiments on two tasks supported by this dataset: Event-relevance Classification and Event-type Classification. Our baseline models demonstrate the challenging nature of these tasks and the usefulness of our dataset for model evaluations in low-resource settings with limited number of training data.

* Accepted by COLING 2025

Via

Access Paper or Ask Questions

Uchaguzi-2022: A Dataset of Citizen Reports on the 2022 Kenyan Election

Dec 17, 2024

Roberto Mondini, Neema Kotonya, Robert L. Logan IV, Elizabeth M Olson, Angela Oduor Lungati, Daniel Duke Odongo, Tim Ombasa, Hemank Lamba, Aoife Cahill, Joel R. Tetreault(+1 more)

Figure 1 for Uchaguzi-2022: A Dataset of Citizen Reports on the 2022 Kenyan Election

Figure 2 for Uchaguzi-2022: A Dataset of Citizen Reports on the 2022 Kenyan Election

Figure 3 for Uchaguzi-2022: A Dataset of Citizen Reports on the 2022 Kenyan Election

Figure 4 for Uchaguzi-2022: A Dataset of Citizen Reports on the 2022 Kenyan Election

Abstract:Online reporting platforms have enabled citizens around the world to collectively share their opinions and report in real time on events impacting their local communities. Systematically organizing (e.g., categorizing by attributes) and geotagging large amounts of crowdsourced information is crucial to ensuring that accurate and meaningful insights can be drawn from this data and used by policy makers to bring about positive change. These tasks, however, typically require extensive manual annotation efforts. In this paper we present Uchaguzi-2022, a dataset of 14k categorized and geotagged citizen reports related to the 2022 Kenyan General Election containing mentions of election-related issues such as official misconduct, vote count irregularities, and acts of violence. We use this dataset to investigate whether language models can assist in scalably categorizing and geotagging reports, thus highlighting its potential application in the AI for Social Good space.

* COLING 2025

Via

Access Paper or Ask Questions

HumVI: A Multilingual Dataset for Detecting Violent Incidents Impacting Humanitarian Aid

Oct 08, 2024

Hemank Lamba, Anton Abilov, Ke Zhang, Elizabeth M. Olson, Henry k. Dambanemuya, João c. Bárcia, David S. Batista, Christina Wille, Aoife Cahill, Joel Tetreault(+1 more)

Figure 1 for HumVI: A Multilingual Dataset for Detecting Violent Incidents Impacting Humanitarian Aid

Figure 2 for HumVI: A Multilingual Dataset for Detecting Violent Incidents Impacting Humanitarian Aid

Figure 3 for HumVI: A Multilingual Dataset for Detecting Violent Incidents Impacting Humanitarian Aid

Figure 4 for HumVI: A Multilingual Dataset for Detecting Violent Incidents Impacting Humanitarian Aid

Abstract:Humanitarian organizations can enhance their effectiveness by analyzing data to discover trends, gather aggregated insights, manage their security risks, support decision-making, and inform advocacy and funding proposals. However, data about violent incidents with direct impact and relevance for humanitarian aid operations is not readily available. An automatic data collection and NLP-backed classification framework aligned with humanitarian perspectives can help bridge this gap. In this paper, we present HumVI - a dataset comprising news articles in three languages (English, French, Arabic) containing instances of different types of violent incidents categorized by the humanitarian sector they impact, e.g., aid security, education, food security, health, and protection. Reliable labels were obtained for the dataset by partnering with a data-backed humanitarian organization, Insecurity Insight. We provide multiple benchmarks for the dataset, employing various deep learning architectures and techniques, including data augmentation and mask loss, to address different task-related challenges, e.g., domain expansion. The dataset is publicly available at https://github.com/dataminr-ai/humvi-dataset.

Via

Access Paper or Ask Questions

Explain then Rank: Scale Calibration of Neural Rankers Using Natural Language Explanations from Large Language Models

Feb 19, 2024

Puxuan Yu, Daniel Cohen, Hemank Lamba, Joel Tetreault, Alex Jaimes

Abstract:The process of scale calibration in ranking systems involves adjusting the outputs of rankers to correspond with significant qualities like click-through rates or relevance, crucial for mirroring real-world value and thereby boosting the system's effectiveness and reliability. Although there has been research on calibrated ranking losses within learning-to-rank models, the particular issue of adjusting the scale for neural rankers, which excel in handling textual information, has not been thoroughly examined. Neural ranking models are adept at processing text data, yet the application of existing scale calibration techniques to these models poses significant challenges due to their complexity and the intensive training they require, often resulting in suboptimal outcomes. This study delves into the potential of large language models (LLMs) to provide uncertainty measurements for a query and document pair that correlate with the scale-calibrated scores. By employing Monte Carlo sampling to gauge relevance probabilities from LLMs and incorporating natural language explanations (NLEs) to articulate this uncertainty, we carry out comprehensive tests on two major document ranking datasets. Our findings reveal that the approach leveraging NLEs outperforms existing calibration methods under various training scenarios, leading to better calibrated neural rankers.

Via

Access Paper or Ask Questions

Dissecting users' needs for search result explanations

Jan 29, 2024

Prerna Juneja, Wenjuan Zhang, Alison Marie Smith-Renner, Hemank Lamba, Joel Tetreault, Alex Jaimes

Figure 1 for Dissecting users' needs for search result explanations

Figure 2 for Dissecting users' needs for search result explanations

Figure 3 for Dissecting users' needs for search result explanations

Figure 4 for Dissecting users' needs for search result explanations

Abstract:There is a growing demand for transparency in search engines to understand how search results are curated and to enhance users' trust. Prior research has introduced search result explanations with a focus on how to explain, assuming explanations are beneficial. Our study takes a step back to examine if search explanations are needed and when they are likely to provide benefits. Additionally, we summarize key characteristics of helpful explanations and share users' perspectives on explanation features provided by Google and Bing. Interviews with non-technical individuals reveal that users do not always seek or understand search explanations and mostly desire them for complex and critical tasks. They find Google's search explanations too obvious but appreciate the ability to contest search results. Based on our findings, we offer design recommendations for search engines and explanations to help users better evaluate search results and enhance their search experience.

Via

Access Paper or Ask Questions

Counterfactual Editing for Search Result Explanation

Jan 25, 2023

Zhichao Xu, Hemank Lamba, Qingyao Ai, Joel Tetreault, Alex Jaimes

Figure 1 for Counterfactual Editing for Search Result Explanation

Figure 2 for Counterfactual Editing for Search Result Explanation

Figure 3 for Counterfactual Editing for Search Result Explanation

Figure 4 for Counterfactual Editing for Search Result Explanation

Abstract:Recently substantial improvements in neural retrieval methods also bring to light the inherent blackbox nature of these methods, especially when viewed from an explainability perspective. Most of existing works on Search Result Explanation (SeRE) are designed to provide factual explanation, i.e. to find/generate supporting evidence about documents' relevance to search queries. However, research in cognitive sciences have shown that human explanations are contrastive i.e. people explain an observed event using some counterfactual events; such explanations reduce cognitive load, and provide actionable insights. Though already proven effective in machine learning and NLP communities, the formulation and impact of counterfactual explanations have not been well studied for search systems. In this work, we aim to investigate the effectiveness of this perspective via proposing and evaluating counterfactual explanations for the task of SeRE. Specifically, we first conduct a user study where we investigate if counterfactual explanations indeed improve search sessions' effectiveness. Taking this as a motivation, we discuss the desiderata that an ideal counterfactual explanation method for SeRE should adhere to. Next, we propose a method $\text{CFE}^2$ (\textbf{C}ounter\textbf{F}actual \textbf{E}xplanation with \textbf{E}diting) to provide pairwise explanations to search engine result page. Finally, we showcase that the proposed method when evaluated on four publicly available datasets outperforms baselines on both metrics and human evaluation.

* work in progress

Via

Access Paper or Ask Questions

An Empirical Comparison of Bias Reduction Methods on Real-World Problems in High-Stakes Policy Settings

May 13, 2021

Hemank Lamba, Kit T. Rodolfa, Rayid Ghani

Figure 1 for An Empirical Comparison of Bias Reduction Methods on Real-World Problems in High-Stakes Policy Settings

Figure 2 for An Empirical Comparison of Bias Reduction Methods on Real-World Problems in High-Stakes Policy Settings

Figure 3 for An Empirical Comparison of Bias Reduction Methods on Real-World Problems in High-Stakes Policy Settings

Figure 4 for An Empirical Comparison of Bias Reduction Methods on Real-World Problems in High-Stakes Policy Settings

Abstract:Applications of machine learning (ML) to high-stakes policy settings -- such as education, criminal justice, healthcare, and social service delivery -- have grown rapidly in recent years, sparking important conversations about how to ensure fair outcomes from these systems. The machine learning research community has responded to this challenge with a wide array of proposed fairness-enhancing strategies for ML models, but despite the large number of methods that have been developed, little empirical work exists evaluating these methods in real-world settings. Here, we seek to fill this research gap by investigating the performance of several methods that operate at different points in the ML pipeline across four real-world public policy and social good problems. Across these problems, we find a wide degree of variability and inconsistency in the ability of many of these methods to improve model fairness, but post-processing by choosing group-specific score thresholds consistently removes disparities, with important implications for both the ML research community and practitioners deploying machine learning to inform consequential policy decisions.

* 17 pages, 9 figures, 2 tables

Via

Access Paper or Ask Questions

Machine learning for public policy: Do we need to sacrifice accuracy to make models fair?

Dec 05, 2020

Kit T. Rodolfa, Hemank Lamba, Rayid Ghani

Figure 1 for Machine learning for public policy: Do we need to sacrifice accuracy to make models fair?

Figure 2 for Machine learning for public policy: Do we need to sacrifice accuracy to make models fair?

Figure 3 for Machine learning for public policy: Do we need to sacrifice accuracy to make models fair?

Figure 4 for Machine learning for public policy: Do we need to sacrifice accuracy to make models fair?

Abstract:Growing applications of machine learning in policy settings have raised concern for fairness implications, especially for racial minorities, but little work has studied the practical trade-offs between fairness and accuracy in real-world settings. This empirical study fills this gap by investigating the accuracy cost of mitigating disparities across several policy settings, focusing on the common context of using machine learning to inform benefit allocation in resource-constrained programs across education, mental health, criminal justice, and housing safety. In each setting, explicitly focusing on achieving equity and using our proposed post-hoc disparity mitigation methods, fairness was substantially improved without sacrificing accuracy, challenging the commonly held assumption that reducing disparities either requires accepting an appreciable drop in accuracy or the development of novel, complex methods.

* 40 pages, 4 figures, 7 supplementary figures, 5 supplementary tables

Via

Access Paper or Ask Questions

Explainable Machine Learning for Public Policy: Use Cases, Gaps, and Research Directions

Oct 27, 2020

Kasun Amarasinghe, Kit Rodolfa, Hemank Lamba, Rayid Ghani

Figure 1 for Explainable Machine Learning for Public Policy: Use Cases, Gaps, and Research Directions

Figure 2 for Explainable Machine Learning for Public Policy: Use Cases, Gaps, and Research Directions

Figure 3 for Explainable Machine Learning for Public Policy: Use Cases, Gaps, and Research Directions

Abstract:In Machine Learning (ML) models used for supporting decisions in high-stakes domains such as public policy, explainability is crucial for adoption and effectiveness. While the field of explainable ML has expanded in recent years, much of this work does not take real-world needs into account. A majority of proposed methods use benchmark ML problems with generic explainability goals without clear use-cases or intended end-users. As a result, the effectiveness of this large body of theoretical and methodological work on real-world applications is unclear. This paper focuses on filling this void for the domain of public policy. We develop a taxonomy of explainability use-cases within public policy problems; for each use-case, we define the end-users of explanations and the specific goals explainability has to fulfill; third, we map existing work to these use-cases, identify gaps, and propose research directions to fill those gaps in order to have practical policy impact through ML.

Via

Access Paper or Ask Questions