Abstract: Despite numerous efforts to mitigate their biases, ML systems continue to harm already-marginalized people. While predominant ML approaches assume that bias can be removed and fair models can be created, we show that these goals are not always possible or desirable. We reframe the problem of ML bias by creating models to identify biased language, drawing attention to a dataset's biases rather than trying to remove them. Then, through a workshop, we evaluated the models for a specific use case: the workflows of information and heritage professionals. Our findings demonstrate the limitations of ML for identifying bias: bias is contextual, approaches to mitigating it can simultaneously privilege and oppress different communities, and bias itself is inevitable. We demonstrate the need to expand ML approaches to bias and fairness, providing a mixed-methods approach for investigating the feasibility of removing bias or achieving fairness in a given ML use case.
Abstract: We propose a bias-aware methodology to engage with power relations in natural language processing (NLP) research. NLP research rarely engages with bias in social contexts, limiting its ability to mitigate bias. While researchers have recommended actions, technical methods, and documentation practices, no methodology yet exists that integrates critical reflection on bias with technical NLP methods. In this paper, drawing on an extensive and interdisciplinary literature review, we contribute a bias-aware methodology for NLP research. We also contribute a definition of biased text, a discussion of the implications of biased NLP systems, and a case study demonstrating how we are executing the bias-aware methodology in research on archival metadata descriptions.