Lauren Klein

Provocations from the Humanities for Generative AI Research

Feb 26, 2025

Lauren Klein, Meredith Martin, André Brock, Maria Antoniak, Melanie Walsh, Jessica Marie Johnson, Lauren Tilton, David Mimno

Abstract: This paper presents a set of provocations for considering the uses, impact, and harms of generative AI from the perspective of humanities researchers. We provide a working definition of humanities research, summarize some of its most salient theories and methods, and apply these theories and methods to the current landscape of AI. Drawing from foundational work in critical data studies, along with relevant humanities scholarship, we elaborate eight claims with broad applicability to current conversations about generative AI: 1) Models make words, but people make meaning; 2) Generative AI requires an expanded definition of culture; 3) Generative AI can never be representative; 4) Bigger models are not always better models; 5) Not all training data is equivalent; 6) Openness is not an easy fix; 7) Limited access to compute enables corporate capture; and 8) AI universalism creates narrow human subjects. We conclude with a discussion of the importance of resisting the extraction of humanities research by computer science and related fields.

* working draft; final draft in preparation

Words and Action: Modeling Linguistic Leadership in #BlackLivesMatter Communities

Dec 03, 2024

Dani Roytburg, Deborah Olorunisola, Sandeep Soni, Lauren Klein

Abstract: In this project, we describe a method of modeling semantic leadership across a set of communities associated with the #BlackLivesMatter movement, which has been informed by qualitative research on the structure of social media and Black Twitter in particular. We describe our bespoke approaches to time-binning, community clustering, and connecting communities over time, as well as our adaptation of state-of-the-art approaches to semantic change detection and semantic leadership induction. We find substantial evidence of the leadership role of BLM activists and progressives, as well as Black celebrities. We also find evidence of the sustained engagement of the conservative community with this discourse, suggesting an alternative explanation for how we arrived at the present moment, in which "anti-woke" and "anti-CRT" bills are being enacted nationwide.

* Accepted at ICWSM 2025; minor revisions forthcoming

Data Feminism for AI

May 02, 2024

Lauren Klein, Catherine D'Ignazio

Abstract: This paper presents a set of intersectional feminist principles for conducting equitable, ethical, and sustainable AI research. In Data Feminism (2020), we offered seven principles for examining and challenging unequal power in data science. Here, we present a rationale for why feminism remains deeply relevant for AI research, rearticulate the original principles of data feminism with respect to AI, and introduce two potential new principles related to environmental impact and consent. Together, these principles help to 1) account for the unequal, undemocratic, extractive, and exclusionary forces at work in AI research, development, and deployment; 2) identify and mitigate predictable harms in advance of unsafe, discriminatory, or otherwise oppressive systems being released into the world; and 3) inspire creative, joyful, and collective ways to work towards a more equitable, sustainable world in which all of us can thrive.

* 21 pages, to be published in the 2024 ACM Conference on Fairness, Accountability, and Transparency (FAccT '24)

AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters

Jan 16, 2024

Li Lucy, Suchin Gururangan, Luca Soldaini, Emma Strubell, David Bamman, Lauren Klein, Jesse Dodge

Abstract: Large language models' (LLMs) abilities are drawn from their pretraining data, and model development begins with data curation. However, decisions around what data is retained or removed during this initial stage are under-scrutinized. In our work, we ground web text, which is a popular pretraining data source, to its social and geographic contexts. We create a new dataset of 10.3 million self-descriptions of website creators, and extract information about who they are and where they are from: their topical interests, social roles, and geographic affiliations. Then, we conduct the first study investigating how ten "quality" and English language identification (langID) filters affect webpages that vary along these social dimensions. Our experiments illuminate a range of implicit preferences in data curation: we show that some quality classifiers act like topical domain filters, and langID can overlook English content from some regions of the world. Overall, we hope that our work will encourage a new line of research on pretraining data curation practices and their social implications.

* 28 pages, 13 figures

Evaluating Temporal Patterns in Applied Infant Affect Recognition

Sep 07, 2022

Allen Chang, Lauren Klein, Marcelo R. Rosales, Weiyang Deng, Beth A. Smith, Maja J. Matarić

Abstract: Agents must monitor their partners' affective states continuously in order to understand and engage in social interactions. However, methods for evaluating affect recognition do not account for changes in classification performance that may occur during occlusions or transitions between affective states. This paper addresses temporal patterns in affect classification performance in the context of an infant-robot interaction, where infants' affective states contribute to their ability to participate in a therapeutic leg movement activity. To support robustness to facial occlusions in video recordings, we trained infant affect recognition classifiers using both facial and body features. Next, we conducted an in-depth analysis of our best-performing models to evaluate how performance changed over time as the models encountered missing data and changing infant affect. During time windows when features were extracted with high confidence, a unimodal model trained on facial features achieved the same optimal performance as multimodal models trained on both facial and body features. However, multimodal models outperformed unimodal models when evaluated on the entire dataset. Additionally, model performance was weakest when predicting an affective state transition and improved after multiple predictions of the same affective state. These findings emphasize the benefits of incorporating body features in continuous affect recognition for infants. Our work highlights the importance of evaluating variability in model performance both over time and in the presence of missing data when applying affect recognition to social interactions.

* 8 pages, 6 figures, 10th International Conference on Affective Computing and Intelligent Interaction (ACII 2022)
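
The abstract above contrasts performance at affective state transitions with performance during runs of the same state. As a rough illustration only, the sketch below splits accuracy over a sequence of time windows into those two cases. It is not the authors' evaluation code: the function name `accuracy_by_transition`, the label names, and the toy sequences are all assumptions made for this example, and a "transition" is assumed to be any window whose true label differs from the previous window's.

```python
def accuracy_by_transition(y_true, y_pred):
    """Return accuracy separately for transition and steady windows.

    Assumption for this sketch: a window counts as a "transition" when its
    true label differs from the previous window's true label. The first
    window has no predecessor and is skipped.
    """
    hits = {"transition": [], "steady": []}
    for i in range(1, len(y_true)):
        kind = "transition" if y_true[i] != y_true[i - 1] else "steady"
        hits[kind].append(y_true[i] == y_pred[i])
    # None marks a category with no windows, to avoid dividing by zero.
    return {k: (sum(v) / len(v) if v else None) for k, v in hits.items()}


# Toy sequences (hypothetical labels, not data from the paper).
y_true = ["neutral", "neutral", "positive", "positive", "positive", "negative", "negative"]
y_pred = ["neutral", "neutral", "neutral", "positive", "positive", "positive", "negative"]

result = accuracy_by_transition(y_true, y_pred)
print(result)
```

In this toy case the classifier lags one window behind every change of state, so it misses both transition windows and gets every steady window right. A single overall accuracy score would hide that pattern.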

Abolitionist Networks: Modeling Language Change in Nineteenth-Century Activist Newspapers

Mar 12, 2021

Sandeep Soni, Lauren Klein, Jacob Eisenstein

Abstract: The abolitionist movement of the nineteenth-century United States remains among the most significant social and political movements in US history. Abolitionist newspapers played a crucial role in spreading information and shaping public opinion around a range of issues relating to the abolition of slavery. These newspapers also serve as a primary source of information about the movement for scholars today, resulting in powerful new accounts of the movement and its leaders. This paper supplements recent qualitative work on the role of women in abolition's vanguard, as well as the role of the Black press, with a quantitative text modeling approach. Using diachronic word embeddings, we identify which newspapers tended to lead lexical semantic innovations -- the introduction of new usages of specific words -- and which newspapers tended to follow. We then aggregate the evidence across hundreds of changes into a weighted network with the newspapers as nodes; directed edge weights represent the frequency with which each newspaper led the other in the adoption of a lexical semantic change. Analysis of this network reveals pathways of lexical semantic influence, distinguishing leaders from followers, as well as others who stood apart from the semantic changes that swept through this period. More specifically, we find that two newspapers edited by women -- THE PROVINCIAL FREEMAN and THE LILY -- led a large number of semantic changes in our corpus, lending additional credence to the argument that a multiracial coalition of women led the abolitionist movement in terms of both thought and action. It also contributes additional complexity to the scholarship that has sought to tease apart the relation of the abolitionist movement to the women's suffrage movement, and the vexed racial politics that characterized their relation.

* Journal of Cultural Analytics (2021)
* 23 pages, 6 figures, 2 tables
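
The aggregation step described in the abstract above, turning many lead/follow observations into a weighted directed network, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's pipeline: it assumes the lead/follow pairs have already been detected (the abstract attributes that step to diachronic word embeddings), and the function names, the net-lead score, and the newspaper names "Paper A" through "Paper C" are all hypothetical.

```python
from collections import Counter, defaultdict


def build_lead_network(lead_events):
    """Count how often each newspaper led each other newspaper.

    lead_events: iterable of (leader, follower) pairs, one per detected
    semantic change in which `leader` adopted the new usage first.
    Returns a Counter mapping (leader, follower) to a directed edge weight.
    """
    weights = Counter()
    for leader, follower in lead_events:
        if leader != follower:  # ignore self-loops
            weights[(leader, follower)] += 1
    return weights


def net_lead_scores(weights):
    """Weighted out-degree minus weighted in-degree for every node.

    This score is an assumption of the sketch, chosen as one simple way
    to separate leaders (positive) from followers (negative).
    """
    scores = defaultdict(int)
    for (leader, follower), w in weights.items():
        scores[leader] += w
        scores[follower] -= w
    return dict(scores)


# Toy events (hypothetical newspapers, not results from the paper).
events = [
    ("Paper A", "Paper B"),
    ("Paper A", "Paper B"),
    ("Paper A", "Paper C"),
    ("Paper B", "Paper C"),
    ("Paper C", "Paper A"),
]

network = build_lead_network(events)
scores = net_lead_scores(network)
print(network)
print(scores)
```

Here "Paper A" has the only positive net score, so it would read as a leader in this toy network. The scores sum to zero by construction, since every edge adds its weight to one node and subtracts it from another.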
