Lauren Tilton

Cultural Evaluations of Vision-Language Models Have a Lot to Learn from Cultural Theory

May 28, 2025

Srishti Yadav, Lauren Tilton, Maria Antoniak, Taylor Arnold, Jiaang Li, Siddhesh Milind Pawar, Antonia Karamolegkou, Stella Frank, Zhaochong An, Negar Rostamzadeh (+3 more)

Abstract: Modern vision-language models (VLMs) often fail at cultural competency evaluations and benchmarks. Given the diversity of applications built upon VLMs, there is renewed interest in understanding how they encode cultural nuances. While individual aspects of this problem have been studied, we still lack a comprehensive framework for systematically identifying and annotating the nuanced cultural dimensions present in images for VLMs. This position paper argues that foundational methodologies from visual culture studies (cultural studies, semiotics, and visual studies) are necessary for cultural analysis of images. Building upon this review, we propose a set of five frameworks, corresponding to cultural dimensions, that must be considered for a more complete analysis of the cultural competencies of VLMs.

Provocations from the Humanities for Generative AI Research

Feb 26, 2025

Lauren Klein, Meredith Martin, André Brock, Maria Antoniak, Melanie Walsh, Jessica Marie Johnson, Lauren Tilton, David Mimno

Abstract: This paper presents a set of provocations for considering the uses, impact, and harms of generative AI from the perspective of humanities researchers. We provide a working definition of humanities research, summarize some of its most salient theories and methods, and apply these theories and methods to the current landscape of AI. Drawing from foundational work in critical data studies, along with relevant humanities scholarship, we elaborate eight claims with broad applicability to current conversations about generative AI: 1) Models make words, but people make meaning; 2) Generative AI requires an expanded definition of culture; 3) Generative AI can never be representative; 4) Bigger models are not always better models; 5) Not all training data is equivalent; 6) Openness is not an easy fix; 7) Limited access to compute enables corporate capture; and 8) AI universalism creates narrow human subjects. We conclude with a discussion of the importance of resisting the extraction of humanities research by computer science and related fields.

* working draft; final draft in preparation

Explainable Search and Discovery of Visual Cultural Heritage Collections with Multimodal Large Language Models

Nov 07, 2024

Taylor Arnold, Lauren Tilton

Abstract: Many cultural institutions have made large digitized visual collections available online, often under permissible re-use licences. Creating interfaces for exploring and searching these collections is difficult, particularly in the absence of granular metadata. In this paper, we introduce a method for using state-of-the-art multimodal large language models (LLMs) to enable an open-ended, explainable search and discovery interface for visual collections. We show how our approach can create novel clustering and recommendation systems that avoid common pitfalls of methods based directly on visual embeddings. Of particular interest is the ability to offer concrete textual explanations of each recommendation without the need to preselect the features of interest. Together, these features can create a digital interface that is more open-ended and flexible while also being better suited to addressing privacy and ethical concerns. Through a case study using a collection of documentary photographs, we provide several metrics showing the efficacy and possibilities of our approach.

* 16 pages, CHR 2024: Computational Humanities Research Conference, December 4 - 6, 2024, Aarhus University, Denmark

Automated Image Color Mapping for a Historic Photographic Collection

Nov 07, 2024

Taylor Arnold, Lauren Tilton

Abstract: In the 1970s, the United States Environmental Protection Agency sponsored Documerica, a large-scale photography initiative to document environmental subjects nation-wide. While over 15,000 digitized public-domain photographs from the collection are available online, most of the images were scanned from damaged copies of the original prints. We present and evaluate a modified histogram matching technique based on the underlying chemistry of the prints for correcting the damaged images by using training data collected from a small set of undamaged prints. The entire set of color-adjusted Documerica images is made available in an open repository.

* 11 pages, CHR 2024: Computational Humanities Research Conference, December 4 - 6, 2024, Aarhus University, Denmark
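
For readers unfamiliar with the baseline that the abstract above builds on, here is a minimal sketch of plain histogram matching on a single color channel. It is not the paper's modified, chemistry-based technique, which the abstract does not specify; the function names `empirical_cdf` and `match_histogram` and the flat-list input format are assumptions made for illustration only.

```python
from bisect import bisect_left
from collections import Counter


def empirical_cdf(values):
    """Return the sorted distinct levels and their cumulative fractions."""
    counts = Counter(values)
    levels = sorted(counts)
    total = len(values)
    cdf, running = [], 0
    for level in levels:
        running += counts[level]
        cdf.append(running / total)
    return levels, cdf


def match_histogram(source, reference):
    """Remap one channel of `source` so its distribution resembles `reference`.

    Both arguments are flat lists of pixel intensities (hypothetical format).
    """
    src_levels, src_cdf = empirical_cdf(source)
    ref_levels, ref_cdf = empirical_cdf(reference)
    mapping = {}
    for level, quantile in zip(src_levels, src_cdf):
        # Pick the first reference level whose CDF reaches this quantile.
        idx = min(bisect_left(ref_cdf, quantile), len(ref_levels) - 1)
        mapping[level] = ref_levels[idx]
    return [mapping[v] for v in source]
```

In this sketch, a damaged scan would be the `source` and an undamaged print the `reference`, with the function applied once per color channel. How the paper adapts this step to the chemistry of the prints is not described in the abstract.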

Cross-Discourse and Multilingual Exploration of Textual Corpora with the DualNeighbors Algorithm

Jun 28, 2018

Taylor Arnold, Lauren Tilton

Abstract: Word choice is dependent on the cultural context of writers and their subjects. Different words are used to describe similar actions, objects, and features based on factors such as class, race, gender, geography and political affinity. Exploratory techniques based on locating and counting words may, therefore, lead to conclusions that reinforce culturally inflected boundaries. We offer a new method, the DualNeighbors algorithm, for linking thematically similar documents both within and across discursive and linguistic barriers to reveal cross-cultural connections. Qualitative and quantitative evaluations of this technique are shown as applied to two cultural datasets of interest to researchers across the humanities and social sciences. An open-source implementation of the DualNeighbors algorithm is provided to assist in its application.

* Chosen for oral presentation at 2nd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2018)
