Gerardine Meaney

Cultural Analytics for Good: Building Inclusive Evaluation Frameworks for Historical IR

Jan 17, 2026

Suchana Datta, Dwaipayan Roy, Derek Greene, Gerardine Meaney, Karen Wade, Philipp Mayr

Abstract: This work bridges the fields of information retrieval and cultural analytics to support equitable access to historical knowledge. Using the British Library BL19 digital collection (more than 35,000 works from 1700-1899), we construct a benchmark for studying changes in language, terminology and retrieval in 19th-century fiction and non-fiction. Our approach combines expert-driven query design, paragraph-level relevance annotation, and Large Language Model (LLM) assistance to create a scalable evaluation framework grounded in human expertise. We focus on knowledge transfer from fiction to non-fiction, investigating how narrative understanding and semantic richness in fiction can improve retrieval for scholarly and factual materials. This interdisciplinary framework not only improves retrieval accuracy but also fosters interpretability, transparency, and cultural inclusivity in digital archives. Our work provides both practical evaluation resources and a methodological paradigm for developing retrieval systems that support richer, historically aware engagement with digital archives, ultimately working towards more emancipatory knowledge infrastructures.

Unveiling Temporal Trends in 19th Century Literature: An Information Retrieval Approach

Jan 12, 2025

Suchana Datta, Dwaipayan Roy, Derek Greene, Gerardine Meaney

Abstract: In English literature, the 19th century witnessed a significant transition in styles, themes, and genres. Consequently, the novels from this period display remarkable diversity. This paper explores these variations by examining the evolution of term usage in 19th-century English novels through the lens of information retrieval. By applying a query expansion-based approach to a decade-segmented collection of fiction from the British Library, we examine how related terms vary over time. Our analysis employs multiple standard metrics, including Kendall's tau, Jaccard similarity, and Jensen-Shannon divergence, to assess overlaps and shifts in expanded query term sets. Our results indicate a significant degree of divergence in the related terms across decades as selected by the query expansion technique, suggesting substantial linguistic and conceptual changes across 19th-century novels.

* Accepted at JCDL 2024
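
The three metrics named in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the tau-a variant of Kendall's tau restricted to shared terms, the base-2 logarithm, and the toy decade term lists are all assumptions chosen for illustration.

```python
import math


def jaccard(terms_a, terms_b):
    """Jaccard similarity between two expanded-term sets."""
    a, b = set(terms_a), set(terms_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def js_divergence(weights_p, weights_q):
    """Jensen-Shannon divergence (base 2) between two term-weight dicts.

    Weights are normalised to probabilities first; the result lies in [0, 1].
    """
    vocab = set(weights_p) | set(weights_q)
    total_p, total_q = sum(weights_p.values()), sum(weights_q.values())
    p = {t: weights_p.get(t, 0.0) / total_p for t in vocab}
    q = {t: weights_q.get(t, 0.0) / total_q for t in vocab}
    m = {t: 0.5 * (p[t] + q[t]) for t in vocab}

    def kl(x):
        # Kullback-Leibler divergence of x from the mixture m.
        return sum(x[t] * math.log2(x[t] / m[t]) for t in vocab if x[t] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def kendall_tau(ranked_a, ranked_b):
    """Kendall's tau-a computed over the terms both ranked lists share."""
    pos_b = {t: i for i, t in enumerate(ranked_b)}
    shared = [t for t in ranked_a if t in pos_b]
    n = len(shared)
    if n < 2:
        raise ValueError("need at least two shared terms")
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            # shared is in ranked_a order, so the pair agrees when
            # ranked_b also places shared[i] before shared[j].
            if pos_b[shared[i]] < pos_b[shared[j]]:
                concordant += 1
            else:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical expansion terms for one query in two decades (toy data).
terms_1830s = ["carriage", "coach", "horse", "road"]
terms_1880s = ["railway", "carriage", "train", "coach"]

if __name__ == "__main__":
    print("Jaccard:", jaccard(terms_1830s, terms_1880s))
    print("Kendall tau:", kendall_tau(terms_1830s, terms_1880s))
```

A low Jaccard score or a high Jensen-Shannon divergence between two decades would indicate that the expansion terms for a query have shifted, which is the kind of signal the abstract describes.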

Curatr: A Platform for Semantic Analysis and Curation of Historical Literary Texts

Jun 13, 2023

Susan Leavy, Gerardine Meaney, Karen Wade, Derek Greene

Abstract: The increasing availability of digital collections of historical and contemporary literature presents a wealth of possibilities for new research in the humanities. The scale and diversity of such collections, however, present particular challenges in identifying and extracting relevant content. This paper presents Curatr, an online platform for the exploration and curation of literature with machine learning-supported semantic search, designed within the context of digital humanities scholarship. The platform provides a text mining workflow that combines neural word embeddings with expert domain knowledge to enable the generation of thematic lexicons, allowing researchers to curate relevant sub-corpora from a large corpus of 18th- and 19th-century digitised texts.

* Metadata and Semantic Research (MTSR 2019), Communications in Computer and Information Science, vol 1057. Springer, Cham
* 12 pages
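
One way to picture the lexicon workflow described in the Curatr abstract is the sketch below. It is an illustration under stated assumptions, not Curatr's code: the centroid-plus-cosine ranking, the function names, the `accept` callback standing in for an expert's judgment, and the three-dimensional toy vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def suggest_terms(seeds, embeddings, top_k=3):
    """Rank non-seed words by similarity to the centroid of the seed vectors."""
    seed_vecs = [embeddings[s] for s in seeds if s in embeddings]
    if not seed_vecs:
        return []
    dim = len(seed_vecs[0])
    centroid = [sum(v[i] for v in seed_vecs) / len(seed_vecs) for i in range(dim)]
    scored = [
        (word, cosine(centroid, vec))
        for word, vec in embeddings.items()
        if word not in seeds
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def expand_lexicon(seeds, embeddings, accept, top_k=3):
    """Add only the suggested terms that the expert callback approves."""
    lexicon = list(seeds)
    for word, _score in suggest_terms(seeds, embeddings, top_k):
        if accept(word):
            lexicon.append(word)
    return lexicon


# Toy vectors invented for illustration; a real model would be trained
# on the corpus and have far more dimensions.
TOY_EMBEDDINGS = {
    "fever": [0.9, 0.1, 0.0],
    "cholera": [0.8, 0.2, 0.1],
    "infection": [0.85, 0.15, 0.05],
    "physician": [0.6, 0.5, 0.1],
    "carriage": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    # The lambda plays the expert, rejecting one suggestion as off-theme.
    lexicon = expand_lexicon(
        ["fever", "cholera"],
        TOY_EMBEDDINGS,
        accept=lambda w: w != "physician",
        top_k=2,
    )
    print(lexicon)
```

Keeping the expert decision as a separate callback reflects the abstract's point that embeddings propose candidates while domain knowledge decides what enters the lexicon.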

Mitigating Gender Bias in Machine Learning Data Sets

May 18, 2020

Susan Leavy, Gerardine Meaney, Karen Wade, Derek Greene

Abstract: Artificial Intelligence has the capacity to amplify and perpetuate societal biases and presents profound ethical implications for society. Gender bias has been identified in the context of employment advertising and recruitment tools, due to their reliance on underlying language processing and recommendation algorithms. Attempts to address such issues have involved testing learned associations, integrating concepts of fairness into machine learning and performing more rigorous analysis of training data. Mitigating bias when algorithms are trained on textual data is particularly challenging given the complex way gender ideology is embedded in language. This paper proposes a framework for the identification of gender bias in training data for machine learning. The work draws upon gender theory and sociolinguistics to systematically indicate levels of bias in textual training data and associated neural word embedding models, thus highlighting pathways for both removing bias from training data and critically assessing its impact.

* 10 pages, 5 figures, 5 tables. Presented at the Bias2020 workshop (as part of the ECIR Conference) - http://bias.disim.univaq.it
