Hanna Wallach

Validating LLM-as-a-Judge Systems in the Absence of Gold Labels

Mar 07, 2025
Via arXiv

Toward an Evaluation Science for Generative AI Systems

Mar 07, 2025
Via arXiv

Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice

Dec 09, 2024
Via arXiv

A Framework for Evaluating LLMs Under Task Indeterminacy

Nov 21, 2024
Via arXiv

A Framework for Automated Measurement of Responsible AI Harms in Generative AI Applications

Oct 26, 2023
Via arXiv

"One-size-fits-all"? Observations and Expectations of NLG Systems Across Identity-Related Language Features

Oct 23, 2023
Via arXiv

Measuring Representational Harms in Image Captioning

Jun 14, 2022
Via arXiv

Understanding Machine Learning Practitioners' Data Documentation Perceptions, Needs, Challenges, and Desiderata

Jun 06, 2022
Via arXiv

REAL ML: Recognizing, Exploring, and Articulating Limitations of Machine Learning Research

May 05, 2022
Via arXiv

Assessing the Fairness of AI Systems: AI Practitioners' Processes, Challenges, and Needs for Support

Dec 10, 2021
Via arXiv