Richard Edgar

Steering Language Model Refusal with Sparse Autoencoders

Nov 18, 2024
Via arXiv

Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine

Nov 28, 2023
Via arXiv

A Framework for Automated Measurement of Responsible AI Harms in Generative AI Applications

Oct 26, 2023
Via arXiv

Fairlearn: Assessing and Improving Fairness of AI Systems

Mar 29, 2023
Via arXiv