Jonathan Drechsel

GRADIEND: Monosemantic Feature Learning within Neural Networks Applied to Gender Debiasing of Transformer Models

Feb 03, 2025
Via arXiv