Alec Helyar

Rule Based Rewards for Language Model Safety

Nov 02, 2024
Via arXiv

A Framework for Automated Measurement of Responsible AI Harms in Generative AI Applications

Oct 26, 2023
Via arXiv