Roberto-Rafael Maura-Rivero

Jackpot! Alignment as a Maximal Lottery

Jan 31, 2025
Via arXiv

Utility-inspired Reward Transformations Improve Reinforcement Learning Training of Language Models

Jan 08, 2025
Via arXiv

Soft Condorcet Optimization for Ranking of General Agents

Nov 04, 2024
Via arXiv