Roberto-Rafael Maura-Rivero

Utility-inspired Reward Transformations Improve Reinforcement Learning Training of Language Models

Jan 08, 2025, via arXiv

Soft Condorcet Optimization for Ranking of General Agents

Nov 04, 2024, via arXiv