Xingzhou Lou

Uncertainty-aware Reward Model: Teaching Reward Models to Know What is Unknown

Oct 01, 2024
Via arXiv

Position: Foundation Agents as the Paradigm Shift for Decision Making

May 29, 2024
Via arXiv

SPO: Multi-Dimensional Preference Sequential Alignment With Implicit Reward Modeling

May 21, 2024
Via arXiv

TAPE: Leveraging Agent Topology for Cooperative Multi-Agent Policy Gradient

Jan 15, 2024
Via arXiv

Safe Reinforcement Learning with Free-form Natural Language Constraints and Pre-Trained Language Models

Jan 15, 2024
Via arXiv

PECAN: Leveraging Policy Ensemble for Context-Aware Zero-Shot Human-AI Coordination

Jan 16, 2023
Via arXiv