Fengshuo Bai

Efficient Model-agnostic Alignment via Bayesian Persuasion

May 29, 2024
Via arXiv

Incentive Compatibility for AI Alignment in Sociotechnical Systems: Positions and Prospects

Mar 01, 2024
Via arXiv

Measuring Value Understanding in Language Models through Discriminator-Critique Gap

Oct 19, 2023
Via arXiv

Zero-shot Preference Learning for Offline RL via Optimal Transport

Jun 06, 2023
Via arXiv