Fengshuo Bai

$β$-DQN: Improving Deep Q-Learning By Evolving the Behavior

Jan 01, 2025, via arXiv

RAT: Adversarial Attacks on Deep Reinforcement Agents for Targeted Behaviors

Dec 14, 2024, via arXiv

Efficient Model-agnostic Alignment via Bayesian Persuasion

May 29, 2024, via arXiv

Incentive Compatibility for AI Alignment in Sociotechnical Systems: Positions and Prospects

Mar 01, 2024, via arXiv

Measuring Value Understanding in Language Models through Discriminator-Critique Gap

Oct 19, 2023, via arXiv

Zero-shot Preference Learning for Offline RL via Optimal Transport

Jun 06, 2023, via arXiv