Gokul Swamy

Efficient Imitation Under Misspecification (Mar 17, 2025)

All Roads Lead to Likelihood: The Value of Reinforcement Learning in Fine-Tuning (Mar 03, 2025)

From Foresight to Forethought: VLM-In-the-Loop Policy Steering via Latent Alignment (Feb 03, 2025)

Your Learned Constraint is Secretly a Backward Reachable Tube (Jan 26, 2025)

Diffusing States and Matching Scores: A New Framework for Imitation Learning (Oct 17, 2024)

Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF (Oct 06, 2024)

EvIL: Evolution Strategies for Generalisable Imitation Learning (Jun 15, 2024)

Multi-Agent Imitation Learning: Value is Easy, Regret is Hard (Jun 06, 2024)

Understanding Preference Fine-Tuning Through the Lens of Coverage (Jun 03, 2024)

REBEL: Reinforcement Learning via Regressing Relative Rewards (Apr 25, 2024)