Gokul Swamy

Diffusing States and Matching Scores: A New Framework for Imitation Learning

Oct 17, 2024
Via arXiv

Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF

Oct 06, 2024
Via arXiv

EvIL: Evolution Strategies for Generalisable Imitation Learning

Jun 15, 2024
Via arXiv

Multi-Agent Imitation Learning: Value is Easy, Regret is Hard

Jun 06, 2024
Via arXiv

Understanding Preference Fine-Tuning Through the Lens of Coverage

Jun 03, 2024
Via arXiv

REBEL: Reinforcement Learning via Regressing Relative Rewards

Apr 25, 2024
Via arXiv

Hybrid Inverse Reinforcement Learning

Feb 13, 2024
Via arXiv

The Virtues of Pessimism in Inverse Reinforcement Learning

Feb 08, 2024
Via arXiv

A Minimaximalist Approach to Reinforcement Learning from Human Feedback

Jan 08, 2024
Via arXiv

Learning Shared Safety Constraints from Multi-task Demonstrations

Sep 01, 2023
Via arXiv