Christoph Dann

Conditioned Language Policy: A General Framework for Steerable Multi-Objective Finetuning

Jul 22, 2024

Rate-Preserving Reductions for Blackwell Approachability

Jun 10, 2024

A Minimaximalist Approach to Reinforcement Learning from Human Feedback

Jan 08, 2024

Data-Driven Regret Balancing for Online Model Selection in Bandits

Jun 05, 2023

A Blackbox Approach to Best of Both Worlds in Bandits and Beyond

Feb 20, 2023

Best of Both Worlds Policy Optimization

Feb 18, 2023

Pseudonorm Approachability and Applications to Regret Minimization

Feb 03, 2023

Learning in POMDPs is Sample-Efficient with Hindsight Observability

Feb 03, 2023

A Unified Algorithm for Stochastic Path Problems

Oct 17, 2022

A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning

Aug 23, 2022