Rahul Kidambi

Conditioned Language Policy: A General Framework for Steerable Multi-Objective Finetuning

Jul 22, 2024

Auctions with LLM Summaries

Apr 11, 2024

A Minimaximalist Approach to Reinforcement Learning from Human Feedback

Jan 08, 2024

Enhancing Group Fairness in Online Settings Using Oblique Decision Forests

Oct 17, 2023

Counterfactual Learning To Rank for Utility-Maximizing Query Autocompletion

Apr 22, 2022

Mitigating Covariate Shift in Imitation Learning via Offline Data Without Great Coverage

Jun 14, 2021

Making Paper Reviewing Robust to Bid Manipulation Attacks

Feb 22, 2021

Optimism is All You Need: Model-Based Imitation Learning From Observation Alone

Feb 22, 2021

Top-$k$ eXtreme Contextual Bandits with Arm Hierarchy

Feb 15, 2021

MOReL: Model-Based Offline Reinforcement Learning

May 12, 2020