Katherine Metcalf

On the Way to LLM Personalization: Learning to Remember User Conversations

Nov 20, 2024
Via arXiv

PREDICT: Preference Reasoning by Evaluating Decomposed preferences Inferred from Candidate Trajectories

Oct 08, 2024
Via arXiv

On the Limited Generalization Capability of the Implicit Reward Model Induced by Direct Preference Optimization

Sep 05, 2024
Via arXiv

Hindsight PRIORs for Reward Learning from Human Preferences

Apr 12, 2024
Via arXiv

Sample-Efficient Preference-based Reinforcement Learning with Dynamics Aware Rewards

Feb 28, 2024
Via arXiv

Large Language Models as Generalizable Policies for Embodied Tasks

Oct 26, 2023
Via arXiv

Rewards Encoding Environment Dynamics Improves Preference-based Reinforcement Learning

Nov 12, 2022
Via arXiv

Symbol Guided Hindsight Priors for Reward Learning from Human Preferences

Oct 19, 2022
Via arXiv

Towards a Perceptual Model for Estimating the Quality of Visual Speech

Mar 24, 2022
Via arXiv

FedEmbed: Personalized Private Federated Learning

Feb 18, 2022
Via arXiv