Fahim Tajwar

Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data

Apr 23, 2024

Offline Retraining for Online RL: Decoupled Policy Learning to Mitigate Exploration Bias

Oct 12, 2023

Conservative Prediction via Data-Driven Confidence Minimization

Jun 08, 2023

Surgical Fine-Tuning Improves Adaptation to Distribution Shifts

Oct 20, 2022

When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning

Oct 19, 2022

Do Deep Networks Transfer Invariances Across Classes?

Mar 18, 2022

No True State-of-the-Art? OOD Detection Methods are Inconsistent across Datasets

Sep 12, 2021