Archit Sharma

Test-Time Alignment via Hypothesis Reweighting

Dec 11, 2024

Policy Agnostic RL: Offline RL and Online RL Fine-Tuning of Any Class and Backbone

Dec 09, 2024

Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval

Oct 31, 2024

Towards Data-Centric RLHF: Simple Metrics for Preference Dataset Comparison

Sep 15, 2024

Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data

Apr 23, 2024

Stream of Search: Learning to Search in Language

Apr 01, 2024

DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset

Mar 19, 2024

Yell At Your Robot: Improving On-the-Fly from Language Corrections

Mar 19, 2024

A Critical Evaluation of AI Feedback for Aligning Large Language Models

Feb 19, 2024

RLVF: Learning from Verbal Feedback without Overgeneralization

Feb 16, 2024