Anikait Singh

Test-Time Alignment via Hypothesis Reweighting

Dec 11, 2024
Via arXiv

Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation

Oct 03, 2024
Via arXiv

D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning

Aug 15, 2024
Via arXiv

Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data

Apr 23, 2024
Via arXiv

Open X-Embodiment: Robotic Learning Datasets and RT-X Models

Oct 17, 2023
Via arXiv

Robotic Offline RL from Internet Videos via Value-Function Pre-Training

Sep 22, 2023
Via arXiv

RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

Jul 28, 2023
Via arXiv

Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning

Mar 09, 2023
Via arXiv

Offline RL With Realistic Datasets: Heteroskedasticity and Support Constraints

Nov 21, 2022
Via arXiv

Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials

Oct 11, 2022
Via arXiv