Harshit Sikchi

OpenAI GPT-5 System Card
Dec 19, 2025

An Optimal Discriminator Weighted Imitation Perspective for Reinforcement Learning
Apr 17, 2025

Fast Adaptation with Behavioral Foundation Models
Apr 10, 2025

CREStE: Scalable Mapless Navigation with Internet Scale Priors and Counterfactual Guidance
Mar 05, 2025

RL Zero: Zero-Shot Language to Behaviors without any Supervision
Dec 07, 2024

Proto Successor Measure: Representing the Space of All Possible Solutions of Reinforcement Learning
Nov 29, 2024

A Dual Approach to Imitation Learning from Observations with Offline Datasets
Jun 13, 2024

Robot Air Hockey: A Manipulation Testbed for Robot Learning with Reinforcement Learning
May 06, 2024

Score Models for Offline Goal-Conditioned Reinforcement Learning
Nov 03, 2023

Contrastive Preference Learning: Learning from Human Feedback without RL
Oct 24, 2023