Title: Structured World Models from Human Videos

Aug 21, 2023

Russell Mendonca, Shikhar Bahl, Deepak Pathak

Abstract: We tackle the problem of learning complex, general behaviors directly in the real world. We propose an approach for robots to efficiently learn manipulation skills using only a handful of real-world interaction trajectories from many different settings. Inspired by the success of learning from large-scale datasets in the fields of computer vision and natural language, our belief is that in order to efficiently learn, a robot must be able to leverage internet-scale, human video data. Humans interact with the world in many interesting ways, which can allow a robot to not only build an understanding of useful actions and affordances but also how these actions affect the world for manipulation. Our approach builds a structured, human-centric action space grounded in visual affordances learned from human videos. Further, we train a world model on human videos and fine-tune on a small amount of robot interaction data without any task supervision. We show that this approach of affordance-space world models enables different robots to learn various manipulation skills in complex settings, in under 30 minutes of interaction. Videos can be found at https://human-world-model.github.io

* RSS 2023. Website at https://human-world-model.github.io
