Ayzaan Wahid

Vision Language Models are In-Context Value Learners

Nov 07, 2024

ALOHA Unleashed: A Simple Recipe for Robot Dexterity

Oct 17, 2024

Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers

Mar 19, 2024

Learning to Learn Faster from Human Feedback with Language Model Predictive Control

Feb 18, 2024

PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs

Feb 12, 2024

Open X-Embodiment: Robotic Learning Datasets and RT-X Models

Oct 17, 2023

Video Language Planning

Oct 16, 2023

RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

Jul 28, 2023

PaLM-E: An Embodied Multimodal Language Model

Mar 06, 2023

Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models

Nov 22, 2022