
Sean Kirmani

Vision Language Models are In-Context Value Learners

Nov 07, 2024

STEER: Flexible Robotic Manipulation via Dense Language Grounding

Nov 05, 2024

RT-Affordance: Affordances are Versatile Intermediate Representations for Robot Manipulation

Nov 05, 2024

Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation

Sep 24, 2024

Mobility VLA: Multimodal Instruction Navigation with Long-Context VLMs and Topological Graphs

Jul 10, 2024

Evaluating Real-World Robot Manipulation Policies in Simulation

May 09, 2024

RT-Sketch: Goal-Conditioned Imitation Learning from Hand-Drawn Sketches

Mar 05, 2024

Learning to Learn Faster from Human Feedback with Language Model Predictive Control

Feb 18, 2024

PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs

Feb 12, 2024

AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents

Jan 23, 2024