Karmesh Yadav

Let's Think in Two Steps: Mitigating Agreement Bias in MLLMs with Self-Grounded Verification

Jul 15, 2025
Via arXiv

FindingDory: A Benchmark to Evaluate Memory in Embodied Agents

Jun 18, 2025
Via arXiv

Magistral

Jun 12, 2025
Via arXiv

ReLIC: A Recipe for 64k Steps of In-Context Reinforcement Learning for Embodied AI

Oct 03, 2024
Via arXiv

Towards Open-World Mobile Manipulation in Homes: Lessons from the Neurips 2023 HomeRobot Open Vocabulary Mobile Manipulation Challenge

Jul 09, 2024
Via arXiv

Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control

May 09, 2024
Via arXiv

What do we learn from a large-scale study of pre-trained visual representations in sim and real environments?

Oct 03, 2023
Via arXiv

HomeRobot: Open-Vocabulary Mobile Manipulation

Jun 20, 2023
Via arXiv

Navigating to Objects Specified by Images

Apr 03, 2023
Via arXiv

Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence?

Mar 31, 2023
Via arXiv