Minjie Zhu

Diffusion-VLA: Scaling Robot Foundation Models via Unified Diffusion and Autoregression

Dec 04, 2024
Via arXiv

Scaling Diffusion Policy in Transformer to 1 Billion Parameters for Robotic Manipulation

Sep 22, 2024
Via arXiv

MMRo: Are Multimodal LLMs Eligible as the Brain for In-Home Robotics?

Jun 28, 2024
Via arXiv

Mipha: A Comprehensive Overhaul of Multimodal Assistant with Small Language Models

Mar 15, 2024
Via arXiv

Language-Conditioned Robotic Manipulation with Fast and Slow Thinking

Feb 01, 2024
Via arXiv

LLaVA-Phi: Efficient Multi-Modal Assistant with Small Language Model

Jan 15, 2024
Via arXiv

Object-Centric Instruction Augmentation for Robotic Manipulation

Jan 05, 2024
Via arXiv

SpeechAct: Towards Generating Whole-body Motion from Speech

Nov 29, 2023
Via arXiv

Enhancing Event Sequence Modeling with Contrastive Relational Inference

Sep 06, 2023
Via arXiv