Jinrong Yang

VIRT: Vision Instructed Transformer for Robotic Manipulation

Oct 09, 2024
Via arXiv

Self-supervised Pre-training for Transferable Multi-modal Perception

May 28, 2024
Via arXiv

Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models

Dec 11, 2023
Via arXiv

Merlin: Empowering Multimodal LLMs with Foresight Minds

Nov 30, 2023
Via arXiv

DreamLLM: Synergistic Multimodal Comprehension and Creation

Sep 20, 2023
Via arXiv

ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning

Jul 18, 2023
Via arXiv

GroupLane: End-to-End 3D Lane Detection with Channel-wise Grouping

Jul 18, 2023
Via arXiv

GMM: Delving into Gradient Aware and Model Perceive Depth Mining for Monocular 3D Detection

Jun 30, 2023
Via arXiv

BEVStereo++: Accurate Depth Estimation in Multi-view 3D Object Detection via Dynamic Temporal Stereo

Apr 09, 2023
Via arXiv

Exploring Recurrent Long-term Temporal Fusion for Multi-view 3D Perception

Mar 13, 2023
Via arXiv