Ruimao Zhang

RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints

Mar 20, 2025

DriveGEN: Generalized and Robust 3D Detection in Driving via Controllable Text-to-Image Diffusion Generation

Mar 14, 2025

Semantic-Supervised Spatial-Temporal Fusion for LiDAR-based 3D Object Detection

Mar 13, 2025

Unlock the Power of Unlabeled Data in Language Driving Model

Mar 13, 2025

NavigateDiff: Visual Predictors are Zero-Shot Navigation Assistants

Feb 19, 2025

Motion-X++: A Large-Scale Multimodal 3D Whole-body Human Motion Dataset

Jan 09, 2025

ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning

Jan 08, 2025

ScaMo: Exploring the Scaling Law in Autoregressive Motion Generation Model

Dec 19, 2024

Ensuring Force Safety in Vision-Guided Robotic Manipulation via Implicit Tactile Calibration

Dec 13, 2024

KptLLM: Unveiling the Power of Large Language Model for Keypoint Comprehension

Nov 04, 2024