Zeming Li

ManipGPT: Is Affordance Segmentation by Large Vision Models Enough for Articulated Object Manipulation?

Dec 13, 2024
Via arXiv

Multi-modal Relation Distillation for Unified 3D Representation Learning

Jul 19, 2024
Via arXiv

4K4DGen: Panoramic 4D Generation at 4K Resolution

Jun 19, 2024
Via arXiv

HumanSplat: Generalizable Single-Image Human Gaussian Splatting with Structure Priors

Jun 18, 2024
Via arXiv

Observation, Analysis, and Solution: Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training

Apr 18, 2024
Via arXiv

HMD-Poser: On-Device Real-time Human Motion Tracking from Scalable Sparse Observations

Mar 06, 2024
Via arXiv

GMM: Delving into Gradient Aware and Model Perceive Depth Mining for Monocular 3D Detection

Jun 30, 2023
Via arXiv

Dynamic Grained Encoder for Vision Transformers

Jan 10, 2023
Via arXiv

MatrixVT: Efficient Multi-Camera to BEV Transformation for 3D Perception

Nov 19, 2022
Via arXiv

BEVStereo: Enhancing Depth Estimation in Multi-view 3D Object Detection with Dynamic Temporal Stereo

Sep 21, 2022
Via arXiv