Shanghang Zhang

MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders

Jan 03, 2025

SCBench: A Sports Commentary Benchmark for Video LLMs

Dec 23, 2024

RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation

Dec 18, 2024

GaussianAD: Gaussian-Centric End-to-End Autonomous Driving

Dec 13, 2024

GPD-1: Generative Pre-training for Driving

Dec 11, 2024

ASGDiffusion: Parallel High-Resolution Generation with Asynchronous Structure Guidance

Dec 09, 2024

Stag-1: Towards Realistic 4D Driving Simulation with Video Generation Model

Dec 06, 2024

[CLS] Attention is All You Need for Training-Free Visual Token Pruning: Make VLM Inference Faster

Dec 02, 2024

Proactive Gradient Conflict Mitigation in Multi-Task Learning: A Sparse Training Perspective

Nov 27, 2024

Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation

Nov 27, 2024