Wentong Li

OrderChain: A General Prompting Paradigm to Improve Ordinal Understanding Ability of MLLM

Apr 07, 2025
Via arXiv

Uncertainty-Instructed Structure Injection for Generalizable HD Map Construction

Mar 29, 2025
Via arXiv

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

Jan 08, 2025
Via arXiv

Scalable Autoregressive Monocular Depth Estimation

Nov 18, 2024
Via arXiv

ReliOcc: Towards Reliable Semantic Occupancy Prediction via Uncertainty Learning

Sep 26, 2024
Via arXiv

Fine-Grained Multi-View Hand Reconstruction Using Inverse Rendering

Jul 09, 2024
Via arXiv

TokenPacker: Efficient Visual Projector for Multimodal LLM

Jul 02, 2024
Via arXiv

Label-efficient Semantic Scene Completion with Scribble Annotations

May 24, 2024
Via arXiv

Not All Voxels Are Equal: Hardness-Aware Semantic Scene Completion with Self-Distillation

Apr 18, 2024
Via arXiv

MGMap: Mask-Guided Learning for Online Vectorized HD Map Construction

Apr 01, 2024
Via arXiv