Zhigang Wang

AerialVG: A Challenging Benchmark for Aerial Visual Grounding by Exploring Positional Relations

Apr 10, 2025

Think Small, Act Big: Primitive Prompt Learning for Lifelong Robot Manipulation

Apr 01, 2025

MoMa-Kitchen: A 100K+ Benchmark for Affordance-Grounded Last-Mile Navigation in Mobile Manipulation

Mar 14, 2025

Multi-Grained Feature Pruning for Video-Based Human Pose Estimation

Mar 07, 2025

OpenFly: A Versatile Toolchain and Large-scale Benchmark for Aerial Vision-Language Navigation

Feb 25, 2025

Exploring the Potential of Encoder-free Architectures in 3D LMMs

Feb 13, 2025

SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model

Jan 27, 2025

SpatioTemporal Learning for Human Pose Estimation in Sparsely-Labeled Videos

Jan 25, 2025

Optimizing Human Pose Estimation Through Focused Human and Joint Regions

Jan 24, 2025

Causal-Inspired Multitask Learning for Video-Based Human Pose Estimation

Jan 24, 2025