Weichen Zhang

UrbanVideo-Bench: Benchmarking Vision-Language Models on Embodied Intelligence with Video Data in Urban Spaces

Mar 08, 2025
Via arXiv

Understanding and Evaluating Hallucinations in 3D Visual Language Models

Feb 18, 2025
Via arXiv

EmbodiedCity: A Benchmark Platform for Embodied Agent in Real-world City Environment

Oct 12, 2024
Via arXiv

GUI Action Narrator: Where and When Did That Action Take Place?

Jun 19, 2024
Via arXiv

ASSISTGUI: Task-Oriented Desktop Graphical User Interface Automation

Jan 01, 2024
Via arXiv

MA-NeRF: Motion-Assisted Neural Radiance Fields for Face Synthesis from Sparse Images

Jun 24, 2023
Via arXiv

Towards Arbitrary Text-driven Image Manipulation via Space Alignment

Jan 25, 2023
Via arXiv

Maximum-Margin Structured Learning with Deep Networks for 3D Human Pose Estimation

Aug 27, 2015
Via arXiv