Jianjie Fang

The Point, the Vision and the Text: Does Point Cloud Boost Spatial Reasoning of Large Language Models?

Apr 06, 2025
Via arXiv

UrbanVideo-Bench: Benchmarking Vision-Language Models on Embodied Intelligence with Video Data in Urban Spaces

Mar 08, 2025
Via arXiv

EmbodiedCity: A Benchmark Platform for Embodied Agent in Real-world City Environment

Oct 12, 2024
Via arXiv