Xinlei Chen

How to Enable LLM with 3D Capacity? A Survey of Spatial Reasoning in LLM

Apr 08, 2025
Via arXiv

The Point, the Vision and the Text: Does Point Cloud Boost Spatial Reasoning of Large Language Models?

Apr 06, 2025
Via arXiv

Scaling Language-Free Visual Representation Learning

Apr 01, 2025
Via arXiv

Towards Mobile Sensing with Event Cameras on High-mobility Resource-constrained Devices: A Survey

Mar 29, 2025
Via arXiv

Open3DVQA: A Benchmark for Comprehensive Spatial Reasoning with Multimodal Large Language Model in Open Space

Mar 14, 2025
Via arXiv

Transformers without Normalization

Mar 13, 2025
Via arXiv

Multi-Robot System for Cooperative Exploration in Unknown Environments: A Survey

Mar 10, 2025
Via arXiv

UrbanVideo-Bench: Benchmarking Vision-Language Models on Embodied Intelligence with Video Data in Urban Spaces

Mar 08, 2025
Via arXiv

Ultra-High-Frequency Harmony: mmWave Radar and Event Camera Orchestrate Accurate Drone Landing

Feb 20, 2025
Via arXiv

PLPHP: Per-Layer Per-Head Vision Token Pruning for Efficient Large Vision-Language Models

Feb 20, 2025
Via arXiv