Yong Li

Tsinghua University

The Point, the Vision and the Text: Does Point Cloud Boost Spatial Reasoning of Large Language Models?

Apr 06, 2025
Via arXiv

CoSIL: Software Issue Localization via LLM-Driven Code Repository Graph Searching

Mar 28, 2025
Via arXiv

Wan: Open and Advanced Large-Scale Video Generative Models

Mar 26, 2025
Via arXiv

Open3DVQA: A Benchmark for Comprehensive Spatial Reasoning with Multimodal Large Language Model in Open Space

Mar 14, 2025
Via arXiv

Beyond Overfitting: Doubly Adaptive Dropout for Generalizable AU Detection

Mar 12, 2025
Via arXiv

Decoupled Doubly Contrastive Learning for Cross Domain Facial Action Unit Detection

Mar 12, 2025
Via arXiv

EPR-GAIL: An EPR-Enhanced Hierarchical Imitation Learning Framework to Simulate Complex User Consumption Behaviors

Mar 09, 2025
Via arXiv

Causal Discovery and Inference towards Urban Elements and Associated Factors

Mar 09, 2025
Via arXiv

Causality Enhanced Origin-Destination Flow Prediction in Data-Scarce Cities

Mar 09, 2025
Via arXiv

UrbanVideo-Bench: Benchmarking Vision-Language Models on Embodied Intelligence with Video Data in Urban Spaces

Mar 08, 2025
Via arXiv