Zhen Li

LMO, CELESTE, HEC Paris

Deep learning for 3D point cloud processing -- from approaches, tasks to its implications on urban and environmental applications

Sep 15, 2025

MDK12-Bench: A Comprehensive Evaluation of Multimodal Large Language Models on Multidisciplinary Exams

Aug 09, 2025

Oedipus and the Sphinx: Benchmarking and Improving Visual Language Models for Complex Graphic Reasoning

Aug 01, 2025

T2VParser: Adaptive Decomposition Tokens for Partial Alignment in Text to Video Retrieval

Jul 28, 2025

Yume: An Interactive World Generation Model

Jul 23, 2025

Lumina-mGPT 2.0: Stand-Alone AutoRegressive Image Modeling

Jul 23, 2025

Bradley-Terry and Multi-Objective Reward Modeling Are Complementary

Jul 10, 2025

SkyVLN: Vision-and-Language Navigation and NMPC Control for UAVs in Urban Environments

Jul 09, 2025

Sekai: A Video Dataset towards World Exploration

Jun 18, 2025

RelTopo: Enhancing Relational Modeling for Driving Scene Topology Reasoning

Jun 16, 2025