Hanming Deng

EVA: Efficient Reinforcement Learning for End-to-End Video Agent

Mar 24, 2026
Via arXiv

ACPO: Counteracting Likelihood Displacement in Vision-Language Alignment with Asymmetric Constraints

Mar 23, 2026
Via arXiv

SenseNova-MARS: Empowering Multimodal Agentic Reasoning and Search via Reinforcement Learning

Dec 30, 2025
Via arXiv

Scaling Spatial Intelligence with Multimodal Foundation Models

Nov 17, 2025
Via arXiv

From Pixels to Words -- Towards Native Vision-Language Primitives at Scale

Oct 16, 2025
Via arXiv

Has GPT-5 Achieved Spatial Intelligence? An Empirical Study

Aug 18, 2025
Via arXiv

HCRMP: A LLM-Hinted Contextual Reinforcement Learning Framework for Autonomous Driving

May 21, 2025
Via arXiv

M2DA: Multi-Modal Fusion Transformer Incorporating Driver Attention for Autonomous Driving

Mar 19, 2024
Via arXiv

DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving

Dec 25, 2023
Via arXiv

Scene as Occupancy

Jun 06, 2023
Via arXiv