Yudong Liu

When No Answer Is Correct: Diagnosing Absent Answer Detection for MLLMs in Video Understanding

Jun 06, 2026
Via arXiv

Latent Bridge: Feature Delta Prediction for Efficient Dual-System Vision-Language-Action Model Inference

May 04, 2026
Via arXiv

Query-Conditioned Evidential Keyframe Sampling for MLLM-Based Long-Form Video Understanding

Apr 01, 2026
Via arXiv

LLaViDA: A Large Language Vision Driving Assistant for Explicit Reasoning and Enhanced Trajectory Planning

Dec 20, 2025
Via arXiv

Voice Evaluation of Reasoning Ability: Diagnosing the Modality-Induced Performance Gap

Sep 30, 2025
Via arXiv

CoreMatching: A Co-adaptive Sparse Inference Framework with Token and Neuron Pruning for Comprehensive Acceleration of Vision-Language Models

May 25, 2025
Via arXiv

Seed1.5-VL Technical Report

May 11, 2025
Via arXiv

Keyframe-oriented Vision Token Pruning: Enhancing Efficiency of Large Vision Language Models on Long-Form Video Processing

Mar 13, 2025
Via arXiv

SpeechPrune: Context-aware Token Pruning for Speech Information Retrieval

Dec 16, 2024
Via arXiv

Towards Automated Model Design on Recommender Systems

Nov 12, 2024
Via arXiv