Picture for Jian Luan

Jian Luan

Video-OPD: Efficient Post-Training of Multimodal Large Language Models for Temporal Video Grounding via On-Policy Distillation

Add code
Feb 03, 2026
Viaarxiv icon

Restoring Exploration after Post-Training: Latent Exploration Decoding for Large Reasoning Models

Add code
Feb 02, 2026
Viaarxiv icon

FutureMind: Equipping Small Language Models with Strategic Thinking-Pattern Priors via Adaptive Knowledge Distillation

Add code
Feb 01, 2026
Viaarxiv icon

MobileBench-OL: A Comprehensive Chinese Benchmark for Evaluating Mobile GUI Agents in Real-World Environment

Add code
Jan 29, 2026
Viaarxiv icon

R^3: Replay, Reflection, and Ranking Rewards for LLM Reinforcement Learning

Add code
Jan 27, 2026
Viaarxiv icon

GAIA: A Data Flywheel System for Training GUI Test-Time Scaling Critic Models

Add code
Jan 26, 2026
Viaarxiv icon

Federated Balanced Learning

Add code
Jan 20, 2026
Viaarxiv icon

Xiaomi MiMo-VL-Miloco Technical Report

Add code
Dec 22, 2025
Figure 1 for Xiaomi MiMo-VL-Miloco Technical Report
Figure 2 for Xiaomi MiMo-VL-Miloco Technical Report
Figure 3 for Xiaomi MiMo-VL-Miloco Technical Report
Figure 4 for Xiaomi MiMo-VL-Miloco Technical Report
Viaarxiv icon

STEP: Success-Rate-Aware Trajectory-Efficient Policy Optimization

Add code
Nov 17, 2025
Viaarxiv icon

REVISOR: Beyond Textual Reflection, Towards Multimodal Introspective Reasoning in Long-Form Video Understanding

Add code
Nov 17, 2025
Viaarxiv icon