Yao Lu

CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models

Mar 27, 2025
Via arXiv

Scaling Vision Pre-Training to 4K Resolution

Mar 25, 2025
Via arXiv

PP-DocBee: Improving Multimodal Document Understanding Through a Bag of Tricks

Mar 06, 2025
Via arXiv

Token-Efficient Long Video Understanding for Multimodal LLMs

Mar 06, 2025
Via arXiv

OkraLong: A Flexible Retrieval-Augmented Framework for Long-Text Query Processing

Mar 05, 2025
Via arXiv

WorldModelBench: Judging Video Generation Models As World Models

Feb 28, 2025
Via arXiv

MCLRL: A Multi-Domain Contrastive Learning with Reinforcement Learning Framework for Few-Shot Modulation Recognition

Feb 26, 2025
Via arXiv

LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention

Feb 20, 2025
Via arXiv

Multilingual Language Model Pretraining using Machine-translated Data

Feb 18, 2025
Via arXiv

Beyond Window-Based Detection: A Graph-Centric Framework for Discrete Log Anomaly Detection

Jan 21, 2025
Via arXiv