Yongjie Ye

Dynamic-VLM: Simple Dynamic Visual Token Compression for VideoLLM

Dec 12, 2024
Via arXiv

A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding

Jul 02, 2024
Via arXiv

TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy

Jun 03, 2024
Via arXiv

MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering

May 20, 2024
Via arXiv

Elysium: Exploring Object-level Perception in Videos via MLLM

Mar 29, 2024
Via arXiv