Kaijun Tan

M-DocSum: Do LVLMs Genuinely Comprehend Interleaved Image-Text in Document Summarization?

Mar 27, 2025
Via arXiv

Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model

Mar 14, 2025
Via arXiv

CheapNET: Improving Light-weight speech enhancement network by projected loss function

Nov 27, 2023
Via arXiv