Jintao Lin

V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding

Dec 12, 2024
Via arXiv

VLG: General Video Recognition with Web Textual Knowledge

Dec 03, 2022
Via arXiv

OCSampler: Compressing Videos to One Clip with Single-step Sampling

Jan 12, 2022
Via arXiv