Shen Yan

LoD-Loc: Aerial Visual Localization using LoD 3D Map with Neural Wireframe Alignment

Oct 16, 2024
Via arXiv

Streaming Dense Video Captioning

Apr 01, 2024
Via arXiv

VideoPrism: A Foundational Visual Encoder for Video Understanding

Feb 20, 2024
Via arXiv

PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter

Feb 16, 2024
Via arXiv

UAVD4L: A Large-Scale Dataset for UAV 6-DoF Localization

Jan 11, 2024
Via arXiv

Efficient Large Language Models: A Survey

Dec 23, 2023
Via arXiv

Pixel Aligned Language Models

Dec 14, 2023
Via arXiv

UnLoc: A Unified Framework for Video Localization Tasks

Aug 21, 2023
Via arXiv

AutoTaskFormer: Searching Vision Transformers for Multi-task Learning

Apr 20, 2023
Via arXiv

Long-term Visual Localization with Mobile Sensors

Apr 16, 2023
Via arXiv