Shuo Xin

OmniVLM: A Token-Compressed, Sub-Billion-Parameter Vision-Language Model for Efficient On-Device Inference

Dec 16, 2024
Via arXiv

Visual Object Tracking across Diverse Data Modalities: A Review

Dec 13, 2024
Via arXiv

Squid: Long Context as a New Modality for Energy-Efficient On-Device Language Models

Sep 03, 2024
Via arXiv

SDSTrack: Self-Distillation Symmetric Adapter Learning for Multi-Modal Visual Object Tracking

Mar 28, 2024
Via arXiv