Liangtao Shi

SwimVG: Step-wise Multimodal Fusion and Adaption for Visual Grounding

Feb 24, 2025

Adaptive Perception for Unified Visual Multi-modal Object Tracking

Feb 10, 2025

Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model

Nov 16, 2024

Sparse-Tuning: Adapting Vision Transformers with Efficient Fine-tuning and Inference

May 23, 2024

Autoregressive Queries for Adaptive Tracking with Spatio-Temporal Transformers

Mar 15, 2024

Explicit Visual Prompts for Visual Object Tracking

Jan 06, 2024