Qihang Fan

Breaking the Low-Rank Dilemma of Linear Attention

Nov 14, 2024
Via arXiv

Vision Transformer with Sparse Scan Prior

May 22, 2024
Via arXiv

Semantic Equitable Clustering: A Simple, Fast and Effective Strategy for Vision Transformer

May 22, 2024
Via arXiv

Band-Attention Modulated RetNet for Face Forgery Detection

Apr 09, 2024
Via arXiv

ViTAR: Vision Transformer with Any Resolution

Mar 28, 2024
Via arXiv

Video-Teller: Enhancing Cross-Modal Generation with Fusion and Decoupling

Oct 11, 2023
Via arXiv

Video-CSR: Complex Video Digest Creation for Visual-Language Models

Oct 08, 2023
Via arXiv

RMT: Retentive Networks Meet Vision Transformers

Sep 20, 2023
Via arXiv

Lightweight Vision Transformer with Bidirectional Interaction

Jun 01, 2023
Via arXiv

Rethinking Local Perception in Lightweight Vision Transformer

Apr 03, 2023
Via arXiv