Wayne Zhang

Revisiting the Integration of Convolution and Attention for Vision Backbone

Nov 21, 2024

GeoGround: A Unified Large Vision-Language Model for Remote Sensing Visual Grounding

Nov 16, 2024

Text4Seg: Reimagining Image Segmentation as Text Generation

Oct 13, 2024

ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation

Aug 09, 2024

ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference

Jul 17, 2024

Towards Vision-Language Geo-Foundation Model: A Survey

Jun 13, 2024

RelayAttention for Efficient Large Language Model Serving with Long System Prompts

Feb 29, 2024

Panoptic Video Scene Graph Generation

Nov 28, 2023

SmooSeg: Smoothness Prior for Unsupervised Semantic Segmentation

Oct 27, 2023

Diverse Cotraining Makes Strong Semi-Supervised Segmentor

Aug 18, 2023