Weidi Xie

LoRKD: Low-Rank Knowledge Decomposition for Medical Foundation Models

Sep 29, 2024
Via arXiv

Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos

Aug 26, 2024
Via arXiv

Can Visual Foundation Models Achieve Long-term Point Tracking?

Aug 24, 2024
Via arXiv

Towards Evaluating and Building Versatile Large Language Models for Medicine

Aug 22, 2024
Via arXiv

AutoRG-Brain: Grounded Report Generation for Brain MRI

Jul 26, 2024
Via arXiv

AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description

Jul 22, 2024
Via arXiv

EchoSight: Advancing Visual-Language Models with Wiki Knowledge

Jul 17, 2024
Via arXiv

VISA: Reasoning Video Object Segmentation via Large Language Models

Jul 16, 2024
Via arXiv

A Sanity Check for AI-generated Image Detection

Jun 27, 2024
Via arXiv

MatchTime: Towards Automatic Soccer Game Commentary Generation

Jun 26, 2024
Via arXiv