Mengyang Sun

Pruning All-Rounder: Rethinking and Improving Inference Efficiency for Large Vision Language Models

Dec 09, 2024
Via arXiv

Visual Prompt Selection for In-Context Learning Segmentation

Jul 14, 2024
Via arXiv

S3C: Semi-Supervised VQA Natural Language Explanation via Self-Critical Learning

Sep 05, 2023
Via arXiv

Dual-Level Decoupled Transformer for Video Captioning

May 06, 2022
Via arXiv

Proposal-free One-stage Referring Expression via Grid-Word Cross-Attention

May 05, 2021
Via arXiv