Lingchen Meng

Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning

Dec 04, 2024
Via arXiv

V3Det Challenge 2024 on Vast Vocabulary and Open Vocabulary Object Detection: Methods and Results

Jun 17, 2024
Via arXiv

DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs

Jun 06, 2024
Via arXiv

To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning

Nov 29, 2023
Via arXiv

SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation

Nov 24, 2023
Via arXiv

Learning from Rich Semantics and Coarse Locations for Long-tailed Object Detection

Oct 18, 2023
Via arXiv

Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding

Jun 07, 2022
Via arXiv

AdaViT: Adaptive Vision Transformers for Efficient Image Recognition

Nov 30, 2021
Via arXiv

A Multimodal Framework for Video Ads Understanding

Aug 29, 2021
Via arXiv