Yunhang Shen

Scale Contrastive Learning with Selective Attentions for Blind Image Quality Assessment

Nov 13, 2024
Via arXiv

VITA: Towards Open-Source Interactive Omni Multimodal LLM

Aug 09, 2024
Via arXiv

HUWSOD: Holistic Self-training for Unified Weakly Supervised Object Detection

Jun 27, 2024
Via arXiv

VEGA: Learning Interleaved Image-Text Comprehension in Vision-Language Large Models

Jun 14, 2024
Via arXiv

Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

May 31, 2024
Via arXiv

Cantor: Inspiring Multimodal Chain-of-Thought of MLLM

Apr 24, 2024
Via arXiv

Multi-Modal Prompt Learning on Blind Image Quality Assessment

Apr 23, 2024
Via arXiv

Fusion-Mamba for Cross-modality Object Detection

Apr 14, 2024
Via arXiv

A General and Efficient Training for Transformer via Token Expansion

Mar 31, 2024
Via arXiv

Feature Denoising Diffusion Model for Blind Image Quality Assessment

Jan 22, 2024
Via arXiv