Shusheng Yang

Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs

Jun 24, 2024
Via arXiv

Qwen Technical Report

Sep 28, 2023
Via arXiv

Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond

Sep 14, 2023
Via arXiv

TouchStone: Evaluating Vision-Language Models by Language Models

Sep 04, 2023
Via arXiv

ViTMatte: Boosting Image Matting with Pretrained Plain Vision Transformers

May 24, 2023
Via arXiv

MobileInst: Video Instance Segmentation on the Mobile

Mar 30, 2023
Via arXiv

Masked Visual Reconstruction in Language Semantic Space

Jan 17, 2023
Via arXiv

Masked Image Modeling with Denoising Contrast

May 19, 2022
Via arXiv

Temporally Efficient Vision Transformer for Video Instance Segmentation

Apr 18, 2022
Via arXiv

Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection

Apr 06, 2022
Via arXiv