Size Wu

F-LMM: Grounding Frozen Large Multimodal Models

Jun 09, 2024
Via arXiv

OMG-Seg: Is One Model Good Enough For All Segmentation?

Jan 18, 2024
Via arXiv

CLIM: Contrastive Language-Image Mosaic for Region Representation

Dec 19, 2023
Via arXiv

CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction

Oct 02, 2023
Via arXiv

DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection

Oct 02, 2023
Via arXiv

Aligning Bag of Regions for Open-Vocabulary Object Detection

Feb 27, 2023
Via arXiv

Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images

Sep 13, 2021
Via arXiv