Qiushan Guo

SpatialRGPT: Grounded Spatial Reasoning in Vision Language Model

Jun 03, 2024

Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots

May 13, 2024

RegionGPT: Towards Region Understanding Vision Language Model

Mar 04, 2024

RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths

May 29, 2023

EGC: Image Generation and Classification via a Diffusion Energy-Based Model

Apr 13, 2023

Multi-Level Contrastive Learning for Dense Prediction Task

Apr 04, 2023

Rethinking Resolution in the Context of Efficient Video Recognition

Sep 26, 2022

Scale-Equivalent Distillation for Semi-Supervised Object Detection

Mar 26, 2022

MSFD: Multi-Scale Receptive Field Face Detector

Mar 11, 2019