Humphrey Shi

OLA-VLM: Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation

Dec 12, 2024

GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models

Aug 29, 2024

Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders

Aug 28, 2024

Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation

Aug 01, 2024

Zero-Painter: Training-Free Layout Control for Text-to-Image Synthesis

Jun 06, 2024

Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment

Jun 06, 2024

CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts

May 09, 2024

UVMap-ID: A Controllable and Personalized UV Map Generative Model

Apr 22, 2024

OpenBias: Open-set Bias Detection in Text-to-Image Generative Models

Apr 11, 2024

Learning Trimaps via Clicks for Image Matting

Apr 06, 2024