Picture for Runhui Huang

Runhui Huang

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions

Add code
Sep 26, 2024
Figure 1 for EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
Figure 2 for EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
Figure 3 for EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
Figure 4 for EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
Viaarxiv icon

HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models

Add code
Jul 11, 2024
Viaarxiv icon

LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model

Add code
Mar 18, 2024
Viaarxiv icon

GrowCLIP: Data-aware Automatic Model Growing for Large-scale Contrastive Language-Image Pre-training

Add code
Aug 22, 2023
Viaarxiv icon

DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability

Add code
Aug 18, 2023
Viaarxiv icon

UniDiff: Advancing Vision-Language Models with Generative and Discriminative Learning

Add code
Jun 01, 2023
Viaarxiv icon

Boosting Visual-Language Models by Exploiting Hard Samples

Add code
May 09, 2023
Viaarxiv icon

NLIP: Noise-robust Language-Image Pre-training

Add code
Jan 04, 2023
Viaarxiv icon

P$^3$OVD: Fine-grained Visual-Text Prompt-Driven Self-Training for Open-Vocabulary Object Detection

Add code
Nov 02, 2022
Viaarxiv icon

Wukong: 100 Million Large-scale Chinese Cross-modal Pre-training Dataset and A Foundation Framework

Add code
Mar 10, 2022
Figure 1 for Wukong: 100 Million Large-scale Chinese Cross-modal Pre-training Dataset and A Foundation Framework
Figure 2 for Wukong: 100 Million Large-scale Chinese Cross-modal Pre-training Dataset and A Foundation Framework
Figure 3 for Wukong: 100 Million Large-scale Chinese Cross-modal Pre-training Dataset and A Foundation Framework
Figure 4 for Wukong: 100 Million Large-scale Chinese Cross-modal Pre-training Dataset and A Foundation Framework
Viaarxiv icon