Yunhao Gou

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions

Sep 26, 2024
Via arXiv

Mixture of insighTful Experts: The Synergy of Thought Chains and Expert Mixtures in Self-Alignment

May 01, 2024
Via arXiv

Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation

Mar 22, 2024
Via arXiv

Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning

Dec 19, 2023
Via arXiv

Leveraging per Image-Token Consistency for Vision-Language Pre-training

Nov 20, 2022
Via arXiv

Exploring Hierarchical Graph Representation for Large-Scale Zero-Shot Image Classification

Mar 02, 2022
Via arXiv

Region Semantically Aligned Network for Zero-Shot Learning

Oct 14, 2021
Via arXiv