Nian Xie

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions

Sep 26, 2024
Via arXiv

DCQA: Document-Level Chart Question Answering towards Complex Reasoning and Common-Sense Understanding

Oct 29, 2023
Via arXiv

Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding

Dec 19, 2022
Via arXiv