Qilang Ye

EMO-LLaMA: Enhancing Facial Emotion Understanding with Instruction Tuning

Aug 21, 2024
Via arXiv

Answering Diverse Questions via Text Attached with Key Audio-Visual Clues

Mar 11, 2024
Via arXiv

CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios

Mar 07, 2024
Via arXiv