Soyeon Caren Han

Multimodal Commonsense Knowledge Distillation for Visual Question Answering

Nov 05, 2024
Via arXiv

TriG-NER: Triplet-Grid Framework for Discontinuous Named Entity Recognition

Nov 04, 2024
Via arXiv

Multimodal Large Language Models and Tunings: Vision, Language, Sensors, Audio, and Beyond

Oct 08, 2024
Via arXiv

DAViD: Domain Adaptive Visually-Rich Document Understanding with Synthetic Insights

Oct 02, 2024
Via arXiv

MIDAS: Multi-level Intent, Domain, And Slot Knowledge Distillation for Multi-turn NLU

Aug 15, 2024
Via arXiv

MSG-Chart: Multimodal Scene Graph for ChartQA

Aug 09, 2024
Via arXiv

Deep Learning based Visually Rich Document Content Understanding: A Survey

Aug 02, 2024
Via arXiv

3M-Health: Multimodal Multi-Teacher Knowledge Distillation for Mental Health Detection

Jul 12, 2024
Via arXiv

3M: Multi-modal Multi-task Multi-teacher Learning for Game Event Detection

Jun 13, 2024
Via arXiv

PDF-MVQA: A Dataset for Multimodal Information Retrieval in PDF-based Visual Question Answering

Apr 19, 2024
Via arXiv