Siwen Luo

Multimodal Commonsense Knowledge Distillation for Visual Question Answering

Nov 05, 2024
Via arXiv

'No' Matters: Out-of-Distribution Detection in Multimodality Long Dialogue

Oct 31, 2024
Via arXiv

3M-Health: Multimodal Multi-Teacher Knowledge Distillation for Mental Health Detection

Jul 12, 2024
Via arXiv

PDF-MVQA: A Dataset for Multimodal Information Retrieval in PDF-based Visual Question Answering

Apr 19, 2024
Via arXiv

Workshop on Document Intelligence Understanding

Jul 31, 2023
Via arXiv

PDFVQA: A New Dataset for Real-World VQA on Documents

Apr 24, 2023
Via arXiv

SceneGATE: Scene-Graph based co-Attention networks for TExt visual question answering

Dec 16, 2022
Via arXiv

PiggyBack: Pretrained Visual Question Answering Environment for Backing up Non-deep Learning Professionals

Dec 01, 2022
Via arXiv

Doc-GCN: Heterogeneous Graph Convolutional Networks for Document Layout Analysis

Aug 22, 2022
Via arXiv

Local Interpretations for Explainable Natural Language Processing: A Survey

Mar 20, 2021
Via arXiv