Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

BiHui Yu

mChartQA: A universal benchmark for multimodal Chart Question Answer based on Vision-Language Alignment and Reasoning

Apr 02, 2024

Jingxuan Wei, Nan Xu, Guiyong Chang, Yin Luo, BiHui Yu, Ruifeng Guo

Figure 1 for mChartQA: A universal benchmark for multimodal Chart Question Answer based on Vision-Language Alignment and Reasoning

Figure 2 for mChartQA: A universal benchmark for multimodal Chart Question Answer based on Vision-Language Alignment and Reasoning

Figure 3 for mChartQA: A universal benchmark for multimodal Chart Question Answer based on Vision-Language Alignment and Reasoning

Figure 4 for mChartQA: A universal benchmark for multimodal Chart Question Answer based on Vision-Language Alignment and Reasoning

Abstract:In the fields of computer vision and natural language processing, multimodal chart question-answering, especially involving color, structure, and textless charts, poses significant challenges. Traditional methods, which typically involve either direct multimodal processing or a table-to-text conversion followed by language model analysis, have limitations in effectively handling these complex scenarios. This paper introduces a novel multimodal chart question-answering model, specifically designed to address these intricate tasks. Our model integrates visual and linguistic processing, overcoming the constraints of existing methods. We adopt a dual-phase training approach: the initial phase focuses on aligning image and text representations, while the subsequent phase concentrates on optimizing the model's interpretative and analytical abilities in chart-related queries. This approach has demonstrated superior performance on multiple public datasets, particularly in handling color, structure, and textless chart questions, indicating its effectiveness in complex multimodal tasks.

Via

Access Paper or Ask Questions