Visual instruction tuning is a key training stage of large multimodal models (LMMs). However, the common practice of indiscriminately mixing instruction-following data from various tasks can result in suboptimal overall performance, owing to the differing instruction formats and knowledge domains across tasks. To mitigate this issue, we propose a novel Comprehensive Task Balancing (CoTBal) algorithm for multi-task visual instruction tuning of LMMs. To our knowledge, this is the first work to explore multi-task optimization in visual instruction tuning. Specifically, we consider two key dimensions for task balancing: (1) Inter-Task Contribution, the phenomenon where learning one task can enhance performance on other tasks owing to overlapping knowledge domains, and (2) Intra-Task Difficulty, the learning difficulty within a single task. Quantifying these two dimensions with performance-based metrics, we perform task balancing by assigning greater weights to tasks that offer substantial contributions to others, receive minimal contributions from others, and exhibit high intra-task difficulty. Experiments show that CoTBal yields superior overall performance in multi-task visual instruction tuning.
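To make the weighting intuition concrete, the sketch below shows one way the two dimensions could be combined into per-task weights. It is a minimal illustration, not the paper's exact formulation: the contribution matrix `C`, difficulty vector `d`, the function name, and the specific combination rule are all hypothetical placeholders; the paper quantifies these dimensions with its own performance-based metrics.

```python
import numpy as np

def cotbal_style_weights(C: np.ndarray, d: np.ndarray) -> np.ndarray:
    """Illustrative combination of inter-task contribution and intra-task difficulty.

    C[i, j]: assumed contribution of training on task i to performance on task j.
    d[i]:    assumed learning difficulty of task i.
    """
    K = C.shape[0]
    off_diag = ~np.eye(K, dtype=bool)
    gives = np.where(off_diag, C, 0.0).sum(axis=1)     # contribution task i offers to others
    receives = np.where(off_diag, C, 0.0).sum(axis=0)  # contribution task i receives from others
    # Up-weight tasks that give much to others, receive little, and are hard to learn.
    raw = (1.0 + gives) / (1.0 + receives) * d
    return raw / raw.sum()  # normalize so task weights sum to 1

# Toy usage with 3 hypothetical tasks.
C = np.array([[1.0, 0.6, 0.1],
              [0.2, 1.0, 0.1],
              [0.1, 0.1, 1.0]])
d = np.array([0.5, 0.3, 0.9])
print(cotbal_style_weights(C, d))
```

The only property this sketch aims to capture is monotonicity: the resulting weight increases with contributions given and with difficulty, and decreases with contributions received, mirroring the balancing principle stated above.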