
Jihong Hu

Multitask and Multimodal Neural Tuning for Large Models

Aug 06, 2024

Hao Sun, Yu Song, Jihong Hu, Yen-Wei Chen, Lanfen Lin


Abstract: In recent years, large-scale multimodal models have demonstrated impressive capabilities across various domains. However, enabling these models to perform multiple multimodal tasks simultaneously remains a significant challenge. To address this, we introduce a novel tuning method called neural tuning, designed to handle diverse multimodal tasks concurrently, including reasoning segmentation, referring segmentation, image captioning, and text-to-image generation. Neural tuning emulates the sparse distributed representation of the human brain, where only specific subsets of neurons are activated for each task. Additionally, we present a new benchmark, MMUD, in which each sample is annotated with multiple task labels. By applying neural tuning to pretrained large models on the MMUD benchmark, we achieve simultaneous task handling in a streamlined and efficient manner. All models, code, and datasets will be made publicly available after publication to facilitate further research and development in this field.
