Title: TOIST: Task Oriented Instance Segmentation Transformer with Noun-Pronoun Distillation

Oct 19, 2022

Pengfei Li, Beiwen Tian, Yongliang Shi, Xiaoxue Chen, Hao Zhao, Guyue Zhou, Ya-Qin Zhang

Abstract: Current referring expression comprehension algorithms can effectively detect or segment objects indicated by nouns, but how to understand verb reference is still under-explored. As such, we study the challenging problem of task oriented detection, which aims to find the objects that best afford an action indicated by verbs like "sit comfortably on". Towards a finer localization that better serves downstream applications like robot interaction, we extend the problem to task oriented instance segmentation. A unique requirement of this task is to select preferred candidates among possible alternatives. We therefore resort to the transformer architecture, which naturally models pair-wise query relationships with attention, leading to the TOIST method. To leverage pre-trained noun referring expression comprehension models and the fact that we can access privileged noun ground truth during training, we propose a novel noun-pronoun distillation framework. Noun prototypes are generated in an unsupervised manner, and contextual pronoun features are trained to select prototypes. As such, the network remains noun-agnostic during inference. We evaluate TOIST on the large-scale task oriented dataset COCO-Tasks and achieve +10.9% higher $\rm{mAP^{box}}$ than the best-reported results. The proposed noun-pronoun distillation can boost $\rm{mAP^{box}}$ and $\rm{mAP^{mask}}$ by +2.8% and +3.8%, respectively. Code and models are publicly available at https://github.com/AIR-DISCOVER/TOIST.

* Accepted by NeurIPS 2022. Codes are available at https://github.com/AIR-DISCOVER/TOIST
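
To make the prototype idea in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. The abstract only says that noun prototypes are generated in an unsupervised manner and that pronoun features are trained to select prototypes; the use of k-means, nearest-prototype selection, and a squared-distance loss are all assumptions made here for illustration, as are the function names `kmeans_prototypes`, `select_prototype`, and `distillation_loss`. See the repository linked above for the actual method.

```python
import numpy as np


def kmeans_prototypes(features, k, iters=20, seed=0):
    """Cluster noun features into k prototypes with plain k-means (an assumed choice)."""
    rng = np.random.default_rng(seed)
    # Initialise prototypes from k distinct feature vectors.
    init = rng.choice(len(features), size=k, replace=False)
    protos = features[init].copy()
    for _ in range(iters):
        # Squared distance from every feature to every prototype: shape (n, k).
        dists = ((features[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = features[assign == j]
            if len(members) > 0:  # leave an empty cluster's prototype unchanged
                protos[j] = members.mean(axis=0)
    return protos


def select_prototype(pronoun_feature, protos):
    """Return the index and vector of the prototype nearest to a pronoun feature."""
    dists = ((protos - pronoun_feature) ** 2).sum(axis=-1)
    idx = int(dists.argmin())
    return idx, protos[idx]


def distillation_loss(pronoun_feature, protos):
    """Mean squared distance to the selected prototype (hypothetical loss)."""
    _, nearest = select_prototype(pronoun_feature, protos)
    return float(((pronoun_feature - nearest) ** 2).mean())


# Synthetic stand-ins for noun text features; real features would come from
# a pre-trained text encoder, which is outside the scope of this sketch.
rng = np.random.default_rng(0)
noun_features = np.concatenate([
    rng.normal(0.0, 0.1, size=(50, 8)),
    rng.normal(5.0, 0.1, size=(50, 8)),
])
prototypes = kmeans_prototypes(noun_features, k=2)

pronoun_feature = rng.normal(5.0, 0.1, size=8)
index, _ = select_prototype(pronoun_feature, prototypes)
print("selected prototype:", index)
print("loss:", distillation_loss(pronoun_feature, prototypes))
```

In this sketch the prototypes are only needed while training: minimising the loss pulls the pronoun feature towards a noun prototype, so at inference time the pronoun feature can be used without any noun input, which mirrors the noun-agnostic property described in the abstract.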
