Title: Visual-Semantic Graph Attention Network for Human-Object Interaction Detection

Jan 07, 2020

Zhijun Liang, Yisheng Guan, Juan Rojas

Abstract: In scene understanding, machines benefit not only from detecting individual scene instances but also from learning their possible interactions. Human-Object Interaction (HOI) detection tries to infer the predicate in a <subject, predicate, object> triplet. Contextual information has been found to be critical in inferring interactions. However, most works use features from single object instances that have a direct relation with the subject. Few works have studied the disambiguating contribution of subsidiary relations or how attention might leverage them for inference. We contribute a dual-graph attention network that dynamically aggregates contextual visual, spatial, and semantic information for primary subject-object relations as well as subsidiary relations. Graph attention networks dynamically leverage node neighborhood information. Our network uses attention to first leverage visual-spatial and semantic cues from primary and subsidiary relations independently and then combines them before a final readout step. Our network learns to use primary and subsidiary relations to improve inference, encouraging the right interpretations and discouraging incorrect ones. We call our model Visual-Semantic Graph Attention Networks (VS-GATs). We surpass state-of-the-art HOI detection mAPs on the challenging HICO-DET dataset, including in long-tail cases that are harder to interpret. Code, video, and supplementary information are available at http://www.juanrojas.net/VSGAT.

* 10 pages, 3 figures, 2 tables
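
The abstract's idea of attending over a node's neighborhood in two graphs and then combining the results can be illustrated with a small sketch. This is not the authors' VS-GAT implementation: the dot-product scoring, the toy feature vectors, and the names `attention_aggregate` and `combine` are all assumptions made for illustration, and no learned parameters are involved.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(features, neighbors, node):
    """Attention-weighted sum of a node's neighbor features.

    Scores are plain dot products between the node and each neighbor
    (an assumption; a trained model would learn its scoring function).
    Returns the aggregated vector and the per-neighbor weights.
    """
    query = features[node]
    nbrs = neighbors[node]
    scores = [sum(q * k for q, k in zip(query, features[n])) for n in nbrs]
    weights = softmax(scores)
    aggregated = [
        sum(w * features[n][d] for w, n in zip(weights, nbrs))
        for d in range(len(query))
    ]
    return aggregated, dict(zip(nbrs, weights))


def combine(visual_vec, semantic_vec):
    """Combine the two graph outputs by concatenation (an assumption)."""
    return visual_vec + semantic_vec


# Hypothetical toy scene: one person, one related object, one unrelated object.
visual_feats = {"person": [1.0, 0.0], "bicycle": [0.9, 0.1], "bench": [0.0, 1.0]}
semantic_feats = {"person": [0.5, 0.5], "bicycle": [0.6, 0.4], "bench": [0.1, 0.9]}
neighbors = {"person": ["bicycle", "bench"]}

# Attend independently in each graph, then combine before any readout step.
visual_out, visual_weights = attention_aggregate(visual_feats, neighbors, "person")
semantic_out, semantic_weights = attention_aggregate(semantic_feats, neighbors, "person")
combined = combine(visual_out, semantic_out)

print(visual_weights)
print(combined)
```

In this toy setup the bicycle receives a larger visual attention weight than the bench because its feature vector is closer to the person's, which mirrors in miniature how attention can emphasize some relations over others.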
