Title: EgoChoir: Capturing 3D Human-Object Interaction Regions from Egocentric Views

May 22, 2024

Yuhang Yang, Wei Zhai, Chengfeng Wang, Chengjun Yu, Yang Cao, Zheng-Jun Zha

Abstract: Understanding egocentric human-object interaction (HOI) is a fundamental aspect of human-centric perception, facilitating applications such as AR/VR and embodied AI. For egocentric HOI, in addition to perceiving semantics (e.g., "what" interaction is occurring), capturing "where" the interaction manifests in 3D space is also crucial, since it links perception to operation. Existing methods primarily leverage observations of HOI to capture interaction regions from an exocentric view. However, incomplete observations of the interacting parties in the egocentric view introduce ambiguity between visual observations and interaction contents, impairing the efficacy of these methods. From the egocentric view, humans integrate the visual cortex, cerebellum, and brain to internalize their intentions and the interaction concepts of objects, allowing them to pre-formulate interactions and act even when interaction regions are out of sight. In light of this, we propose harmonizing visual appearance, head motion, and the 3D object to excavate the object interaction concept and subject intention, jointly inferring 3D human contact and object affordance from egocentric videos. To achieve this, we present EgoChoir, which links object structures with the interaction contexts inherent in appearance and head motion to reveal object affordance, and further uses that affordance to model human contact. Additionally, gradient modulation is employed to adopt appropriate clues for capturing interaction regions across various egocentric scenarios. Moreover, 3D contact and affordance are annotated for egocentric videos collected from Ego-Exo4D and GIMO to support the task. Extensive experiments on these datasets demonstrate the effectiveness and superiority of EgoChoir. Code and data will be released openly.

* 23 pages, 10 figures
