Fatemeh Ghorbani Lohesara

Geometry-Aware Video Inpainting for Joint Headset Occlusion Removal and Face Reconstruction in Social XR

Aug 17, 2025

Fatemeh Ghorbani Lohesara, Karen Eguiazarian, Sebastian Knorr

Abstract: Head-mounted displays (HMDs) are essential for experiencing extended reality (XR) environments and observing virtual content. However, they obscure the upper part of the user's face, complicating external video recording and significantly impacting social XR applications such as teleconferencing, where facial expressions and eye gaze details are crucial for creating an immersive experience. This study introduces a geometry-aware learning-based framework to jointly remove HMD occlusions and reconstruct complete 3D facial geometry from RGB frames captured from a single viewpoint. The method integrates a GAN-based video inpainting network, guided by dense facial landmarks and a single occlusion-free reference frame, to restore missing facial regions while preserving identity. Subsequently, a SynergyNet-based module regresses 3D Morphable Model (3DMM) parameters from the inpainted frames, enabling accurate 3D face reconstruction. Dense landmark optimization is incorporated throughout the pipeline to improve both the inpainting quality and the fidelity of the recovered geometry. Experimental results demonstrate that the proposed framework can successfully remove HMDs from RGB facial videos while maintaining facial identity and realism, producing photorealistic 3D face geometry outputs. Ablation studies further show that the framework remains robust across different landmark densities, with only minor quality degradation under sparse landmark configurations.

* Journal of Electronic Imaging 34(5), 051003 (9 August 2025)
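
The 3DMM regression step mentioned in the abstract can be illustrated with the standard linear morphable-model formulation, in which a face mesh is the mean shape plus weighted identity and expression bases. The following is a minimal sketch of that general formulation only, not the paper's implementation: the function name, basis sizes, and random bases are hypothetical placeholders, and a real pipeline would load a fitted face model and take the coefficients from the regression network.

```python
import numpy as np

def reconstruct_3dmm(mean_shape, id_basis, exp_basis, alpha, beta):
    """Linear 3DMM: flattened vertices = mean + id_basis @ alpha + exp_basis @ beta."""
    flat = mean_shape + id_basis @ alpha + exp_basis @ beta
    return flat.reshape(-1, 3)  # one (x, y, z) row per vertex

# Hypothetical toy model: 5 vertices, 4 identity and 3 expression components.
rng = np.random.default_rng(0)
n_vertices, n_id, n_exp = 5, 4, 3
mean_shape = rng.normal(size=3 * n_vertices)
id_basis = rng.normal(size=(3 * n_vertices, n_id))
exp_basis = rng.normal(size=(3 * n_vertices, n_exp))

# Zero coefficients give back the mean face.
neutral = reconstruct_3dmm(mean_shape, id_basis, exp_basis,
                           np.zeros(n_id), np.zeros(n_exp))

# Non-zero coefficients (here random stand-ins for regressed parameters)
# deform the mean face along the basis directions.
deformed = reconstruct_3dmm(mean_shape, id_basis, exp_basis,
                            rng.normal(size=n_id), rng.normal(size=n_exp))
```

In this formulation, regressing a few dozen coefficients per frame is enough to recover a full mesh, which is why the quality of the inpainted input frame directly affects the recovered geometry.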

Headset: Human emotion awareness under partial occlusions multimodal dataset

Feb 14, 2024

Fatemeh Ghorbani Lohesara, Davi Rabbouni Freitas, Christine Guillemot, Karen Eguiazarian, Sebastian Knorr

Abstract: The volumetric representation of human interactions is one of the fundamental domains in the development of immersive media productions and telecommunication applications. Particularly in the context of the rapid advancement of Extended Reality (XR) applications, this volumetric data has proven to be an essential technology for future XR elaboration. In this work, we present a new multimodal database to help advance the development of immersive technologies. Our proposed database provides ethically compliant and diverse volumetric data, in particular 27 participants displaying posed facial expressions and subtle body movements while speaking, plus 11 participants wearing head-mounted displays (HMDs). The recording system consists of a volumetric capture (VoCap) studio, including 31 synchronized modules with 62 RGB cameras and 31 depth cameras. In addition to textured meshes, point clouds, and multi-view RGB-D data, we use one Lytro Illum camera to provide light field (LF) data simultaneously. Finally, we also provide an evaluation of the dataset's use in the tasks of facial expression classification, HMD removal, and point cloud reconstruction. The dataset can be helpful in the evaluation and performance testing of various XR algorithms, including but not limited to facial expression recognition and reconstruction, facial reenactment, and volumetric video. HEADSET and all its associated raw data and license agreement will be publicly available for research purposes.

* Accepted in ISMAR 2023 and published in IEEE Transactions on Visualization and Computer Graphics. Dataset: https://webpages.tuni.fi/headset

Towards Realistic Landmark-Guided Facial Video Inpainting Based on GANs

Feb 14, 2024

Fatemeh Ghorbani Lohesara, Karen Egiazarian, Sebastian Knorr

Abstract: Facial video inpainting plays a crucial role in a wide range of applications, including but not limited to the removal of obstructions in video conferencing and telemedicine, enhancement of facial expression analysis, privacy protection, integration of graphical overlays, and virtual makeup. This domain presents serious challenges due to the intricate nature of facial features and the inherent human familiarity with faces, heightening the need for accurate and persuasive completions. In addressing challenges specifically related to occlusion removal in this context, our focus is on the progressive task of generating complete images from facial data covered by masks, ensuring both spatial and temporal coherence. Our study introduces a network designed for expression-based video inpainting, employing generative adversarial networks (GANs) to handle static and moving occlusions across all frames. By utilizing facial landmarks and an occlusion-free reference image, our model maintains the user's identity consistently across frames. We further enhance emotional preservation through a customized facial expression recognition (FER) loss function, ensuring detailed inpainted outputs. Our proposed framework exhibits proficiency in eliminating occlusions from facial videos in an adaptive form, whether appearing static or dynamic on the frames, while providing realistic and coherent results.

* Accepted in Electronic Imaging 2024

Expression-aware video inpainting for HMD removal in XR applications

Jan 25, 2024

Fatemeh Ghorbani Lohesara, Karen Egiazarian, Sebastian Knorr

Abstract: Head-mounted displays (HMDs) serve as indispensable devices for observing extended reality (XR) environments and virtual content. However, HMDs present an obstacle to external recording techniques as they block the upper face of the user. This limitation significantly affects social XR applications, specifically teleconferencing, where facial features and eye gaze information play a vital role in creating an immersive user experience. In this study, we propose a new network for expression-aware video inpainting for HMD removal (EVI-HRnet) based on generative adversarial networks (GANs). Our model effectively fills in missing information with regard to facial landmarks and a single occlusion-free reference image of the user. The framework and its components ensure the preservation of the user's identity across frames using the reference frame. To further improve the level of realism of the inpainted output, we introduce a novel facial expression recognition (FER) loss function for emotion preservation. Our results demonstrate the remarkable capability of the proposed framework to remove HMDs from facial videos while maintaining the subject's facial expression and identity. Moreover, the outputs exhibit temporal consistency along the inpainted frames. This lightweight framework presents a practical approach for HMD occlusion removal, with the potential to enhance various collaborative XR applications without the need for additional hardware.

* Accepted in CVMP 2023
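
As a rough illustration of how a mask-based inpainting objective with an expression term might be assembled, the sketch below composites generator output into the occluded region and sums a masked reconstruction term, an adversarial term, and an FER term. Everything here is an assumption for illustration: the abstract does not specify the form of the FER loss, so an L1 distance between expression-classifier outputs is swapped in, and the function names and loss weights are hypothetical rather than taken from EVI-HRnet.

```python
import numpy as np

def composite(occluded, generated, mask):
    """Keep visible pixels from the input; use generator output only where mask == 1."""
    return mask * generated + (1.0 - mask) * occluded

def total_loss(pred, target, mask, fer_pred, fer_target, adv_term,
               w_adv=0.01, w_fer=0.1):
    """Weighted sum of masked L1 reconstruction, adversarial, and FER terms.

    The weights and the L1 form of the FER term are illustrative assumptions.
    """
    rec = np.abs(mask * (pred - target)).sum() / max(mask.sum(), 1.0)
    fer = np.abs(fer_pred - fer_target).mean()
    return rec + w_adv * adv_term + w_fer * fer

# Toy 4x4 grayscale frame whose top half is occluded (mask == 1).
frame = np.full((4, 4), 0.5)
mask = np.zeros((4, 4))
mask[:2, :] = 1.0
generated = np.ones((4, 4))  # stand-in for generator output

out = composite(frame, generated, mask)

# A perfect prediction with matching expression scores and no adversarial
# penalty yields zero loss under this formulation.
expr = np.array([0.7, 0.2, 0.1])  # hypothetical expression-class probabilities
loss_perfect = total_loss(out, out, mask, expr, expr, adv_term=0.0)
```

Compositing with the mask means the network is only responsible for the occluded region, so visible pixels pass through unchanged and identity cues from the lower face are preserved by construction.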
