Abstract:Remote Photoplethysmography (rPPG) aims to measure physiological signals and Heart Rate (HR) from facial videos. Recent unsupervised rPPG estimation methods have shown promising potential in estimating rPPG signals from facial regions without relying on ground truth rPPG signals. However, these methods seem oblivious to interference existing in rPPG signals and still result in unsatisfactory performance. In this paper, we propose a novel De-interfered and Descriptive rPPG Estimation Network (DD-rPPGNet) to eliminate the interference within rPPG features for learning genuine rPPG signals. First, we investigate the characteristics of local spatial-temporal similarities of interference and design a novel unsupervised model to estimate the interference. Next, we propose an unsupervised de-interfered method to learn genuine rPPG signals with two stages. In the first stage, we estimate the initial rPPG signals by contrastive learning from both the training data and their augmented counterparts. In the second stage, we use the estimated interference features to derive de-interfered rPPG features and encourage the rPPG signals to be distinct from the interference. In addition, we propose an effective descriptive rPPG feature learning by developing a strong 3D Learnable Descriptive Convolution (3DLDC) to capture the subtle chrominance changes for enhancing rPPG estimation. Extensive experiments conducted on five rPPG benchmark datasets demonstrate that the proposed DD-rPPGNet outperforms previous unsupervised rPPG estimation methods and achieves competitive performances with state-of-the-art supervised rPPG methods.
Abstract:Many remote photoplethysmography (rPPG) estimation models have achieved promising performance on the training domain but often fail to measure the physiological signals or heart rates (HR) on test domains. Domain generalization (DG) or domain adaptation (DA) techniques are therefore adopted in the offline training stage to adapt the model to the unobserved or observed test domain by referring to all the available source domain data. However, in rPPG estimation problems, the adapted model usually confronts challenges of estimating target data with various domain information, such as different video capturing settings, individuals of different age ranges, or of different HR distributions. In contrast, Test-Time Adaptation (TTA), by online adapting to unlabeled target data without referring to any source data, enables the model to adaptively estimate rPPG signals of various unseen domains. In this paper, we first propose a novel TTA-rPPG benchmark, which encompasses various domain information and HR distributions, to simulate the challenges encountered in rPPG estimation. Next, we propose a novel synthetic signal-guided rPPG estimation framework with a two-fold purpose. First, we design an effective spectral-based entropy minimization to enforce the rPPG model to learn new target domain information. Second, we develop a synthetic signal-guided feature learning, by synthesizing pseudo rPPG signals as pseudo ground-truths to guide a conditional generator to generate latent rPPG features. The synthesized rPPG signals and the generated rPPG features are used to guide the rPPG model to broadly cover various HR distributions. Our extensive experiments on the TTA-rPPG benchmark show that the proposed method achieves superior performance and outperforms previous DG and DA methods across most protocols of the proposed TTA-rPPG benchmark.