Abstract:Despite recent advances in human pose estimation (HPE), poor generalization to out-of-distribution (OOD) data remains a difficult problem. While previous works have proposed Test-Time Adaptation (TTA) to bridge the train-test domain gap by refining network parameters at inference, the absence of ground-truth annotations makes it highly challenging and existing methods typically increase inference times by one or more orders of magnitude. We observe that 1) not every test time sample is OOD, and 2) HPE errors are significantly larger on distal keypoints (wrist, ankle). To this end, we propose ESCAPE: a lightweight correction and selective adaptation framework which applies a fast, forward-pass correction on most data while reserving costly TTA for OOD data. The free energy function is introduced to separate OOD samples from incoming data and a correction network is trained to estimate the errors of pretrained backbone HPE predictions on the distal keypoints. For OOD samples, we propose a novel self-consistency adaptation loss to update the correction network by leveraging the constraining relationship between distal keypoints and proximal keypoints (shoulders, hips), via a second ``reverse" network. ESCAPE improves the distal MPJPE of five popular HPE models by up to 7% on unseen data, achieves state-of-the-art results on two popular HPE benchmarks, and is significantly faster than existing adaptation methods.