Ultra-wideband (UWB) devices are widely used for indoor localization. Single-anchor UWB localization is attractive because it requires a simpler system setup than conventional methods based on two-way ranging (TWR) and trilateration. In this work, we focus on single-anchor UWB localization methods that use a Gaussian mixture model (GMM) to learn statistical features of the channel impulse response (CIR) in different location areas. We show that learning the joint distribution of the amplitudes of different delay components yields a more accurate location estimate than modeling each delay bin independently. Moreover, we develop a similarity metric between sets of CIRs. This set-based similarity metric further improves the estimation performance over treating each snapshot separately. We demonstrate the advantages of the proposed methods in multiple application scenarios.
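
As a rough illustration of why joint modeling of delay bins can help, the following Python sketch compares a joint and an independent model on synthetic data. It is a toy under stated assumptions, not the method of this work: each area is modeled by a single Gaussian (the one-component special case of a GMM), the feature vector has only two delay bins, and the two hypothetical areas "A" and "B" differ only in the correlation between the bins. The set score used here, a sum of per-snapshot log-likelihoods that assumes independent snapshots, is a stand-in and not the similarity metric proposed in this work. All function and variable names are illustrative.

```python
import math
import random

def fit_gaussian_2d(samples, diagonal=False):
    """Fit mean and covariance of 2-D samples; drop the cross term if diagonal."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mx) ** 2 for s in samples) / n
    syy = sum((s[1] - my) ** 2 for s in samples) / n
    sxy = 0.0 if diagonal else sum((s[0] - mx) * (s[1] - my) for s in samples) / n
    return (mx, my, sxx, syy, sxy)

def log_pdf(model, x):
    """Log-density of a bivariate Gaussian at point x."""
    mx, my, sxx, syy, sxy = model
    det = sxx * syy - sxy ** 2
    dx, dy = x[0] - mx, x[1] - my
    quad = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * quad - 0.5 * math.log(det) - math.log(2.0 * math.pi)

def sample_area(rng, rho, n, mean=5.0):
    """Draw n synthetic amplitude pairs with unit variance and correlation rho."""
    out = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        out.append((mean + z1, mean + rho * z1 + math.sqrt(1.0 - rho ** 2) * z2))
    return out

def classify(models, snapshots):
    """Pick the area with the highest summed log-likelihood over the set
    (assumption: snapshots are independent; not the paper's metric)."""
    scores = {a: sum(log_pdf(m, x) for x in snapshots) for a, m in models.items()}
    return max(scores, key=scores.get)

def accuracy(models, labelled_sets):
    """Fraction of labelled snapshot sets assigned to the correct area."""
    hits = sum(classify(models, s) == area for area, s in labelled_sets)
    return hits / len(labelled_sets)

rng = random.Random(0)
RHO = {"A": 0.8, "B": -0.8}  # assumed correlations of the two synthetic areas

train = {a: sample_area(rng, r, 2000) for a, r in RHO.items()}
joint = {a: fit_gaussian_2d(s) for a, s in train.items()}
indep = {a: fit_gaussian_2d(s, diagonal=True) for a, s in train.items()}

def make_test(set_size, trials=200):
    """Build labelled test sets, each holding set_size snapshots from one area."""
    return [(a, sample_area(rng, r, set_size)) for a, r in RHO.items()
            for _ in range(trials)]

single = make_test(1)   # one snapshot per decision
sets = make_test(10)    # ten snapshots per decision

acc_indep_single = accuracy(indep, single)
acc_joint_single = accuracy(joint, single)
acc_joint_set = accuracy(joint, sets)

print("independent bins, single snapshot:", acc_indep_single)
print("joint model, single snapshot:     ", acc_joint_single)
print("joint model, set of 10 snapshots: ", acc_joint_set)
```

Because the two synthetic areas share the same per-bin means and variances, the independent model has little to distinguish them, whereas the joint model can exploit the sign of the correlation. Accumulating evidence over a set of snapshots then makes the decision more reliable in this toy setting.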