Learned wavelet video coders provide an explainable framework by performing discrete wavelet transforms in the temporal, horizontal, and vertical dimensions. With a temporal transform based on motion-compensated temporal filtering (MCTF), spatial and temporal scalability is obtained. In this paper, we introduce variable rate support and a mechanism for adapting quality to different temporal layers, which yields higher coding efficiency. Moreover, we propose a multi-stage training strategy that allows training with multiple temporal layers. Our experiments demonstrate Bjøntegaard Delta bitrate savings of at least 17% compared to a learned MCTF model without these extensions. Our method also outperforms other learned video coders such as DCVC-DC. Training and inference code is available at: https://github.com/FAU-LMS/Learned-pMCTF.