Abstract: Explicit feature-grid-based NeRF models have shown promising results in rendering quality and significant speed-ups in training. However, these methods often require a large amount of data to represent a single scene or object. In this work, we present a compression model that minimizes entropy in the frequency domain in order to reduce the data size effectively. First, we propose applying the discrete cosine transform (DCT) to tensorial radiance fields to compress the feature-grid. The feature-grid is transformed into coefficients, which are then quantized and entropy encoded, following an approach similar to the traditional video coding pipeline. Furthermore, to achieve a higher level of sparsity, we propose an entropy parameterization technique for the frequency domain, specifically for the DCT coefficients of the feature-grid. Since the transformed coefficients are optimized during the training phase, the proposed model does not require any fine-tuning or additional information. Our model requires only a lightweight compression pipeline for encoding and decoding, making it easier to apply volumetric radiance field methods in real-world applications. Experimental results demonstrate that the proposed frequency-domain entropy model achieves superior compression performance across various datasets. The source code will be made publicly available.
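The transform, quantize, and entropy-code pipeline described in this abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. It assumes a 1-D orthonormal DCT applied to a hypothetical smooth feature slice, a uniform quantizer with an arbitrarily chosen step size, and an empirical Shannon entropy estimate standing in for an actual entropy coder. The learned entropy parameterization from the abstract is omitted, and all function names are illustrative.

```python
import math
from collections import Counter


def dct2(x):
    """Orthonormal 1-D DCT-II of a list of floats."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct2(c):
    """Inverse of dct2 (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def quantize(coeffs, step):
    """Uniform scalar quantization to integer symbols."""
    return [round(c / step) for c in coeffs]


def dequantize(symbols, step):
    """Map integer symbols back to coefficient values."""
    return [s * step for s in symbols]


def entropy_bits(symbols):
    """Empirical Shannon entropy in bits per symbol (a proxy for coded size)."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical smooth 1-D slice of a feature grid (assumption, not real data).
features = [math.sin(2 * math.pi * i / 32) for i in range(32)]
step = 0.25  # assumed quantization step

symbols = quantize(dct2(features), step)      # encoder side
recon = idct2(dequantize(symbols, step))      # decoder side
max_err = max(abs(a - b) for a, b in zip(features, recon))

print("bits per symbol:", entropy_bits(symbols))
print("max reconstruction error:", max_err)
```

In this sketch a larger `step` maps more coefficients to the same symbol, which lowers the entropy estimate at the cost of a larger reconstruction error. The abstract's method instead optimizes the coefficients during training so that their entropy is low by construction.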
Abstract: The user experience in adaptive HTTP streaming relies on offering bitrate ladders with suitable operation points for all users, which typically involves multiple resolutions. While open GOP coding structures are generally known to provide a substantial coding efficiency benefit, their use in HTTP streaming has been precluded by the lack of support for reference picture resampling (RPR) in AVC and HEVC. The newly emerging Versatile Video Coding (VVC) standard supports RPR, but primarily conversational scenarios were investigated during the design of VVC. This paper aims to enable the use of RPR in HTTP streaming scenarios by analysing the drift potential of VVC coding tools and presenting a constrained encoding method that avoids severe drift artefacts when switching resolution with open GOP coding in VVC. In typical live streaming configurations, the presented method achieves BD-rate reductions of up to 8.7% compared to closed GOP coding, while in a typical Video on Demand configuration, BD-rate reductions of up to 2.4% are reported. The penalty of the constraints compared to regular open GOP coding is 0.53% BD-rate in the worst case. The presented method will be integrated into the publicly available open source VVC encoder VVenC v0.3.