Abstract:Since 2015, video dimensionality has expanded to higher spatial and temporal resolutions and a wider colour gamut. This High Dynamic Range (HDR) content has gained traction in the consumer space as it delivers an enhanced quality of experience. At the same time, the complexity of codecs is growing. This has driven the development of tools for content-adaptive optimisation that achieve optimal rate-distortion performance for HDR video at 4K resolution. While improvements of just a few percentage points in BD-Rate (1-5\%) are significant for the streaming media industry, the impact on subjective quality has been less studied, especially for HDR/AV1. In this paper, we conduct a subjective quality assessment (42 subjects) of 4K HDR content with a per-clip optimisation strategy. We correlate these subjective scores with existing popular objective metrics used in standard development and show that some perceptual metrics correlate surprisingly well even though they are not tuned for HDR. We find that the DSCQS protocol is too insensitive to categorically compare the methods, but the data allow us to make recommendations about the use of experts vs non-experts in HDR studies and to explain the subjective impact of film grain in HDR content under compression.
Abstract:Over the past few years, there has been an increase in the demand for, and availability of, High Dynamic Range (HDR) displays and content. To ensure the production of high-quality materials, human evaluation is required. However, ascertaining whether the full playback pipeline is indeed HDR-compliant can be challenging. In this paper, we present a set of recommendations for conformance testing to validate various aspects of the testing workflow, including playback, displays, brightness, colours, and viewing environment. We assess the effectiveness of HDR conversion techniques used in current standards development (3GPP) for making source materials. Additionally, we evaluate HDR display technologies, including OLED and LCD, using both a consumer television and a reference monitor.
Abstract:Cloud-based deployment of content production and broadcast workflows has continued to disrupt the industry after the pandemic. The key tools required for unlocking cloud workflows, e.g., transcoding, metadata parsing, and streaming playback, are increasingly commoditized. However, as video traffic continues to increase, there is a need to consider tools which offer opportunities for further bitrate/quality gains as well as those which facilitate cloud deployment. In this paper, we consider preprocessing, rate/distortion optimisation and cloud cost prediction tools which are only just emerging from the research community. These tools are presented as part of the per-clip optimisation approach to transcoding, which has been adopted by large streaming media processing entities but has yet to be made more widely available to the industry.
Abstract:The complexity of modern codecs, along with the increased need to deliver high-quality videos at low bitrates, has reinforced the idea of per-clip tailoring of parameters for optimised rate-distortion performance. While the objective quality metrics used for Standard Dynamic Range (SDR) videos have been well studied, the transition of consumer displays to support High Dynamic Range (HDR) videos poses a new challenge to rate-distortion optimisation. In this paper, we review the popular HDR metrics DeltaE100 (DE100), PSNRL100, wPSNR, and HDR-VQM. We measure the impact of employing these metrics in per-clip direct search optimisation of the rate-distortion Lagrange multiplier in AV1. We report, on 35 HDR videos, average Bjontegaard Delta Rate (BD-Rate) gains of 4.675%, 2.226%, and 7.253% in terms of DE100, PSNRL100, and HDR-VQM, respectively. We also show that the inclusion of chroma in the quality metrics has a significant impact on optimisation, which can only be partially addressed by the use of chroma offsets.
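For readers unfamiliar with the BD-Rate figures quoted above, the following is a minimal sketch of the commonly used Bjontegaard calculation, not the implementation used in the paper. It assumes four or more rate/quality points per curve, a cubic fit of log-bitrate against quality, and a quality metric where higher is better; the function name `bd_rate` and its arguments are illustrative. In this sketch a negative result means the test encoder needs less bitrate for the same quality.

```python
import numpy as np


def bd_rate(rate_ref, qual_ref, rate_test, qual_test):
    """Average bitrate difference (%) of test vs reference at equal quality."""
    q_ref = np.asarray(qual_ref, dtype=float)
    q_test = np.asarray(qual_test, dtype=float)
    log_ref = np.log(np.asarray(rate_ref, dtype=float))
    log_test = np.log(np.asarray(rate_test, dtype=float))

    # Fit log-rate as a cubic polynomial of quality for each curve.
    p_ref = np.polyfit(q_ref, log_ref, 3)
    p_test = np.polyfit(q_test, log_test, 3)

    # Integrate both fits over the quality range the curves share.
    lo = max(q_ref.min(), q_test.min())
    hi = min(q_ref.max(), q_test.max())
    i_ref = np.polyint(p_ref)
    i_test = np.polyint(p_test)
    area_ref = np.polyval(i_ref, hi) - np.polyval(i_ref, lo)
    area_test = np.polyval(i_test, hi) - np.polyval(i_test, lo)

    # Mean log-rate difference, converted back to a percentage.
    avg_diff = (area_test - area_ref) / (hi - lo)
    return (np.exp(avg_diff) - 1.0) * 100.0


if __name__ == "__main__":
    # Synthetic points for illustration only (not measured data):
    # the test curve uses 10% less bitrate at every quality level.
    quality = [32.0, 35.0, 38.0, 41.0]
    ref_rates = [1000.0, 2000.0, 4000.0, 8000.0]
    test_rates = [0.9 * r for r in ref_rates]
    print(bd_rate(ref_rates, quality, test_rates, quality))
```

With the synthetic curves in the usage example, the result should come out close to -10%, since the test curve is a uniform 10% bitrate reduction.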
Abstract:This study examines the relationship between H.264 video compression and the performance of an object detection network (YOLOv5). We curated a set of 50 surveillance videos and annotated targets of interest (people, bikes, and vehicles). Videos were encoded at 5 quality levels using Constant Rate Factor (CRF) values in the set {22,32,37,42,47}. YOLOv5 was applied to compressed videos and detection performance was analyzed at each CRF level. Test results indicate that the detection performance is generally robust to moderate levels of compression; using a CRF value of 37 instead of 22 leads to significantly reduced bitrates/file sizes without adversely affecting detection performance. However, detection performance degrades appreciably at higher compression levels, especially in complex scenes with poor lighting and fast-moving targets. Finally, retraining YOLOv5 on compressed imagery gives up to a 1% improvement in F1 score when applied to highly compressed footage.
Abstract:Since the adoption of VP9 by Netflix in 2016, royalty-free coding standards have continued to gain prominence through the activities of the AOMedia consortium. AV1, the latest open-source standard, is now widely supported. In the early years after standardisation, HDR video tends to be underserved in open-source encoders for a variety of reasons, including the relatively small amount of true HDR content being broadcast and the challenges in RD optimisation with that material. AV1 codec optimisation has been ongoing since 2020, including consideration of the computational load. In this paper, we explore the idea of direct optimisation of the Lagrangian $\lambda$ parameter used in the rate control of the encoders to estimate the optimal Rate-Distortion trade-off achievable for a High Dynamic Range signalled video clip. We show that by adjusting the Lagrange multiplier in the RD optimisation process on a frame-hierarchy basis, we are able to increase the Bjontegaard difference rate gains by more than 3.98$\times$ on average without visually affecting the quality.
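As a reminder of the quantity being tuned, the standard Lagrangian rate-distortion cost and one way of writing a per-clip adjustment are sketched below. The scaling factor $k_\ell$ and the hierarchy-level index $\ell$ are notation assumed here for illustration and are not taken from the paper.

$$
J = D + \lambda R, \qquad \lambda'_\ell = k_\ell \, \lambda_\ell, \qquad k^{*} = \arg\min_{k} \ \text{BD-Rate}(k)
$$

Here $D$ is the distortion, $R$ the rate, $\lambda_\ell$ the encoder's default multiplier for frames at hierarchy level $\ell$, and $k = (k_1, k_2, \dots)$ the vector of per-level factors searched for each clip; $k_\ell = 1$ recovers the default encoder behaviour.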
Abstract:Video transcoding is an increasingly important application in the streaming media industry. It has become important to investigate the optimisation of transcoder parameters for a single clip simply because of the immense number of playbacks for popular clips. In this paper, we explore the use of a canned optimiser to estimate the optimal RD tradeoff achievable for a particular clip. We show that by adjusting the Lagrange multiplier in RD optimisation on keyframes alone, we can achieve more than 10$\times$ the BD-Rate gains previously possible without affecting quality at any operating point.
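To make the idea of a per-clip search over the Lagrange multiplier concrete, here is a minimal sketch using a hand-written golden-section search. This is a swapped-in, generic one-dimensional optimiser and not necessarily the canned optimiser used in the paper. The objective `toy_bd_rate` is a hypothetical, synthetic stand-in for "encode the clip with the multiplier scaled by `k` and return the BD-Rate against the default encoder"; a real objective would run the encoder, and is assumed here to be unimodal over the search interval.

```python
import math


def golden_section_search(objective, lo, hi, tol=1e-3, max_evals=50):
    """Minimise a unimodal 1-D objective on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = objective(c), objective(d)
    evals = 2
    while (b - a) > tol and evals < max_evals:
        if fc < fd:
            # Minimum lies in [a, d]: shrink from the right, reuse c as new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = objective(c)
        else:
            # Minimum lies in [c, b]: shrink from the left, reuse d as new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = objective(d)
        evals += 1
    return (a + b) / 2.0


def toy_bd_rate(k):
    """Hypothetical stand-in objective (synthetic, not measured data)."""
    return (k - 1.7) ** 2 - 3.0


if __name__ == "__main__":
    best_k = golden_section_search(toy_bd_rate, 0.5, 4.0)
    print(f"best multiplier scale: {best_k:.3f}")
```

Each objective evaluation in a real setting costs several encodes of the clip, which is why the sketch caps the number of evaluations with `max_evals`.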