Abstract:HTTP adaptive streaming (HAS) has emerged as a widely adopted approach for over-the-top (OTT) video streaming services, due to its ability to deliver a seamless streaming experience. A key component of HAS is the bitrate ladder, which provides the encoding parameters (e.g., bitrate-resolution pairs) to encode the source video. The representations in the bitrate ladder allow the client's player to dynamically adjust the quality of the video stream based on network conditions by selecting the most appropriate representation from the bitrate ladder. The most straightforward and lowest complexity approach involves using a fixed bitrate ladder for all videos, consisting of pre-determined bitrate-resolution pairs known as one-size-fits-all. Conversely, the most reliable technique relies on intensively encoding all resolutions over a wide range of bitrates to build the convex hull, thereby optimizing the bitrate ladder for each specific video. Several techniques have been proposed to predict content-based ladders without performing a costly exhaustive search encoding. This paper provides a comprehensive review of various methods, including both conventional and learning-based approaches. Furthermore, we conduct a benchmark study focusing exclusively on various learning-based approaches for predicting content-optimized bitrate ladders across multiple codec settings. The considered methods are evaluated on our proposed large-scale dataset, which includes 300 UHD video shots encoded with software and hardware encoders using three state-of-the-art encoders, including AVC/H.264, HEVC/H.265, and VVC/H.266, at various bitrate points. Our analysis provides baseline methods and insights, which will be valuable for future research in the field of bitrate ladder prediction. The source code of the proposed benchmark and the dataset will be made publicly available upon acceptance of the paper.
Abstract:Deep neural network (DNN) based approaches have been intensively studied to improve video quality thanks to their fast advancement in recent years. These approaches are designed mainly for desktop devices due to their high computational cost. However, with the increasing performance of mobile devices in recent years, it became possible to execute DNN based approaches in mobile devices. Despite having the required computational power, utilizing DNNs to improve the video quality for mobile devices is still an active research area. In this paper, we propose an open-source mobile platform, namely MoViDNN, to evaluate DNN based video quality enhancement methods, such as super-resolution, denoising, and deblocking. Our proposed platform can be used to evaluate the DNN based approaches both objectively and subjectively. For objective evaluation, we report common metrics such as execution time, PSNR, and SSIM. For subjective evaluation, Mean Score Opinion (MOS) is reported. The proposed platform is available publicly at https://github.com/cd-athena/MoViDNN