Abstract: To open up new possibilities for assessing the multimodal perceptual quality of omnidirectional media formats, we propose a novel open-source 360 audiovisual (AV) quality dataset. The dataset consists of high-quality 360 video clips in equirectangular projection (ERP) format with higher-order (4th-order) ambisonic audio, along with the corresponding subjective scores. Three subjective quality experiments were conducted for audio, video, and AV content, following the procedures detailed in this paper. Using the data from these subjective tests, we demonstrate that the dataset can be used to quantify perceived audio, video, and audiovisual quality. The diversity and discriminability of the subjective scores are also analyzed. Finally, we investigate how the subjective scores in our dataset correlate with various objective audio and video quality metrics. The results suggest that the proposed dataset can benefit future studies on multimodal quality evaluation of 360 content.
Abstract: In an earlier study, we gathered perceptual evaluations of the audio, video, and audiovisual quality of 360 audiovisual content. This paper investigates the prediction of perceived audiovisual quality from objective quality metrics and subjective scores of 360 video and spatial audio content. Thirteen objective video quality metrics and three objective audio quality metrics were evaluated on five stimuli for each coding parameter. Four regression-based machine learning models were trained and tested: multiple linear regression, decision tree, random forest, and support vector machine. Each model was constructed from a combination of one audio and one video quality metric and evaluated with two cross-validation methods (k-Fold and Leave-One-Out), yielding 312 predictive models in total. The results indicate that the model based on VMAF and AMBIQUAL outperforms the other combinations of audio and video quality metrics. In this study, the support vector machine achieves the highest performance with k-Fold cross-validation (PCC = 0.909, SROCC = 0.914, and RMSE = 0.416). These results can provide insights for the design of multimedia quality metrics and the development of predictive models for audiovisual omnidirectional media.
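The evaluation protocol summarized above (fit a regressor on one video metric and one audio metric, then score out-of-fold predictions with PCC, SROCC, and RMSE) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the names `video_metric`, `audio_metric`, and `mos` are hypothetical stand-ins, the choice of k = 5 is an assumption, and only multiple linear regression is shown, using NumPy alone.

```python
import numpy as np


def pcc(x, y):
    """Pearson linear correlation coefficient."""
    return float(np.corrcoef(x, y)[0, 1])


def rank(values):
    """1-based ranks, with tied values sharing their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values))
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        mask = values == v
        ranks[mask] = ranks[mask].mean()
    return ranks


def srocc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pcc(rank(x), rank(y))


def rmse(y_true, y_pred):
    """Root-mean-square error between scores and predictions."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def cross_val_predict_linear(X, y, k, seed=0):
    """Out-of-fold predictions from multiple linear regression.

    Setting k == len(y) gives Leave-One-Out cross-validation.
    """
    idx = np.random.default_rng(seed).permutation(len(y))
    preds = np.empty(len(y))
    for test_idx in np.array_split(idx, k):
        train_idx = np.setdiff1d(idx, test_idx)
        # Design matrix with an intercept column, fitted by least squares.
        A_train = np.column_stack([np.ones(len(train_idx)), X[train_idx]])
        coef, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)
        A_test = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
        preds[test_idx] = A_test @ coef
    return preds


def make_synthetic(n=60, seed=42):
    """Synthetic stand-in data (assumption): NOT the dataset from the paper."""
    rng = np.random.default_rng(seed)
    video_metric = rng.uniform(20, 100, n)  # hypothetical video score range
    audio_metric = rng.uniform(0, 1, n)     # hypothetical audio score range
    mos = 1 + 0.03 * video_metric + audio_metric + rng.normal(0, 0.2, n)
    return np.column_stack([video_metric, audio_metric]), mos


X, mos = make_synthetic()
for name, k in [("k-Fold (k=5)", 5), ("Leave-One-Out", len(mos))]:
    pred = cross_val_predict_linear(X, mos, k)
    print(f"{name}: PCC={pcc(mos, pred):.3f}, "
          f"SROCC={srocc(mos, pred):.3f}, RMSE={rmse(mos, pred):.3f}")
```

Repeating such a loop over every pairing of the 13 video and 3 audio metrics, for each of the 4 regressors and 2 cross-validation methods, would give 13 × 3 × 4 × 2 = 312 models, which is consistent with the count reported above.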