Abstract: With the increasing demand for audiovisual services, telecom service providers and application developers are compelled to ensure that their services provide the best possible user experience. Services such as videoconferencing are particularly sensitive to network conditions, so their performance should be monitored in real time in order to adapt service parameters to network perturbations. In this paper, we develop a parametric model for estimating the perceived audiovisual quality of videoconferencing services. The model is built on a nonlinear autoregressive exogenous (NARX) recurrent neural network and estimates perceived quality in terms of mean opinion score (MOS). We validate the model on the publicly available INRS bitstream audiovisual quality dataset, which contains bitstream parameters such as loss per frame, bit rate, and video duration. We compare the proposed model against state-of-the-art machine-learning-based methods and show that it outperforms them in terms of mean squared error (MSE = 0.150) and Pearson correlation coefficient (R = 0.931).
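
For readers unfamiliar with the two evaluation metrics named above, the following minimal Python sketch shows how MSE and the Pearson correlation coefficient could be computed between predicted and subjective MOS values. It is an illustration only, not the authors' evaluation code: the function names and the sample scores are hypothetical and are not taken from the INRS dataset.

```python
import math


def mean_squared_error(predicted, reference):
    """Average of squared differences between predicted and reference scores."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(predicted)


def pearson_correlation(predicted, reference):
    """Pearson correlation coefficient R between two score sequences."""
    n = len(predicted)
    if n != len(reference) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_p == 0 or var_r == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_r)


if __name__ == "__main__":
    # Hypothetical MOS values on a 1-5 scale, for illustration only.
    subjective_mos = [1.8, 2.5, 3.1, 3.9, 4.4]
    predicted_mos = [2.0, 2.4, 3.3, 3.7, 4.5]
    print("MSE:", mean_squared_error(predicted_mos, subjective_mos))
    print("R:", pearson_correlation(predicted_mos, subjective_mos))
```

A lower MSE indicates predictions closer to the subjective scores, while R closer to 1 indicates a stronger linear agreement between predicted and subjective MOS.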