DESTEC
Abstract:In this paper, we leverage the properties of non-Euclidean Geometry to define the Geodesic distance (GD) on the space of statistical manifolds. The Geodesic distance is a real and intuitive similarity measure that is a good alternative to the purely statistical and extensively used Kullback-Leibler divergence (KLD). Despite the effectiveness of the GD, a closed-form does not exist for many manifolds, since the geodesic equations are hard to solve. This explains that the major studies have been content to use numerical approximations. Nevertheless, most of those do not take account of the manifold properties, which leads to a loss of information and thus to low performances. We propose an approximation of the Geodesic distance through a graph-based method. This latter permits to well represent the structure of the statistical manifold, and respects its geometrical properties. Our main aim is to compare the graph-based approximation to the state of the art approximations. Thus, the proposed approach is evaluated for two statistical manifolds, namely the Weibull manifold and the Gamma manifold, considering the Content-Based Texture Retrieval application on different databases.
Abstract:This paper addresses the problem of blind stereoscopic image quality assessment (NR-SIQA) using a new multi-task deep learning based-method. In the field of stereoscopic vision, the information is fairly distributed between the left and right views as well as the binocular phenomenon. In this work, we propose to integrate these characteristics to estimate the quality of stereoscopic images without reference through a convolutional neural network. Our method is based on two main tasks: the first task predicts naturalness analysis based features adapted to stereo images, while the second task predicts the quality of such images. The former, so-called auxiliary task, aims to find more robust and relevant features to improve the quality prediction. To do this, we compute naturalness-based features using a Natural Scene Statistics (NSS) model in the complex wavelet domain. It allows to capture the statistical dependency between pairs of the stereoscopic images. Experiments are conducted on the well known LIVE PHASE I and LIVE PHASE II databases. The results obtained show the relevance of our method when comparing with those of the state-of-the-art. Our code is available online on https://github.com/Bourbia-Salima/multitask-cnn-nrsiqa_2021.
Abstract:Discovering content and stories in movies is one of the most important concepts in multimedia content research studies. Network models have proven to be an efficient choice for this purpose. When an audience watches a movie, they usually compare the characters and the relationships between them. For this reason, most of the models developed so far are based on social networks analysis. They focus essentially on the characters at play. By analyzing characters' interactions, we can obtain a broad picture of the narration's content. Other works have proposed to exploit semantic elements such as scenes, dialogues, etc. However, they are always captured from a single facet. Motivated by these limitations, we introduce in this work a multilayer network model to capture the narration of a movie based on its script, its subtitles, and the movie content. After introducing the model and the extraction process from the raw data, we perform a comparative analysis of the whole 6-movie cycle of the Star Wars saga. Results demonstrate the effectiveness of the proposed framework for video content representation and analysis.
Abstract:With the recent advances in complex networks theory, graph-based techniques for image segmentation has attracted great attention recently. In order to segment the image into meaningful connected components, this paper proposes an image segmentation general framework using complex networks based community detection algorithms. If we consider regions as communities, using community detection algorithms directly can lead to an over-segmented image. To address this problem, we start by splitting the image into small regions using an initial segmentation. The obtained regions are used for building the complex network. To produce meaningful connected components and detect homogeneous communities, some combinations of color and texture based features are employed in order to quantify the regions similarities. To sum up, the network of regions is constructed adaptively to avoid many small regions in the image, and then, community detection algorithms are applied on the resulting adaptive similarity matrix to obtain the final segmented image. Experiments are conducted on Berkeley Segmentation Dataset and four of the most influential community detection algorithms are tested. Experimental results have shown that the proposed general framework increases the segmentation performances compared to some existing methods.
Abstract:Network models have been increasingly used in the past years to support summarization and analysis of narratives, such as famous TV series, books and news. Inspired by social network analysis, most of these models focus on the characters at play. The network model well captures all characters interactions, giving a broad picture of the narration's content. A few works went beyond by introducing additional semantic elements, always captured in a single layer network. In contrast, we introduce in this work a multilayer network model to capture more elements of the narration of a movie from its script: people, locations, and other semantic elements. This model enables new measures and insights on movies. We demonstrate this model on two very popular movies.
Abstract:This paper deals with a reduced reference (RR) image quality measure based on natural image statistics modeling. For this purpose, Tetrolet transform is used since it provides a convenient way to capture local geometric structures. This transform is applied to both reference and distorted images. Then, Gaussian Scale Mixture (GSM) is proposed to model subbands in order to take account statistical dependencies between tetrolet coefficients. In order to quantify the visual degradation, a measure based on Kullback Leibler Divergence (KLD) is provided. The proposed measure was tested on the Cornell VCL A-57 dataset and compared with other measures according to FR-TV1 VQEG framework.
Abstract:This paper deals with color image quality assessment in the reduced-reference framework based on natural scenes statistics. In this context, we propose to model the statistics of the steerable pyramid coefficients by a Multivariate Generalized Gaussian distribution (MGGD). This model allows taking into account the high correlation between the components of the RGB color space. For each selected scale and orientation, we extract a parameter matrix from the three color components subbands. In order to quantify the visual degradation, we use a closed-form of Kullback-Leibler Divergence (KLD) between two MGGDs. Using "TID 2008" benchmark, the proposed measure has been compared with the most influential methods according to the FRTV1 VQEG framework. Results demonstrates its effectiveness for a great variety of distortion type. Among other benefits this measure uses only very little information about the original image.
Abstract:Color distortion can introduce a significant damage in visual quality perception, however, most of existing reduced-reference quality measures are designed for grayscale images. In this paper, we consider a basic extension of well-known image-statistics based quality assessment measures to color images. In order to evaluate the impact of color information on the measures efficiency, two color spaces are investigated: RGB and CIELAB. Results of an extensive evaluation using TID 2013 benchmark demonstrates that significant improvement can be achieved for a great number of distortion type when the CIELAB color representation is used.
Abstract:Although color is a fundamental feature of human visual perception, it has been largely unexplored in the reduced-reference (RR) image quality assessment (IQA) schemes. In this paper, we propose a natural scene statistic (NSS) method, which efficiently uses this information. It is based on the statistical deviation between the steerable pyramid coefficients of the reference color image and the degraded one. We propose and analyze the multivariate generalized Gaussian distribution (MGGD) to model the underlying statistics. In order to quantify the degradation, we develop and evaluate two measures based respectively on the Geodesic distance between two MGGDs and on the closed-form of the Kullback Leibler divergence. We performed an extensive evaluation of both metrics in various color spaces (RGB, HSV, CIELAB and YCrCb) using the TID 2008 benchmark and the FRTV Phase I validation process. Experimental results demonstrate the effectiveness of the proposed framework to achieve a good consistency with human visual perception. Furthermore, the best configuration is obtained with CIELAB color space associated to KLD deviation measure.
Abstract:In this paper, we introduce a Reduced Reference Image Quality Assessment (RRIQA) measure based on the natural image statistic approach. A new adaptive transform called "Tetrolet" is applied to both reference and distorted images. To model the marginal distribution of tetrolet coefficients Bessel K Forms (BKF) density is proposed. Estimating the parameters of this distribution allows to summarize the reference image with a small amount of side information. Five distortion measures based on the BKF parameters of the original and processed image are used to predict quality scores. A comparison between these measures is presented showing a good consistency with human judgment.