Abstract:The estimation of depth in two-dimensional images has long been a challenging and extensively studied subject in computer vision. Recently, significant progress has been made with the emergence of Deep Learning-based approaches, which have proven highly successful. This paper focuses on the explainability in monocular depth estimation methods, in terms of how humans perceive depth. This preliminary study emphasizes on one of the most significant visual cues, the relative size, which is prominent in almost all viewed images. We designed a specific experiment to mimic the experiments in humans and have tested state-of-the-art methods to indirectly assess the explainability in the context defined. In addition, we observed that measuring the accuracy required further attention and a particular approach is proposed to this end. The results show that a mean accuracy of around 77% across methods is achieved, with some of the methods performing markedly better, thus, indirectly revealing their corresponding potential to uncover monocular depth cues, like relative size.
Abstract:Cellular Automata (CA) have been considered one of the most pronounced parallel computational tools in the recent era of nature and bio-inspired computing. Taking advantage of their local connectivity, the simplicity of their design and their inherent parallelism, CA can be effectively applied to many image processing tasks. In this paper, a CA approach for efficient salt-n-pepper noise filtering in grayscale images is presented. Using a 2D Moore neighborhood, the classified "noisy" cells are corrected by averaging the non-noisy neighboring cells. While keeping the computational burden really low, the proposed approach succeeds in removing high-noise levels from various images and yields promising qualitative and quantitative results, compared to state-of-the-art techniques.
Abstract:Directional or Circular statistics are pertaining to the analysis and interpretation of directions or rotations. In this work, a novel probability distribution is proposed to model multidimensional sparse directional data. The Generalised Directional Laplacian Distribution (DLD) is a hybrid between the Laplacian distribution and the von Mises-Fisher distribution. The distribution's parameters are estimated using Maximum-Likelihood Estimation over a set of training data points. Mixtures of Directional Laplacian Distributions (MDLD) are also introduced in order to model multiple concentrations of sparse directional data. The author explores the application of the derived DLD mixture model to cluster sound sources that exist in an underdetermined instantaneous sound mixture. The proposed model can solve the general K x L (K<L) underdetermined instantaneous source separation problem, offering a fast and stable solution.
Abstract:Audio source separation is the task of isolating sound sources that are active simultaneously in a room captured by a set of microphones. Convolutive audio source separation of equal number of sources and microphones has a number of shortcomings including the complexity of frequency-domain ICA, the permutation ambiguity and the problem's scalabity with increasing number of sensors. In this paper, the authors propose a multiple-microphone audio source separation algorithm based on a previous work of Mitianoudis and Davies (2003). Complex FastICA is substituted by Robust ICA increasing robustness and performance. Permutation ambiguity is solved using two methodologies. The first is using the Likelihood Ration Jump solution, which is now modified to decrease computational complexity in the case of multiple microphones. The application of the MuSIC algorithm, as a preprocessing step to the previous solution, forms a second methodology with promising results.