Abstract: As the availability of imagery data continues to swell, so do the demands on transmission, storage, and processing power. The processing requirements to handle this plethora of data are quickly outpacing the utility of conventional processing techniques. Transitioning to quantum processing and algorithms, which offer promising efficiencies over conventional methods, can address some of these issues. However, to make this transformation possible, fundamental issues in implementing real-time quantum algorithms must be overcome for processes crucial to intelligent analysis applications. For example, edge detection tasks require time-consuming acquisition processes and are further hindered by the complexity of the devices used, limiting their feasibility for real-time applications. Convolution is another operation essential to signal and image processing applications, where the underlying mathematics consists of an intelligent mixture of multiplications and additions that demands considerable computational resources. This paper studies a new paired transform-based quantum representation and computation of one-dimensional and two-dimensional signal convolutions and gradients. A new visual data representation is defined that simplifies convolution calculations, making it feasible to parallelize convolution and gradient operations for more efficient performance. The new data representation is demonstrated on multiple illustrative examples of quantum edge detection, gradients, and convolution. Furthermore, the efficiency of the proposed approach is shown on real-world images.
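To make the multiply-add structure of convolution referred to above concrete, the following is a minimal classical sketch of direct 1-D convolution and a gradient obtained with a difference kernel. It illustrates only the conventional computation that the paired-transform quantum representation aims to accelerate; the function and kernel names are illustrative and not taken from the paper.

```python
# Illustrative only: direct (multiply-and-add) 1-D convolution and a gradient
# computed as convolution with a difference kernel. This is the classical
# baseline, not the paper's paired-transform quantum formulation.
import numpy as np

def conv1d(signal: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Full discrete convolution of a 1-D signal with a kernel."""
    n, m = len(signal), len(kernel)
    out = np.zeros(n + m - 1)
    for i in range(n):
        for j in range(m):
            out[i + j] += signal[i] * kernel[j]   # one multiplication, one addition per term
    return out

# Gradient as convolution with the simple difference kernel [1, -1].
x = np.array([3.0, 3.0, 5.0, 9.0, 9.0])
grad = conv1d(x, np.array([1.0, -1.0]))[1:-1]     # keep the valid central portion
print(grad)   # -> [0. 2. 4. 0.], the forward differences of x
```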
Abstract: Images captured in real-world applications such as remote sensing, image and video retrieval, and outdoor surveillance suffer degraded quality under poor weather conditions. Conditions such as rain and mist introduce artifacts that make visual analysis challenging and limit the performance of high-level computer vision methods. For time-critical applications where a rapid response is necessary, it becomes crucial to develop algorithms that automatically remove rain without diminishing the quality of the image contents. This article develops a novel quaternion multi-stage multiscale neural network with a self-attention module, called QSAM-Net, to remove rain streaks. The novelty of this algorithm is that it requires a factor of 3.98 fewer parameters than prior methods while improving visual quality, as demonstrated by extensive evaluation and benchmarking on synthetic and real-world rainy images. This property makes QSAM-Net suitable for deployment on edge devices and in applications requiring near real-time performance. The experiments further demonstrate that improving the visual quality of the images also improves object detection accuracy and training speed.
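For readers unfamiliar with the self-attention component mentioned above, the sketch below shows a minimal real-valued spatial self-attention block in PyTorch of the kind a multi-stage deraining network might embed. It is a simplified, hypothetical stand-in: QSAM-Net operates on quaternion feature maps at multiple scales, and none of the module names or sizes here come from the paper.

```python
# A minimal real-valued spatial self-attention block (illustrative only;
# not the quaternion, multiscale implementation described in the paper).
import torch
import torch.nn as nn

class SpatialSelfAttention(nn.Module):
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.norm = nn.GroupNorm(1, channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width) feature map from a deraining stage
        b, c, h, w = x.shape
        tokens = self.norm(x).flatten(2).transpose(1, 2)       # (b, h*w, c) spatial tokens
        attended, _ = self.attn(tokens, tokens, tokens)        # global spatial attention
        return x + attended.transpose(1, 2).reshape(b, c, h, w)  # residual connection

feat = torch.randn(2, 32, 16, 16)              # toy feature map
print(SpatialSelfAttention(32)(feat).shape)    # torch.Size([2, 32, 16, 16])
```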
Abstract: Predicting future sensory states is crucial for learning agents such as robots, drones, and autonomous vehicles. In this paper, we couple multiple sensory modalities with exploratory actions and propose a predictive neural network architecture to address this problem. Most existing approaches rely on large, manually annotated datasets or use only visual data as a single modality. In contrast, the unsupervised method presented here uses multi-modal perceptions to predict future visual frames. As a result, the proposed model is more comprehensive and better captures the spatio-temporal dynamics of the environment, leading to more accurate visual frame prediction. Another novelty of our framework is the use of sub-networks dedicated to anticipating future haptic, audio, and tactile signals. The framework was tested and validated on a dataset containing 4 sensory modalities (vision, haptic, audio, and tactile) collected by a humanoid robot performing 9 behaviors multiple times on a large set of objects. While visual information is the dominant modality, utilizing the additional non-visual modalities improves the accuracy of the predictions.
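As a schematic illustration of the idea of fusing visual and non-visual sensory encodings to predict the next visual frame, the sketch below outlines a toy multi-modal predictor in PyTorch. The module names, input sizes, and fusion scheme are assumptions made for illustration and do not describe the architecture proposed in the paper.

```python
# Schematic multi-modal next-frame predictor (illustrative assumptions only;
# not the paper's architecture or its dedicated per-modality sub-networks).
import torch
import torch.nn as nn

class MultiModalFramePredictor(nn.Module):
    def __init__(self, hidden: int = 128):
        super().__init__()
        self.vision_enc = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.haptic_enc = nn.Linear(6, hidden)     # e.g. joint torques (assumed size)
        self.audio_enc = nn.Linear(64, hidden)     # e.g. spectrogram slice (assumed size)
        self.tactile_enc = nn.Linear(32, hidden)   # e.g. pressure array (assumed size)
        self.fuse = nn.Linear(32 * 16 * 16 + 3 * hidden, 32 * 16 * 16)
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1))

    def forward(self, frame, haptic, audio, tactile):
        v = self.vision_enc(frame).flatten(1)                     # visual encoding
        nv = torch.cat([self.haptic_enc(haptic),
                        self.audio_enc(audio),
                        self.tactile_enc(tactile)], dim=1)        # non-visual encodings
        fused = self.fuse(torch.cat([v, nv], dim=1))              # fuse all modalities
        return self.decode(fused.view(-1, 32, 16, 16))            # predicted next frame

pred = MultiModalFramePredictor()(torch.randn(1, 3, 64, 64), torch.randn(1, 6),
                                  torch.randn(1, 64), torch.randn(1, 32))
print(pred.shape)  # torch.Size([1, 3, 64, 64])
```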