Tsinghua University, Peking University
Abstract: We introduce a video compression framework designed to serve machine learning applications in the age of ubiquitous video data. Unlike traditional compression methods that prioritize human visual perception, our approach focuses on preserving the semantic information critical to deep learning accuracy while efficiently reducing data size. The framework operates on a batch basis and can handle multiple video streams simultaneously, which improves scalability and processing efficiency. It offers two reconstruction modes: a lightweight mode for real-time applications requiring swift responses, and a high-precision mode for scenarios where accuracy is crucial. Built on a purpose-designed deep learning algorithm, it separates essential information from redundancy so that machine learning tasks are fed the most relevant data. Experimental results on diverse datasets, including urban surveillance and autonomous vehicle navigation, show that DMVC maintains or improves machine learning task accuracy while achieving significant data compression. This work paves the way for smarter, scalable video analysis systems, with potential applications ranging from smart city infrastructure to autonomous systems, and provides a reference point for integrating video compression with machine learning.
Abstract: Mobile deep vision systems play a vital role in numerous scenarios. However, deep learning applications in mobile vision scenarios face problems such as limited computing resources. With the development of edge computing, edge-cloud architectures have mitigated some of the issues related to limited computing resources, but they have also introduced additional latency. To address these challenges, we designed CloudEye, which consists of a Fast Inference Module, a Feature Mining Module, and a Quality Encode Module. CloudEye is a real-time, efficient mobile visual perception system that leverages content information mining on edge servers, in a mobile vision environment where edge servers are coordinated with cloud servers. We develop a prototype system, and experiments show that it reduces network bandwidth usage by 69.50%, increases inference speed by 24.55%, and improves detection accuracy by 67.30%.
Abstract: Single-pixel imaging (SPI), which uses a single-pixel detector, is an unconventional imaging method with promising applications in many fields that need high-performance imaging. In particular, the recently proposed catadioptric panoramic ghost imaging (CPGI) extends the application potential of SPI to high-performance imaging over a wide field of view (FOV), for which demand is growing. However, the resolution of CPGI is limited by the hardware parameters of the digital micromirror device (DMD), which may not meet the needs of ultrahigh-resolution panoramic imaging that requires detailed information. To overcome the resolution limitation of CPGI, we propose a panoramic SPI based on rotational subdivision (RSPSI). The key to the proposed RSPSI is to capture the entire panoramic scene by rotation scanning with a rotating mirror tilted at 45°, so that a single pattern covering only one small sub-FOV can modulate the entire panoramic FOV without interruption during one pass of pattern projection. Then, based on temporal resolution subdivision, an image sequence of sub-FOVs subdivided from the entire panoramic FOV can be reconstructed, with adjacent sub-FOVs shifted horizontally at the pixel or even subpixel level. Experimental results from a proof-of-concept setup show that a panoramic image of 10428 × 543 (5,662,404) pixels can be obtained, which is more than 9.6 times the resolution limit of CPGI using the same DMD. To the best of our knowledge, RSPSI is the first method to achieve megapixel resolution via SPI, and it has potential applications in fields requiring imaging with ultrahigh resolution and a wide FOV.
Abstract: Fourier single-pixel imaging (FSI) is a data-efficient single-pixel imaging (SPI) method. However, obtaining higher imaging quality with fewer measurements remains a serious challenge, which limits the development of real-time SPI. In this work, a uniform-sampling foveated FSI (UFFSI) is proposed with three features (uniform sampling, effective sampling, and a flexible fovea) to achieve under-sampled, high-efficiency, and high-quality SPI, even in a large-scale scene. First, by flexibly using the three proposed foveated pattern structures, high resolution (HR) is required only on regions of interest (ROIs), which significantly reduces data redundancy and the total amount of data needed. Next, through non-uniform weight distribution processing, non-uniform spatial sampling is transformed into uniform sampling, so that the fast Fourier transform can be applied accurately and directly to obtain high imaging quality under under-sampling with further reduced measurements. Experimentally, at a sampling ratio of 0.0084 relative to HR FSI with 1024 × 768 pixels, UFFSI with 255 × 341 cells (an 89% reduction in data redundancy) gives the ROI significantly better imaging quality, meeting imaging needs. We hope this work can provide a breakthrough for future real-time SPI.
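The figures quoted in the UFFSI abstract above can be checked with a few lines of arithmetic. This is only a sketch: it assumes that the 89% reduction compares the foveated cell count with the HR pixel count, and that the sampling ratio means measurements divided by HR pixels. Neither definition is spelled out in the abstract, and all variable names are illustrative.

```python
# Arithmetic check of the UFFSI figures (interpretation is an assumption).
hr_pixels = 1024 * 768          # pixels in the high-resolution FSI reference
foveated_cells = 255 * 341      # cells in the foveated pattern structure

# Assumed definition: fraction of HR unknowns removed by foveation.
redundancy_reduction = 1 - foveated_cells / hr_pixels

# Assumed definition: sampling ratio = measurements / HR pixels.
sampling_ratio = 0.0084
approx_measurements = sampling_ratio * hr_pixels

print(f"cells / pixels        : {foveated_cells} / {hr_pixels}")
print(f"redundancy reduction  : {redundancy_reduction:.1%}")
print(f"approx. measurements  : {approx_measurements:.0f}")
```

Under these assumptions the cell count is about 11% of the HR pixel count, which is consistent with the 89% figure stated in the abstract.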
Abstract: Conventional computational ghost imaging (CGI) illuminates the object with light carrying a sequence of uniform-resolution patterns, then performs a correlation calculation between the light intensity values reflected by the target and the preset patterns to obtain the object image. It requires a large number of measurements to obtain high-quality images, especially high-resolution ones. To solve this problem, we developed temporally variable-resolution illumination patterns, replacing the conventional uniform-resolution illumination patterns with a sequence of patterns of different imaging resolutions. In addition, we propose combining temporally variable-resolution illumination patterns with a spatially variable-resolution structure to develop temporally and spatially variable-resolution (TSV) illumination patterns, which improve both the imaging quality of the region of interest (ROI) and the robustness to noise. The methods using the proposed illumination patterns are verified by simulations and experiments and compared with CGI. For the same number of measurements, the method using temporally variable-resolution illumination patterns has better imaging quality than CGI, but it is less robust to noise. The method using TSV illumination patterns has better imaging quality in the ROI than both the method using temporally variable-resolution illumination patterns and CGI under the same number of measurements. We also experimentally verify that the method using TSV patterns has better imaging performance when applied to higher-resolution imaging. The proposed methods are expected to address the difficulty that current computational ghost imaging has in achieving high-resolution, high-quality imaging.
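The correlation calculation that the abstract above attributes to conventional CGI can be sketched as follows. This is a minimal noise-free simulation using the standard second-order correlation G = <I·P> - <I><P> with uniform-resolution random patterns. It does not implement the temporally or spatially variable-resolution patterns, and the test object, pattern count, and function names are assumptions made for illustration.

```python
import numpy as np

def simulate_bucket(patterns, obj):
    """Noise-free bucket signal: total reflected intensity per pattern."""
    return np.tensordot(patterns, obj, axes=([1, 2], [0, 1]))

def cgi_reconstruct(patterns, bucket):
    """Correlation reconstruction G = <I*P> - <I><P> over M measurements."""
    patterns = np.asarray(patterns, dtype=float)   # shape (M, H, W)
    bucket = np.asarray(bucket, dtype=float)       # shape (M,)
    mean_ip = np.tensordot(bucket, patterns, axes=(0, 0)) / len(bucket)
    return mean_ip - bucket.mean() * patterns.mean(axis=0)

rng = np.random.default_rng(0)

# Hypothetical binary test object (not from the paper).
obj = np.zeros((16, 16))
obj[4:12, 6:10] = 1.0

# Uniform-resolution random illumination patterns, as in conventional CGI.
patterns = rng.random((5000, 16, 16))
bucket = simulate_bucket(patterns, obj)
image = cgi_reconstruct(patterns, bucket)

# Similarity between reconstruction and ground truth.
corr = np.corrcoef(image.ravel(), obj.ravel())[0, 1]
print(f"correlation with ground truth: {corr:.3f}")
```

Lowering the number of patterns in this sketch degrades the reconstruction, which illustrates the measurement cost the abstract sets out to reduce.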
Abstract: Ghost imaging (GI) is a novel imaging method that reconstructs object information from light intensity correlation measurements. At present, however, the field of view (FOV) is limited to the illuminating range of the light patterns. To enlarge the FOV of GI efficiently, we propose the omnidirectional ghost imaging system (OGIS), which achieves a 360° omnidirectional FOV in one shot simply by adding a curved mirror. Moreover, by designing retina-like annular patterns with a log-polar structure, OGIS can obtain unwrapping-free, undistorted panoramic images with uniform resolution, which opens up a new way of applying GI.
Abstract: Ghost imaging (GI) reconstructs images using a single-pixel or bucket detector, and has the advantages of scattering robustness, a wide spectrum, and beyond-visual-field imaging. However, this technique needs a large number of measurements to obtain a sharp image, and many methods have been proposed to overcome this disadvantage. Retina-like patterns, as one of the compressive sensing approaches, enhance the imaging quality of the region of interest (ROI) without increasing the number of measurements. The design of the retina-like patterns determines the performance of the ROI in the reconstructed image. Unlike the conventional method, which fills the ROI with random patterns, we propose to optimize retina-like patterns by filling the ROI with patterns containing the sparsity prior of objects. The proposed method is verified by simulations and experiments and compared with conventional GI, retina-like GI, and GI using patterns optimized by principal component analysis. The method using optimized retina-like patterns obtains the best imaging quality in the ROI among these methods. The good generalization ability of the optimized retina-like pattern is also verified. When designing the size and position of the ROI of the retina-like pattern, feature information of the target can be obtained to optimize the pattern of the ROI. The proposed method paves the way for realizing high-quality GI.
Abstract: Single-pixel imaging, with the advantages of a wide spectrum, beyond-visual-field imaging, and robustness to light scattering, has attracted increasing attention in recent years. Fourier single-pixel imaging (FSI) can reconstruct sharp images under sub-Nyquist sampling. However, conventional FSI has difficulty balancing imaging quality and efficiency. To overcome this issue, we propose a novel approach called complementary Fourier single-pixel imaging (CFSI), which reduces the number of measurements while retaining robustness. The complementary nature of Fourier patterns based on a four-step phase-shift algorithm is combined with the complementary nature of a digital micromirror device. CFSI requires only two phase-shifted patterns to obtain one Fourier spectral value. Four light intensity values are obtained by loading the two patterns, and the spectral value is calculated through differential measurement, which is robust to noise. The proposed method is verified by simulations and experiments and compared with FSI based on two-, three-, and four-step phase-shift algorithms. CFSI performed better than the other methods in the regime where its best imaging quality had not yet been reached. The reported technique provides an alternative approach to realizing real-time, high-quality imaging.
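The idea of recovering one Fourier coefficient from two loaded patterns can be sketched as follows. The sketch relies on the standard four-step phase-shift formula, (D_0 - D_π) + j(D_{π/2} - D_{3π/2}), and on the identity that the complement of a pattern 0.5 + 0.5·cos(θ + φ) equals the pattern with phase φ + π. It assumes that the complementary intensity is available from the DMD for each loaded pattern (for example from its second reflection direction). That assumption, the grayscale patterns, and all function names are illustrative and are not taken from the paper.

```python
import numpy as np

def fourier_pattern(h, w, fy, fx, phase):
    """Grayscale Fourier basis pattern in [0, 1] with the given phase shift."""
    y, x = np.mgrid[0:h, 0:w]
    theta = 2 * np.pi * (fx * x / w + fy * y / h) + phase
    return 0.5 + 0.5 * np.cos(theta)

def coefficient_from_two_patterns(obj, fy, fx):
    """One Fourier coefficient from two loaded patterns (phases 0 and pi/2).

    Assumption: each loaded pattern yields two intensities, one for the
    pattern and one for its complement 1 - P, which equals the pattern
    with an extra phase shift of pi.
    """
    h, w = obj.shape
    intensities = []
    for phase in (0.0, np.pi / 2):
        p = fourier_pattern(h, w, fy, fx, phase)
        intensities.append(np.sum(obj * p))          # pattern itself
        intensities.append(np.sum(obj * (1.0 - p)))  # complement (assumed)
    d_0, d_pi, d_half, d_3half = intensities
    # Differential measurement cancels the constant (DC) background term.
    return (d_0 - d_pi) + 1j * (d_half - d_3half)

rng = np.random.default_rng(1)
obj = rng.random((8, 8))  # hypothetical test scene

spectrum = np.array([[coefficient_from_two_patterns(obj, fy, fx)
                      for fx in range(8)] for fy in range(8)])

# With contrast 0.5 the differential result equals the 2-D DFT of the scene.
max_error = np.max(np.abs(spectrum - np.fft.fft2(obj)))
print(f"max deviation from np.fft.fft2: {max_error:.2e}")
```

In this noise-free setting the acquired spectrum matches the discrete Fourier transform of the scene, so an inverse FFT of a partially sampled spectrum would give the under-sampled reconstruction.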