Abstract: Video super-resolution (VSR) plays an important role in surveillance video analysis and ultra-high-definition video display, and has drawn much attention in both the research and industrial communities. Although many deep learning-based VSR methods have been proposed, it is hard to compare them directly, since differences in loss functions and training datasets have a significant impact on super-resolution results. In this work, we carefully study and compare three temporal modeling methods (a 2D CNN with early fusion, a 3D CNN with slow fusion, and a Recurrent Neural Network) for video super-resolution. We also propose a novel Recurrent Residual Network (RRN) for efficient video super-resolution, in which residual learning is used to stabilize RNN training and, at the same time, to boost super-resolution performance. Extensive experiments show that the proposed RRN is highly computationally efficient and produces temporally consistent VSR results with finer details than other temporal modeling methods. Moreover, the proposed method achieves state-of-the-art results on several widely used benchmarks.
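To make the recurrent residual idea concrete, below is a minimal PyTorch-style sketch of a recurrent residual VSR cell. All layer sizes, the hidden-state width, and the choice to feed the previous pre-shuffle output back at LR resolution are illustrative assumptions, not the authors' exact RRN architecture; the point is only to show identity shortcuts inside the recurrent cell stabilizing training.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ResidualBlock(nn.Module):
        def __init__(self, channels: int):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
            self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)

        def forward(self, x):
            # Identity shortcut: the block only learns a residual correction,
            # which keeps gradients well-behaved through the recurrence.
            return x + self.conv2(F.relu(self.conv1(x)))

    class RecurrentResidualCell(nn.Module):
        # Hypothetical hyperparameters (hid_ch, num_blocks) for illustration.
        def __init__(self, in_ch=3, hid_ch=64, scale=4, num_blocks=5):
            super().__init__()
            self.scale = scale
            # Fuse: previous + current LR frame, previous hidden state, and
            # the previous output kept at LR resolution (pre-pixel-shuffle).
            fused = 2 * in_ch + hid_ch + in_ch * scale * scale
            self.fuse = nn.Conv2d(fused, hid_ch, 3, padding=1)
            self.blocks = nn.Sequential(
                *[ResidualBlock(hid_ch) for _ in range(num_blocks)])
            self.to_hidden = nn.Conv2d(hid_ch, hid_ch, 3, padding=1)
            self.to_sr = nn.Conv2d(hid_ch, in_ch * scale * scale, 3, padding=1)

        def forward(self, lr_prev, lr_curr, hidden, out_prev):
            x = torch.cat([lr_prev, lr_curr, hidden, out_prev], dim=1)
            feat = self.blocks(F.relu(self.fuse(x)))
            hidden = F.relu(self.to_hidden(feat))
            out = self.to_sr(feat)  # LR-resolution output, fed back next step
            # Pixel-shuffle to HR and add a bicubic upsample of the current
            # frame, so the network predicts only the high-frequency residual.
            sr = F.pixel_shuffle(out, self.scale) + F.interpolate(
                lr_curr, scale_factor=self.scale, mode="bicubic",
                align_corners=False)
            return sr, hidden, out

For the first frame of a sequence, hidden and out_prev can be initialized to zeros of shape (B, hid_ch, H, W) and (B, in_ch * scale * scale, H, W), with lr_prev set to lr_curr.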
Abstract: Video-based person re-identification (re-id) has drawn much attention in recent years due to its prospective applications in video surveillance. Most existing methods concentrate on how to learn discriminative clip-level features. Moreover, clip-level data augmentation is also important, especially for the temporal aggregation task: inconsistent intra-clip augmentation breaks inter-frame alignment and thus introduces additional noise. To tackle the above-mentioned problems, we design a novel framework for video-based person re-id that consists of two main modules: Synchronized Transformation (ST) and Intra-clip Aggregation (ICA). The former augments all frames within a clip with the same probability and the same operation, while the latter leverages two-level intra-clip encoding to generate more discriminative clip-level features. To confirm the advantage of synchronized transformation, we conduct an ablation study with different synchronized transformation schemes. We also perform cross-dataset experiments to better understand the generality of our method. Extensive experiments on three benchmark datasets demonstrate that our framework outperforms most recent state-of-the-art methods.
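The synchronized-augmentation idea can be illustrated with a short sketch: random parameters are sampled once per clip and applied identically to every frame, so augmentation never breaks inter-frame alignment. The helper below is a hypothetical illustration (the function name, crop size, and flip probability are assumptions), not the paper's exact ST module.

    import random
    import torchvision.transforms.functional as TF

    def synchronized_transform(frames, flip_p=0.5, crop_size=(256, 128)):
        """frames: list of PIL images from one clip, all the same size
        and larger than crop_size (height, width)."""
        # Sample augmentation decisions ONCE for the whole clip.
        do_flip = random.random() < flip_p
        w, h = frames[0].size
        top = random.randint(0, h - crop_size[0])
        left = random.randint(0, w - crop_size[1])
        out = []
        for f in frames:
            # Every frame gets the same crop window and flip decision,
            # preserving spatial alignment across the clip.
            f = TF.crop(f, top, left, crop_size[0], crop_size[1])
            if do_flip:
                f = TF.hflip(f)
            out.append(f)
        return out

An unsynchronized baseline would instead re-sample do_flip, top, and left inside the loop, which is exactly the inconsistent intra-clip augmentation the abstract identifies as a source of noise.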