Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Qiankun Tang

Exploring Spatial-Temporal Multi-Frequency Analysis for High-Fidelity and Temporal-Consistency Video Prediction

Feb 23, 2020

Beibei Jin, Yu Hu, Qiankun Tang, Jingyu Niu, Zhiping Shi, Yinhe Han, Xiaowei Li

Figure 1 for Exploring Spatial-Temporal Multi-Frequency Analysis for High-Fidelity and Temporal-Consistency Video Prediction

Figure 2 for Exploring Spatial-Temporal Multi-Frequency Analysis for High-Fidelity and Temporal-Consistency Video Prediction

Figure 3 for Exploring Spatial-Temporal Multi-Frequency Analysis for High-Fidelity and Temporal-Consistency Video Prediction

Figure 4 for Exploring Spatial-Temporal Multi-Frequency Analysis for High-Fidelity and Temporal-Consistency Video Prediction

Abstract:Video prediction is a pixel-wise dense prediction task to infer future frames based on past frames. Missing appearance details and motion blur are still two major problems for current predictive models, which lead to image distortion and temporal inconsistency. In this paper, we point out the necessity of exploring multi-frequency analysis to deal with the two problems. Inspired by the frequency band decomposition characteristic of Human Vision System (HVS), we propose a video prediction network based on multi-level wavelet analysis to deal with spatial and temporal information in a unified manner. Specifically, the multi-level spatial discrete wavelet transform decomposes each video frame into anisotropic sub-bands with multiple frequencies, helping to enrich structural information and reserve fine details. On the other hand, multi-level temporal discrete wavelet transform which operates on time axis decomposes the frame sequence into sub-band groups of different frequencies to accurately capture multi-frequency motions under a fixed frame rate. Extensive experiments on diverse datasets demonstrate that our model shows significant improvements on fidelity and temporal consistency over state-of-the-art works.

* Submitted to a conference, under review

Via

Access Paper or Ask Questions

PosNeg-Balanced Anchors with Aligned Features for Single-Shot Object Detection

Aug 09, 2019

Qiankun Tang, Shice Liu, Jie Li, Yu Hu

Figure 1 for PosNeg-Balanced Anchors with Aligned Features for Single-Shot Object Detection

Figure 2 for PosNeg-Balanced Anchors with Aligned Features for Single-Shot Object Detection

Figure 3 for PosNeg-Balanced Anchors with Aligned Features for Single-Shot Object Detection

Figure 4 for PosNeg-Balanced Anchors with Aligned Features for Single-Shot Object Detection

Abstract:We introduce a novel single-shot object detector to ease the imbalance of foreground-background class by suppressing the easy negatives while increasing the positives. To achieve this, we propose an Anchor Promotion Module (APM) which predicts the probability of each anchor as positive and adjusts their initial locations and shapes to promote both the quality and quantity of positive anchors. In addition, we design an efficient Feature Alignment Module (FAM) to extract aligned features for fitting the promoted anchors with the help of both the location and shape transformation information from the APM. We assemble the two proposed modules to the backbone of VGG-16 and ResNet-101 network with an encoder-decoder architecture. Extensive experiments on MS COCO well demonstrate our model performs competitively with alternative methods (40.0\% mAP on \textit{test-dev} set) and runs faster (28.6 \textit{fps}).

* Submitted to a conference, under review

Via

Access Paper or Ask Questions