Abstract: Deep neural network (DNN) architectures hold many important records in computer vision. Although they draw attention worldwide, the design of their overall structure still lacks general guidance. Building on the relationship between DNN design and numerical differential equations, which several researchers have observed in recent years, we perform a fair comparison of residual designs from a higher-order perspective. We show that the widely used DNN design strategy of repeatedly stacking a small block can be easily improved, with solid theoretical support and no extra parameters. We reorganize the residual design in higher-order ways, inspired by the observation that many effective networks can be interpreted as different numerical discretizations of differential equations. The design of ResNet follows a relatively simple scheme, forward Euler; however, the situation becomes complicated rapidly as blocks are stacked. If a stacked ResNet is in some sense equivalent to a higher-order scheme, then the current way of forward propagation may be relatively weak compared with a typical high-order method such as Runge-Kutta. We propose a higher-order ResNet to test this hypothesis with extensive experiments on widely used CV benchmarks. We observe stable and noticeable gains in performance, and convergence and robustness also benefit.
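The contrast between forward Euler and a higher-order scheme can be illustrated with a minimal sketch. It is not the block design proposed in the paper, which the abstract does not specify. The names `euler_step`, `rk2_step`, `integrate`, and `decay` are hypothetical, and a scalar test equation dx/dt = -x stands in for a learned residual function. The midpoint rule used here is one common second-order Runge-Kutta method; it reuses the same function `f` twice, so it introduces no extra parameters.

```python
import math

def euler_step(f, x, h=1.0):
    """Forward Euler update, the residual form x + h * f(x)."""
    return x + h * f(x)

def rk2_step(f, x, h=1.0):
    """Midpoint (second-order Runge-Kutta) update that reuses the same f."""
    k1 = f(x)
    k2 = f(x + 0.5 * h * k1)  # evaluate f again at the half-step estimate
    return x + h * k2

def integrate(step, f, x0, h, n_steps):
    """Apply the same update n_steps times, like stacking identical blocks."""
    x = x0
    for _ in range(n_steps):
        x = step(f, x, h)
    return x

def decay(x):
    """Assumed stand-in for a residual function: dx/dt = -x."""
    return -x

exact = math.exp(-1.0)  # true solution at t = 1 for x(0) = 1
euler_err = abs(integrate(euler_step, decay, 1.0, 0.1, 10) - exact)
rk2_err = abs(integrate(rk2_step, decay, 1.0, 0.1, 10) - exact)
print(f"Euler error: {euler_err:.5f}, RK2 error: {rk2_err:.5f}")
```

On this toy problem the second-order update tracks the exact solution more closely than forward Euler at the same step size, which is the kind of gap the hypothesis in the abstract refers to.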
Abstract: Stereo matching is the key step in estimating depth from two or more images. Recently, several tree-based non-local stereo matching methods have been proposed and have achieved state-of-the-art performance. These algorithms use tree structures to aggregate cost, which improves accuracy and reduces the computation load of stereo matching. However, the computational complexity of these tree-based algorithms is still high because they search over the entire disparity range. In addition, the extreme greediness of the minimum spanning tree (MST) causes poor performance in large areas with similar colors but varying disparities. In this paper, we propose an efficient stereo matching method that uses a hierarchical disparity prediction (HDP) framework to dramatically reduce the disparity search range and thereby speed up tree-based non-local stereo methods. Our disparity prediction scheme works on a graph pyramid derived from the image whose disparity is to be estimated. We use the disparity of an upper graph to predict a small disparity range for the lower graph. Several independent disparity trees (DTs) are generated to form a disparity prediction forest (HDPF), over which the cost aggregation is performed. When combined with state-of-the-art tree-based methods, our scheme not only dramatically speeds up the original methods but also improves their accuracy by alleviating the second drawback of the tree-based methods. This is partly because our DTs overcome the extreme greediness of the MST. Extensive experimental results on benchmark datasets demonstrate the effectiveness and efficiency of our framework. For example, segment-tree based stereo matching becomes about 25.57 times faster and 2.2% more accurate on the Middlebury 2006 full-size dataset.
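The idea of using a coarser level to narrow the disparity search on a finer level can be sketched as follows. This is an illustration under stated assumptions, not the HDP algorithm itself: the abstract does not say how the predicted range is computed, so the scale factor of 2 between levels, the fixed `margin`, the per-pixel winner-take-all selection, and the names `predicted_range` and `best_disparity` are all hypothetical. The cost values are made up for the example.

```python
def predicted_range(coarse_disparity, max_disparity, scale=2, margin=1):
    """Map a coarse-level disparity to a narrow candidate range on the finer level.

    Assumes disparities scale by `scale` between levels (an assumption).
    """
    center = coarse_disparity * scale
    hi = min(max_disparity, center + margin)
    lo = max(0, min(center - margin, hi))  # clamp so the range is never empty
    return range(lo, hi + 1)

def best_disparity(costs, candidates):
    """Winner-take-all: pick the candidate disparity with the lowest cost."""
    return min(candidates, key=lambda d: costs[d])

# Made-up aggregated costs for one pixel, indexed by disparity 0..15.
costs = [9, 8, 9, 7, 6, 5, 3, 1, 2, 6, 7, 8, 9, 9, 8, 9]

full_range = range(len(costs))                       # 16 candidates
narrow_range = predicted_range(coarse_disparity=4, max_disparity=15)

full = best_disparity(costs, full_range)
narrow = best_disparity(costs, narrow_range)
print(f"full search: d={full} over {len(full_range)} candidates")
print(f"predicted range: d={narrow} over {len(narrow_range)} candidates")
```

In this toy case both searches select the same disparity, but the restricted search evaluates far fewer candidates, which is the source of the speed-up the abstract describes.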