Abstract: Salient object detection (SOD) in panoramic video is still at an early stage of exploration. Applying 2D video SOD methods to panoramic video leaves several challenges unresolved, such as low detection accuracy, high model complexity, and poor generalization performance. To overcome these hurdles, we design an Inter-Layer Attention (ILA) module, an Inter-Layer Weight (ILW) module, and a Bi-Modal Attention (BMA) module. Based on these modules, we propose a Spatial-Temporal Dual-Mode Mixed Flow Network (STDMMF-Net) that exploits the spatial flow of panoramic video and the corresponding optical flow for SOD. First, the ILA module calculates the attention between adjacent-level features of consecutive panoramic video frames to improve the accuracy of extracting salient object features from the spatial flow. Then, the ILW module quantifies the salient object information contained in the features of each level to improve the efficiency of fusing these features in the mixed flow. Finally, the BMA module improves the detection accuracy of STDMMF-Net. Extensive subjective and objective experiments show that the proposed method achieves better detection accuracy than state-of-the-art (SOTA) methods. Moreover, the proposed method performs better overall in terms of the memory required for model inference, testing time, complexity, and generalization performance.
Abstract: Superpixel-based higher-order Conditional Random Fields (CRFs) are effective in enforcing long-range consistency in pixel-wise labeling problems, such as semantic segmentation. However, their major shortcoming is the considerably longer time needed to learn the higher-order potentials, together with the extra hyperparameters and/or weights they require compared with pairwise models. This paper proposes a superpixel-enhanced pairwise CRF framework that consists of the conventional pairwise potentials as well as our proposed superpixel-enhanced pairwise (SP-Pairwise) potentials. SP-Pairwise potentials incorporate the superpixel-based higher-order cues by conditioning on a segment-filtered image, and they share the same set of parameters as the conventional pairwise potentials. Therefore, the proposed superpixel-enhanced pairwise CRF has a lower time complexity in parameter learning while outperforming higher-order CRFs in terms of inference accuracy. Moreover, the new scheme takes advantage of pre-trained pairwise models by reusing their parameters and/or weights, which provides a significant accuracy boost over CRF-RNN even without training. Experiments on the MSRC-21 and PASCAL VOC 2012 datasets confirm the effectiveness of our method.
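The idea of conditioning a pairwise potential on a segment-filtered image can be illustrated with a minimal sketch. It assumes, since the abstract does not say so, that "segment filtering" means replacing each pixel by the mean colour of its superpixel and that the pairwise term is a Gaussian bilateral kernel. The names `segment_filter`, `appearance_kernel`, `theta_alpha`, and `theta_beta` are hypothetical and are not taken from the paper.

```python
import numpy as np

def segment_filter(image, segments):
    """Replace every pixel by the mean colour of its superpixel.

    image:    (H, W, C) array
    segments: (H, W) array of non-negative integer superpixel ids
    (Assumed definition of a "segment-filtered image".)
    """
    flat_img = image.reshape(-1, image.shape[-1]).astype(float)
    flat_seg = segments.ravel()
    n_seg = int(flat_seg.max()) + 1
    counts = np.bincount(flat_seg, minlength=n_seg)
    out = np.empty_like(flat_img)
    for c in range(flat_img.shape[1]):
        sums = np.bincount(flat_seg, weights=flat_img[:, c], minlength=n_seg)
        means = sums / np.maximum(counts, 1)
        out[:, c] = means[flat_seg]
    return out.reshape(image.shape)

def appearance_kernel(guide, p, q, theta_alpha, theta_beta):
    """Gaussian bilateral kernel between pixels p and q of a guidance image.

    p and q are (row, col) tuples. The same theta_alpha and theta_beta can be
    used for the original image and for the segment-filtered image, which is
    how parameter sharing is modelled in this sketch.
    """
    d_pos = np.sum((np.asarray(p, float) - np.asarray(q, float)) ** 2)
    d_col = np.sum((guide[p].astype(float) - guide[q].astype(float)) ** 2)
    return np.exp(-d_pos / (2 * theta_alpha ** 2) - d_col / (2 * theta_beta ** 2))

# Toy usage: two horizontal superpixels on a 2x2 single-channel image.
image = np.array([[[0.0], [10.0]], [[20.0], [30.0]]])
segments = np.array([[0, 0], [1, 1]])
filtered = segment_filter(image, segments)
k_pairwise = appearance_kernel(image, (0, 0), (0, 1), 1.0, 5.0)
k_sp_pairwise = appearance_kernel(filtered, (0, 0), (0, 1), 1.0, 5.0)
```

In this toy case the two pixels in the same superpixel have identical colour after filtering, so the kernel evaluated on the filtered image ties them together more strongly than the kernel on the raw image, without introducing any new parameters.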
Abstract: Superpixel-based higher-order Conditional Random Fields (SP-HO-CRFs) are known for their effectiveness in enforcing both short- and long-range spatial contiguity for pixel-wise labeling in computer vision. However, their higher-order potentials are usually too complex to learn and often incur a high computational cost during inference. We propose a new approximation approach to SP-HO-CRFs that resolves these problems. Our approach is a multi-layer CRF framework that inherits the simplicity of pairwise CRFs by formulating both the higher-order and pairwise cues into the same pairwise potentials in the first layer. Essentially, this approach improves accuracy over pairwise CRFs without training, by reusing their pre-trained parameters and/or weights. The proposed multi-layer approach performs especially well in delineating the boundary details (borders) of object categories such as "trees" and "bushes". Multiple sets of experiments conducted on the MSRC-21 and PASCAL VOC 2012 datasets validate the effectiveness and efficiency of the proposed approach.
Abstract: The sheer volume and size of histopathological images (e.g., 10^6 MPixel) underscore the need for faster and more accurate region-of-interest (ROI) detection algorithms. In this paper, we propose such an algorithm, which has four main components that help achieve greater accuracy and faster speed. First, while using coarse-to-fine topology-preserving segmentation as the baseline, the proposed algorithm uses a superpixel regularity optimization scheme to avoid irregular and extremely small superpixels. Second, the proposed technique employs a prediction strategy to focus only on important superpixels at finer image levels. Third, the algorithm reuses the information gained from the coarsest image level at the finer image levels. Both the second and the third components drastically lower the complexity. Fourth, the algorithm employs a highly effective parallelization scheme based on adaptive data partitioning, which yields a high speedup. Experimental results on BSD500 [1] and on 500 whole-slide histological images from the National Lung Screening Trial (NLST) dataset confirm that the proposed algorithm achieves a 13-fold speedup over the baseline and a speedup of around 160 times over SLIC [11], without losing accuracy.
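The second component, restricting work at finer levels to important superpixels, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not describe how importance is predicted, so the important superpixel ids are simply passed in by the caller, and the finer level is assumed to be an integer `scale` times larger than the coarse level. The function name `fine_level_mask` is hypothetical.

```python
import numpy as np

def fine_level_mask(coarse_labels, important_ids, scale):
    """Project the important coarse superpixels onto a finer image level.

    coarse_labels: (H, W) integer superpixel ids at the coarse level
    important_ids: ids judged important (the predictor itself is not
                   specified in the abstract, so this is an input here)
    scale:         integer upsampling factor between the two levels (assumed)
    Returns a boolean (H*scale, W*scale) mask of pixels to process.
    """
    coarse_mask = np.isin(coarse_labels, list(important_ids))
    block = np.ones((scale, scale), dtype=np.uint8)
    return np.kron(coarse_mask.astype(np.uint8), block).astype(bool)

# Toy usage: four coarse superpixels, two of them marked important.
coarse_labels = np.array([[0, 1], [2, 3]])
mask = fine_level_mask(coarse_labels, {1, 2}, scale=2)
fraction_processed = mask.mean()
```

Only the pixels where the mask is true would be segmented at the finer level, so the cost of that level scales with the fraction of the image covered by important superpixels rather than with the full image size.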