Abstract: Salient object detection (SOD) aims to identify the most attractive objects within an image. Depending on the type of input data, SOD can be categorized into several forms, including RGB, RGB-D (Depth), RGB-T (Thermal), and light field SOD. Previous research has focused on saliency detection for individual data types. If an RGB-D SOD model is forced to process RGB-T data, it performs poorly. We propose an innovative model framework that provides a unified solution for salient object detection on three types of data (RGB, RGB-D, and RGB-T). All three types can be handled by one model (all in one) with the same weight parameters. In this framework, the three types of data are concatenated in an ordered manner within a single input batch, and features are extracted using a transformer network. Based on this framework, we propose an efficient lightweight SOD model, AiOSOD, which can process RGB, RGB-D, and RGB-T data at high speed (780 FPS for RGB data, 485 FPS for RGB-D or RGB-T data). Notably, with only 6.25M parameters, AiOSOD achieves excellent performance on RGB, RGB-D, and RGB-T datasets.
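The idea of placing several data types in one ordered input batch can be illustrated with a minimal sketch. The abstract does not specify the exact layout, so everything here is an assumption: the function name `build_ordered_batch` is hypothetical, the RGB images are placed first and the depth or thermal maps second, and single-channel maps are repeated to three channels so that one backbone with shared weights could consume the whole batch.

```python
import numpy as np

def build_ordered_batch(rgb, aux=None):
    """Stack RGB images and optional auxiliary maps along the batch axis.

    rgb: array of shape (N, 3, H, W).
    aux: depth or thermal maps of shape (N, 1, H, W) or (N, 3, H, W),
         or None for plain RGB input (assumed convention, not from the paper).
    """
    if aux is None:
        # RGB-only input: the batch is just the RGB images.
        return rgb
    if aux.shape[1] == 1:
        # Repeat a single-channel map to three channels (assumption).
        aux = np.repeat(aux, 3, axis=1)
    # Ordered concatenation: first all RGB images, then all auxiliary maps.
    return np.concatenate([rgb, aux], axis=0)
```

With this layout the first N entries of the batch are always RGB and the last N are the paired depth or thermal maps, so a downstream network can split the features again by position.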
Abstract: Stereo images, which contain left and right views with disparity, have recently been used in low-level vision tasks such as rain removal and super-resolution. Stereo image restoration methods usually outperform monocular methods by learning the disparity between the two views, either implicitly or explicitly. However, existing stereo rain removal methods still cannot make full use of the complementary information between the two views, and we find two reasons for this: 1) rain streaks have complex distributions in direction and density, which severely damage the complementary information and pose greater challenges; 2) disparity estimation is not accurate enough because the mechanism for fusing features between the two views is imperfect. To overcome these limitations, we propose a new Stereo Image Rain Removal method (StereoIRR) based on sufficient interaction between the two views, which incorporates: 1) a new Dual-view Mutual Attention (DMA) mechanism, which generates mutual attention maps by taking the left and right views as key information for each other to facilitate cross-view feature fusion; 2) a long-range and cross-view interaction, constructed from basic blocks and dual-view mutual attention, which alleviates the adverse effect of rain on complementary information and helps the features of stereo images achieve long-range and cross-view interaction and fusion. Notably, StereoIRR outperforms other related monocular and stereo image rain removal methods on several datasets. Our code and datasets will be released.
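A mutual attention between two views, where each view serves as the key information for the other, can be sketched as a pair of cross-attention operations. This is only an illustration under stated assumptions and not the DMA module itself: the function names are hypothetical, features are flattened to shape (positions, channels), there are no learned projections, scaled dot-product attention is used, and the residual addition used for fusion is an assumed choice.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feat, key_feat):
    """Attend from one view (queries) to the other view (keys and values)."""
    d = query_feat.shape[-1]
    scores = query_feat @ key_feat.T / np.sqrt(d)   # (Nq, Nk) similarities
    attn = softmax(scores, axis=-1)                 # each row sums to 1
    return attn @ key_feat, attn

def mutual_attention(left, right):
    """Each view uses the other view as key information (assumed form)."""
    left_from_right, attn_l = cross_attention(left, right)
    right_from_left, attn_r = cross_attention(right, left)
    # Residual fusion of the cross-view features (an assumption).
    return left + left_from_right, right + right_from_left, attn_l, attn_r
```

The two attention maps are computed symmetrically, so information flows from the right view into the left features and from the left view into the right features in the same step.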
Abstract: Online selection of dynamic features has attracted intensive interest in recent years. However, existing online feature selection methods evaluate features individually and ignore the underlying structure of the feature stream. For instance, in image analysis, features are generated in groups that represent color, texture, and other visual information. Simply breaking the group structure during feature selection may degrade performance. Motivated by this fact, we formulate the problem as online group feature selection. The problem assumes that features are generated individually but that there is group structure in the feature stream. To the best of our knowledge, this is the first time that the correlation among features in the stream has been considered in the online feature selection process. To solve this problem, we develop a novel online group feature selection method named OGFS. Our proposed approach consists of two stages: online intra-group selection and online inter-group selection. In the intra-group selection, we design a criterion based on spectral analysis to select discriminative features in each group. In the inter-group selection, we utilize a linear regression model to select an optimal subset. This two-stage procedure continues until no more features arrive or some predefined stopping conditions are met. Finally, we apply our method to multiple tasks including image classification and face verification. Extensive empirical studies performed on real-world and benchmark data sets demonstrate that our method outperforms other state-of-the-art online feature selection methods.
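The two-stage procedure can be sketched as a loop over arriving feature groups. This is a rough illustration under stated assumptions rather than the OGFS algorithm: a Fisher-style ratio of between-class to within-class scatter stands in for the spectral criterion, ordinary least squares with a weight threshold stands in for the linear regression model, and the function names and the `threshold` and `tol` parameters are hypothetical.

```python
import numpy as np

def fisher_score(x, y):
    """Between-class over within-class scatter for one feature column."""
    mean = x.mean()
    between, within = 0.0, 0.0
    for c in np.unique(y):
        xc = x[y == c]
        between += len(xc) * (xc.mean() - mean) ** 2
        within += ((xc - xc.mean()) ** 2).sum()
    return between / (within + 1e-12)

def intra_group_select(X, y, group, threshold):
    # Stage 1: keep discriminative features inside the arriving group.
    return [j for j in group if fisher_score(X[:, j], y) > threshold]

def inter_group_select(X, y, candidates, tol):
    # Stage 2: fit a linear model on all candidates, drop small weights.
    if not candidates:
        return []
    A = X[:, candidates]
    A = (A - A.mean(axis=0)) / (A.std(axis=0) + 1e-12)  # standardize columns
    w, *_ = np.linalg.lstsq(A, y - y.mean(), rcond=None)
    return [j for j, wj in zip(candidates, w) if abs(wj) > tol]

def ogfs_sketch(X, y, groups, threshold=0.5, tol=0.05):
    """Process feature groups one at a time, as if they arrived in a stream."""
    selected = []
    for group in groups:
        kept = intra_group_select(X, y, group, threshold)
        selected = inter_group_select(X, y, selected + kept, tol)
    return selected
```

Here `groups` is a list of column-index lists, one per arriving group. The loop ends when the groups are exhausted, which corresponds to the "no more features arrive" stopping case; other stopping conditions are not modeled.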