Abstract:We present a generic and flexible module that encodes region proposals by both their intrinsic features and the extrinsic correlations to the others. The proposed non-local region of interest (NL-RoI) can be seamlessly adapted into different generalized R-CNN architectures to better address various perception tasks. Observe that existing techniques from R-CNN treat RoIs independently and perform the prediction solely based on image features within each region proposal. However, the pairwise relationships between proposals could further provide useful information for detection and segmentation. NL-RoI is thus formulated to enrich each RoI representation with the information from all other RoIs, and yield a simple, low-cost, yet effective module for region-based convolutional networks. Our experimental results show that NL-RoI can improve the performance of Faster/Mask R-CNN for object detection and instance segmentation.
Abstract:We introduce the concept of Non-Local RoI (NL-RoI) Block as a generic and flexible module that can be seamlessly adapted into different Mask R-CNN heads for various tasks. Mask R-CNN treats RoIs (Regions of Interest) independently and performs the prediction based on individual object bounding boxes. However, the correlation between objects may provide useful information for detection and segmentation. The proposed NL-RoI Block enables each RoI to refer to all other RoIs' information, and results in a simple, low-cost but effective module. Our experimental results show that generalizations with NL-RoI Blocks can improve the performance of Mask R-CNN for instance segmentation on the Robust Vision Challenge benchmarks.