Abstract:The recent research for person re-identification has been focused on two trends. One is learning the part-based local features to form more informative feature descriptors. The other one is designing effective metric learning loss functions such as the Triplet loss family. We argue that learning global features with classification loss could achieve the same goal, even with a simple and cost-effective architecture design. We propose a person re-id framework featured by channel grouping and multi-branch strategy, which divides the global feature into multiple channel groups and learns the discriminative channel group features by multi-branch classification layers. In extensive experiments, our network outperforms state-of-the-art person re-id frameworks in terms of both accuracy and inference cost.
Abstract:Objects for detection usually have distinct characteristics in different sub-regions and different aspect ratios. However, in prevalent two-stage object detection methods, Region-of-Interest (RoI) features are extracted by RoI pooling with little emphasis on these translation-variant feature components. We present feature selective networks to reform the feature representations of RoIs by exploiting their disparities among sub-regions and aspect ratios. Our network produces the sub-region attention bank and aspect ratio attention bank for the whole image. The RoI-based sub-region attention map and aspect ratio attention map are selectively pooled from the banks, and then used to refine the original RoI features for RoI classification. Equipped with a light-weight detection subnetwork, our network gets a consistent boost in detection performance based on general ConvNet backbones (ResNet-101, GoogLeNet and VGG-16). Without bells and whistles, our detectors equipped with ResNet-101 achieve more than 3% mAP improvement compared to counterparts on PASCAL VOC 2007, PASCAL VOC 2012 and MS COCO datasets.
Abstract:This paper proposes an efficient content adaptive screen image scaling scheme for the real-time screen applications like remote desktop and screen sharing. In the proposed screen scaling scheme, a screen content classification step is first introduced to classify the screen image into text and pictorial regions. Afterward, we propose an adaptive shift linear interpolation algorithm to predict the new pixel values with the shift offset adapted to the content type of each pixel. The shift offset for each screen content type is offline optimized by minimizing the theoretical interpolation error based on the training samples respectively. The proposed content adaptive screen image scaling scheme can achieve good visual quality and also keep the low complexity for real-time applications.