Abstract: Visual appearance-based person retrieval is a challenging problem in surveillance. It describes a person using attributes such as height, clothing color, clothing type, and gender, which are known as soft biometrics. This paper proposes person retrieval from surveillance video using height, torso clothing type, torso clothing color, and gender. The approach introduces adaptive torso patch extraction and bounding box regression to improve retrieval. The algorithm uses a fine-tuned Mask R-CNN for person detection and a fine-tuned DenseNet-169 for attribute classification. Performance is analyzed on the AVSS 2018 Challenge II dataset, where the approach achieves an 11.35% improvement over the state of the art in terms of average Intersection over Union (IoU).
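For readers unfamiliar with the evaluation measure, the following is a minimal sketch of how average IoU between predicted and ground-truth bounding boxes can be computed. It assumes boxes are given as (x1, y1, x2, y2) corner coordinates with x2 > x1 and y2 > y1; the function names `iou` and `average_iou` are illustrative and are not taken from the paper or from the challenge's evaluation code.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) form (assumed format)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region; zero when the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def average_iou(predicted, ground_truth):
    """Mean IoU over paired predicted and ground-truth boxes."""
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have the same length")
    if not predicted:
        return 0.0
    total = sum(iou(p, g) for p, g in zip(predicted, ground_truth))
    return total / len(predicted)
```

An IoU of 1 means the predicted box matches the ground truth exactly, and 0 means the two boxes do not overlap at all. Averaging over all frames or queries gives a single score for a retrieval method.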