Abstract: In this paper, we present a method for fine-grained instance ranking and retrieval based on global features extracted from a multi-attribute recognition model that does not depend on landmark information or part-based annotations. Further, we make this architecture suitable for mobile-device applications by adopting a bilinear CNN, which reduces the number of parameters in the multi-attribute recognition model. Experiments on the Dress category of the DeepFashion In-Shop Clothes Retrieval dataset and on the CUB200 dataset show promising fine-grained instance retrieval results, especially in terms of texture and color.
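To make the retrieval step concrete, the following is a minimal sketch of ranking a gallery by similarity to a query in a global feature space. The abstract does not state which similarity measure is used; cosine similarity is assumed here, and the function name `rank_gallery` and the toy feature vectors are hypothetical.

```python
import numpy as np


def rank_gallery(query_feat, gallery_feats):
    """Rank gallery items by cosine similarity to the query (assumed metric).

    query_feat:    (D,) global feature of the query image.
    gallery_feats: (N, D) global features of the gallery images.
    Returns the gallery indices from most to least similar, with their scores.
    """
    q = query_feat / np.linalg.norm(query_feat)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    scores = g @ q                               # cosine similarity per gallery item
    order = np.argsort(-scores, kind="stable")   # descending similarity
    return order, scores[order]


# Toy usage with hypothetical 2-D features.
gallery = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
query = np.array([1.0, 0.1])
order, scores = rank_gallery(query, gallery)
```

In practice the features would come from the trained multi-attribute recognition model rather than from hand-written vectors.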
Abstract: In this paper, we address the extraction of the fine-grained attributes of an instance as a `multi-attribute classification' problem. To this end, we propose an end-to-end architecture that adopts a bilinear Convolutional Neural Network with a pairwise ranking loss. This is the first time such an architecture has been applied to the fine-grained attribute classification problem. We compare the proposed method with a competitive deep Convolutional Neural Network baseline. Extensive experiments show that the proposed method matches or outperforms the baseline with significantly fewer parameters ($40\times$ fewer). We demonstrate our approach on the CUB200 birds dataset, whose annotations we adapt in this work for fine-grained multi-attribute classification.
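As an illustration of the two ingredients named above, the sketch below shows a standard bilinear pooling step and a hinge-style pairwise ranking loss for multi-label scores. It is not the authors' implementation: the signed square root and L2 normalization, the hinge form of the loss, the `margin` parameter, and the function names are all assumptions made for this example.

```python
import numpy as np


def bilinear_pool(feat_a, feat_b):
    """Bilinear pooling of two feature maps sharing the same spatial grid.

    feat_a: (C1, H, W), feat_b: (C2, H, W). Returns a (C1 * C2,) descriptor.
    """
    c1, h, w = feat_a.shape
    c2 = feat_b.shape[0]
    a = feat_a.reshape(c1, h * w)
    b = feat_b.reshape(c2, h * w)
    x = (a @ b.T / (h * w)).ravel()        # outer products averaged over locations
    x = np.sign(x) * np.sqrt(np.abs(x))    # signed square root (assumed post-processing)
    norm = np.linalg.norm(x)
    return x / norm if norm > 0 else x     # L2 normalization


def pairwise_ranking_loss(scores, labels, margin=1.0):
    """Hinge loss summed over every (present, absent) attribute pair.

    scores: (K,) predicted attribute scores; labels: (K,) array of 0/1.
    Each present attribute should outscore each absent one by `margin`.
    """
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    if pos.size == 0 or neg.size == 0:
        return 0.0
    gaps = margin - pos[:, None] + neg[None, :]
    return float(np.maximum(0.0, gaps).sum())


# Toy usage with random feature maps and hand-picked scores.
rng = np.random.default_rng(0)
descriptor = bilinear_pool(rng.normal(size=(4, 3, 3)), rng.normal(size=(5, 3, 3)))
loss = pairwise_ranking_loss(np.array([0.0, 0.5]), np.array([1, 0]))
```

The loss is zero only when every present attribute outscores every absent attribute by at least the margin, which is what makes it a ranking objective rather than a per-attribute classification objective.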
Abstract: DeepFashion is a widely used clothing dataset with 50 categories and more than 200k images in total, where each image is annotated with fine-grained attributes. This dataset is often used for clothes recognition, and although it provides comprehensive annotations, the attribute distribution is unbalanced and repetitive, especially for training fine-grained attribute recognition models. In this work, we tailor DeepFashion to the fine-grained attribute recognition task by focusing on each category separately. After selecting categories with a sufficient number of images for training, we remove very scarce attributes and merge duplicate ones in each category, then clean the dataset based on the new list of attributes. We use a bilinear convolutional neural network with a pairwise ranking loss function for multi-label fine-grained attribute recognition and show that the new annotations improve the results for this task. The detailed annotations for each of the selected categories are provided for public use.
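The per-category cleaning steps described above (merge duplicate attributes, remove scarce ones, then clean the images against the new attribute list) can be sketched as follows. The abstract gives no thresholds or merge rules, so `merge_map`, `min_count`, the attribute names, and the choice to drop images left with no attributes are hypothetical assumptions for illustration only.

```python
from collections import Counter


def clean_annotations(annotations, merge_map, min_count):
    """Clean the attribute annotations of one clothing category.

    annotations: dict mapping image id -> set of attribute names.
    merge_map:   dict mapping a duplicate attribute name -> canonical name.
    min_count:   attributes appearing in fewer images than this are removed.
    Returns (cleaned annotations, sorted list of kept attributes).
    """
    # 1. Merge duplicate attributes into their canonical name.
    merged = {
        img: {merge_map.get(attr, attr) for attr in attrs}
        for img, attrs in annotations.items()
    }
    # 2. Count how many images carry each attribute and drop the scarce ones.
    counts = Counter(attr for attrs in merged.values() for attr in attrs)
    kept = {attr for attr, n in counts.items() if n >= min_count}
    # 3. Keep only surviving attributes; drop images left with none (assumption).
    cleaned = {}
    for img, attrs in merged.items():
        remaining = attrs & kept
        if remaining:
            cleaned[img] = remaining
    return cleaned, sorted(kept)


# Toy usage with hypothetical image ids and attribute names.
raw = {
    "img_a": {"floral", "flower print"},
    "img_b": {"floral"},
    "img_c": {"zipper"},
}
cleaned, attributes = clean_annotations(raw, {"flower print": "floral"}, min_count=2)
```

Merging before counting matters: a duplicate pair that is individually scarce can pass the threshold once its counts are combined.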