Abstract: Long-tailed data is a special type of multi-class imbalanced data with a very large number of minority/tail classes that together have a significant influence. Long-tailed learning aims to build high-performance models on datasets with long-tailed distributions, models that can identify all classes with high accuracy, in particular the minority/tail classes. It is a cutting-edge research direction that has attracted a remarkable amount of research effort in the past few years. In this paper, we present a comprehensive survey of the latest advances in long-tailed visual learning. We first propose a new taxonomy for long-tailed learning, which consists of eight dimensions: data balancing, neural architecture, feature enrichment, logits adjustment, loss function, bells and whistles, network optimization, and post hoc processing techniques. Based on this taxonomy, we present a systematic review of long-tailed learning methods, discussing their commonalities and alignable differences. We also analyze the differences between imbalance learning and long-tailed learning approaches. Finally, we discuss prospects and future directions in this field.
Abstract: Real-world datasets often exhibit imbalanced (i.e., long-tailed or skewed) distributions to varying degrees. While the majority (a.k.a. head or frequent) classes have sufficient samples, the minority (a.k.a. tail or rare) classes can be under-represented by a rather limited number of samples. On the one hand, data resampling is a common approach to tackling class imbalance. On the other hand, dimensionality reduction, which reduces the feature space, is a conventional machine learning technique for building stronger classification models on a dataset. However, the possible synergy between feature selection and data resampling for high-performance imbalance classification has rarely been investigated. To address this issue, this paper carries out a comprehensive empirical study of the joint influence of feature selection and resampling on two-class imbalance classification. Specifically, we study the performance of two opposite pipelines for imbalance classification, i.e., applying feature selection before or after data resampling. We conduct a large number of experiments (9225 in total) on 52 publicly available datasets, using 9 feature selection methods, 6 resampling approaches for class imbalance learning, and 3 well-known classification algorithms. Experimental results show that there is no consistent winner between the two pipelines, so both should be considered when deriving the best-performing model for imbalance classification. We also find that the performance of an imbalance classification model depends on the classifier adopted, the ratio between the number of majority and minority samples (IR), and the ratio between the number of samples and features (SFR). Overall, this study should provide a useful reference for researchers and practitioners in imbalance learning.
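The two pipeline orderings and the two ratios (IR, SFR) named in the abstract above can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names are hypothetical, variance ranking stands in for the 9 feature selection methods, random oversampling stands in for the 6 resampling approaches, and the toy data is invented.

```python
import random
from collections import Counter
from statistics import pvariance


def imbalance_ratio(y):
    """IR: number of majority samples divided by number of minority samples."""
    counts = Counter(y)
    return max(counts.values()) / min(counts.values())


def sample_feature_ratio(X):
    """SFR: number of samples divided by number of features."""
    return len(X) / len(X[0])


def select_top_variance(X, k):
    """Toy feature selection (assumed stand-in): keep the k highest-variance features."""
    n_features = len(X[0])
    variances = [pvariance([row[j] for row in X]) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    keep = sorted(ranked[:k])
    return [[row[j] for j in keep] for row in X], keep


def random_oversample(X, y, seed=0):
    """Toy resampling (assumed stand-in): duplicate minority samples until balanced."""
    rng = random.Random(seed)
    counts = Counter(y)
    majority_size = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        pool = [row for row, lab in zip(X, y) if lab == label]
        for _ in range(majority_size - count):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def fs_then_resample(X, y, k):
    """Pipeline A: feature selection first, then resampling."""
    X_sel, keep = select_top_variance(X, k)
    X_res, y_res = random_oversample(X_sel, y)
    return X_res, y_res, keep


def resample_then_fs(X, y, k):
    """Pipeline B: resampling first, then feature selection."""
    X_res, y_res = random_oversample(X, y)
    X_sel, keep = select_top_variance(X_res, k)
    return X_sel, y_res, keep


# Invented toy data: 4 majority samples, 1 minority sample, 3 features.
X = [[0.0, 1.0, 5.0], [0.1, 2.0, 5.0], [0.2, 3.0, 5.0],
     [0.3, 4.0, 5.0], [9.0, 2.5, 5.0]]
y = [0, 0, 0, 0, 1]

ir = imbalance_ratio(y)          # 4 majority / 1 minority
sfr = sample_feature_ratio(X)    # 5 samples / 3 features
X_a, y_a, keep_a = fs_then_resample(X, y, k=2)
X_b, y_b, keep_b = resample_then_fs(X, y, k=2)
```

In pipeline B the selector sees the resampled data, so its feature ranking can differ from pipeline A's on real datasets; on this tiny example both orderings happen to keep the same features.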
Abstract: In this paper, we introduce the ShopSign dataset, a newly developed natural scene text dataset of Chinese shop signs in street views. Although a few scene text datasets are already publicly available (e.g., ICDAR2015, COCO-Text), few images in these datasets contain Chinese text. Hence, we collect and annotate the ShopSign dataset to advance research in Chinese scene text detection and recognition. The new dataset has three distinctive characteristics: (1) large scale: it contains 25,362 Chinese shop sign images, with a total of 196,010 text lines; (2) diversity: the images in ShopSign were captured in different scenes, from downtown to developing regions, using more than 50 different mobile phones; (3) difficulty: the dataset is very sparse and imbalanced, and it also includes five categories of hard images (mirror, wooden, deformed, exposed and obscure). To illustrate the challenges in ShopSign, we run baseline experiments using state-of-the-art scene text detection methods (including CTPN, TextBoxes++ and EAST), as well as cross-dataset validation to compare their performance on related datasets such as CTW, RCTW and the ICPR 2018 MTWI challenge dataset. The sample images and detailed descriptions of our ShopSign dataset are publicly available at: https://github.com/chongshengzhang/shopsign.