Abstract:In this paper, we propose a deep invertible hybrid model which integrates discriminative and generative learning at a latent space level for semi-supervised few-shot classification. Various tasks for classifying new species from image data can be modeled as a semi-supervised few-shot classification, which assumes a labeled and unlabeled training examples and a small support set of the target classes. Predicting target classes with a few support examples per class makes the learning task difficult for existing semi-supervised classification methods, including selftraining, which iteratively estimates class labels of unlabeled training examples to learn a classifier for the training classes. To exploit unlabeled training examples effectively, we adopt as the objective function the composite likelihood, which integrates discriminative and generative learning and suits better with deep neural networks than the parameter coupling prior, the other popular integrated learning approach. In our proposed model, the discriminative and generative models are respectively Prototypical Networks, which have shown excellent performance in various kinds of few-shot learning, and Normalizing Flow a deep invertible model which returns the exact marginal likelihood unlike the other three major methods, i.e., VAE, GAN, and autoregressive model. Our main originality lies in our integration of these components at a latent space level, which is effective in preventing overfitting. Experiments using mini-ImageNet and VGG-Face datasets show that our method outperforms selftraining based Prototypical Networks.
Abstract:Describing the color and textural information of a person image is one of the most crucial aspects of person re-identification (re-id). In this paper, we present novel meta-descriptors based on a hierarchical distribution of pixel features. Although hierarchical covariance descriptors have been successfully applied to image classification, the mean information of pixel features, which is absent from the covariance, tends to be the major discriminative information for person re-id. To solve this problem, we describe a local region in an image via hierarchical Gaussian distribution in which both means and covariances are included in their parameters. More specifically, the region is modeled as a set of multiple Gaussian distributions in which each Gaussian represents the appearance of a local patch. The characteristics of the set of Gaussians are again described by another Gaussian distribution. In both steps, we embed the parameters of the Gaussian into a point of Symmetric Positive Definite (SPD) matrix manifold. By changing the way to handle mean information in this embedding, we develop two hierarchical Gaussian descriptors. Additionally, we develop feature norm normalization methods with the ability to alleviate the biased trends that exist on the descriptors. The experimental results conducted on five public datasets indicate that the proposed descriptors achieve remarkably high performance on person re-id.