Abstract:We propose a neural network component, the regional aggregation layer, that makes it possible to train a pixel-level density estimator using only coarse-grained density aggregates, which reflect the number of objects in an image region. Our approach is simple to use and does not require domain-specific assumptions about the nature of the density function. We evaluate our approach on several synthetic datasets. In addition, we use this approach to learn to estimate high-resolution population and housing density from satellite imagery. In all cases, we find that our approach results in better density estimates than a commonly used baseline. We also show how our housing density estimator can be used to classify buildings as residential or non-residential.
Abstract:We propose a method to classify images from target classes with a small number of training examples based on transfer learning from non-target classes. Without using any more information than class labels for samples from non-target classes, we train a Siamese net to estimate the probability of two images to belong to the same class. With some post-processing, output of the Siamese net can be used to form a gram matrix of a Mercer kernel. Coupled with a support vector machine (SVM), such a kernel gave reasonable classification accuracy on target classes without any fine-tuning. When the Siamese net was only partially fine-tuned using a small number of samples from the target classes, the resulting classifier outperformed the state-of-the-art and other alternatives. We share class separation capabilities and insights into the learning process of such a kernel on MNIST, Dogs vs. Cats, and CIFAR-10 datasets.