https://github.com/cvl-umass/semi-inat-2020}
This document describes the details and the motivation behind a new dataset we collected for the semi-supervised recognition challenge~\cite{semi-aves} at the FGVC7 workshop at CVPR 2020. The dataset contains 1000 species of birds sampled from the iNat-2018 dataset for a total of nearly 150k images. From this collection, we sample a subset of classes and their labels, while adding the images from the remaining classes to the unlabeled set of images. The presence of out-of-domain data (novel classes), high class-imbalance, and fine-grained similarity between classes poses significant challenges for existing semi-supervised recognition techniques in the literature. The dataset is available here: \url{