An increasing number of systems are being designed by first gathering significant amounts of data, and then optimizing the system parameters directly using the obtained data. Often this is done without analyzing the dataset structure. As task complexity, data size, and parameters all increase to millions or even billions, data summarization is becoming a major challenge. In this work, we investigate data summarization via dictionary learning, leveraging the properties of recently introduced non-negative kernel regression (NNK) graphs. Our proposed NNK-Means, unlike competing techniques, such askSVD, learns geometric dictionaries with atoms that lie in the input data space. Experiments show that summaries using NNK-Meanscan provide better discrimination compared to linear and kernel versions of kMeans and kSVD. Moreover, NNK-Means has a scalable implementation, with runtime complexity similar to that of kMeans.