In crowded scenes, detection and localization of abnormal behaviors is challenging in that high-density people make object segmentation and tracking extremely difficult. We associate the optical flows of multiple frames to capture short-term trajectories and introduce the histogram-based shape descriptor referred to as shape contexts to describe such short-term trajectories. Furthermore, we propose a K-NN similarity-based statistical model to detect anomalies over time and space, which is an unsupervised one-class learning algorithm requiring no clustering nor any prior assumption. Firstly, we retrieve the K-NN samples from the training set in regard to the testing sample, and then use the similarities between every pair of the K-NN samples to construct a Gaussian model. Finally, the probabilities of the similarities from the testing sample to the K-NN samples under the Gaussian model are calculated in the form of a joint probability. Abnormal events can be detected by judging whether the joint probability is below predefined thresholds in terms of time and space, separately. Such a scheme can adapt to the whole scene, since the probability computed as such is not affected by motion distortions arising from perspective distortion. We conduct experiments on real-world surveillance videos, and the results demonstrate that the proposed method can reliably detect and locate the abnormal events in the video sequences, outperforming the state-of-the-art approaches.