This review article surveys the current progresses made toward video-based anomaly detection. We address the most fundamental aspect for video anomaly detection, that is, video feature representation. Much research works have been done in finding the right representation to perform anomaly detection in video streams accurately with an acceptable false alarm rate. However, this is very challenging due to large variations in environment and human movement, and high space-time complexity due to huge dimensionality of video data. The weakly supervised nature of deep learning algorithms can help in learning representations from the video data itself instead of manually designing the right feature for specific scenes. In this paper, we would like to review the existing methods of modeling video representations using deep learning techniques for the task of anomaly detection and action recognition.